LFX Insights

Workflow Orchestrators

Tools for designing, scheduling, and monitoring data pipelines.

44 projects

68,114 contributors

$968M software value

Argo

Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes.

Contributors

22,849

Organizations

5,389

Software value

$138M

Apache Airflow

Apache Airflow is an open-source platform for programmatically authoring, scheduling, and monitoring workflows. It allows users to create data pipelines as directed acyclic graphs (DAGs) of tasks, enabling complex orchestration of batch processes and data processing workflows.

Contributors

17,666

Organizations

2,480

Software value

$50M

Kubeflow

Kubeflow is an open source machine learning platform that makes deploying and managing ML workflows on Kubernetes simple, portable, and scalable. It provides end-to-end orchestration of machine learning pipelines, model training, serving, and experiment tracking.

Contributors

10,290

Organizations

2,271

Software value

$412M

Dagster

Dagster is an open-source data orchestration framework that lets you define, test, and orchestrate data pipelines using Python code. It provides tools for building, testing, and monitoring data workflows while emphasizing software engineering best practices like modularity, testability, and gradual typing.

Contributors

3,900

Organizations

699

Software value

$78M

Prefect

Prefect is a workflow orchestration platform that enables users to build, schedule, and monitor data pipelines and machine learning workflows. It provides a Python-based framework for creating resilient, distributed workflows with features like automatic retries, caching, and real-time monitoring.

Contributors

3,767

Organizations

744

Software value

$29M

Apache NiFi

Apache NiFi is an enterprise data flow management and automation platform that enables organizations to reliably process, route, transform and distribute data between diverse systems. It provides a web-based interface for designing, controlling and monitoring data flows, with features for data provenance, security, extensibility and real-time control.

Contributors

1,863

Organizations

221

Software value

$44M

Flyte

Flyte is a container-native, type-safe workflow and pipeline platform, written in Go, that is optimized for large-scale processing and machine learning.

Contributors

1,850

Organizations

350

Software value

$80M

Kedro Project

The mission of the Project is to design and implement an open source framework for creating reproducible, maintainable and modular data science code.

Contributors

1,693

Organizations

198

Software value

$52M

NIPYPE

NIPYPE is a Python-based neuroimaging data processing framework that provides a uniform interface to existing neuroimaging software and facilitates interaction between these packages within a single workflow. It enables reproducible, distributed analysis of neuroimaging data through workflows and interfaces to commonly used neuroimaging tools.

Contributors

1,039

Organizations

193

Software value

$7.2M

Windmill

Windmill is an open-source developer platform for building internal tools and workflows. It provides a low-code solution for creating backend scripts, APIs, and UIs with features like resource management, scheduling, and version control. The platform enables developers to write scripts in multiple languages and automate business processes.

Contributors

893

Organizations

235

Software value

$23M

nf-core/rnaseq

A bioinformatics pipeline for RNA sequencing analysis that performs alignment, quantification, and extensive quality control on RNA sequencing data.

Contributors

652

Organizations

105

Software value

$1.1M

Taipy

Turns Data and AI algorithms into production-ready web applications in no time.

Contributors

592

Organizations

67

Software value

$4.9M

Meltano

Meltano is an open source ELT (Extract, Load, Transform) platform that helps organizations integrate and manage their data pipelines. It provides a command-line interface and web UI for orchestrating data workflows, managing configurations, and connecting various data tools and services.

Contributors

420

Organizations

82

Software value

$3.5M

Global Workflow

A comprehensive workflow system for NOAA's global numerical weather prediction models, providing end-to-end support for model initialization, execution, post-processing, and product generation for operational forecasting

Contributors

348

Organizations

9

Software value

$3.1M

Tremor

Tremor is an early-stage event processing system for unstructured data, with rich support for structural pattern matching, filtering, and transformation.

Contributors

178

Organizations

86

Software value

$13M

Pegasus WMS

Pegasus WMS is a workflow management system that automates the execution of complex computational workflows across distributed computing resources. It transforms abstract workflow descriptions into concrete execution plans, handles data management, job scheduling, and fault tolerance for scientific applications.

Contributors

80

Organizations

11

Software value

$24M

OpenFIDO

Open Framework for Integrated Data Operations (OpenFIDO) is a data and model processing framework funded by the California Energy Commission (EPC 17-047).

Contributors

34

Organizations

13

Software value

$3.5M

Astronomer Dbt-Airflow Integration

Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code

This project hasn't been onboarded to LFX Insights.

Astronomer Helm Charts

Helm Charts for the Astronomer Platform, Apache Airflow as a Service on Kubernetes

This project hasn't been onboarded to LFX Insights.

Brooklyn

Experiment orchestration and data acquisition.

This project hasn't been onboarded to LFX Insights.

CDAP

An open source framework for building data analytic applications.

This project hasn't been onboarded to LFX Insights.

Cumulus

Cumulus Framework + Cumulus API

This project hasn't been onboarded to LFX Insights.

DolphinScheduler

Apache DolphinScheduler is a modern data orchestration platform for quickly creating high-performance workflows with low code.

This project hasn't been onboarded to LFX Insights.

GalaxyProject: Data Science for Everyone

Data-intensive science for everyone.

This project hasn't been onboarded to LFX Insights.

Hop

Hop Orchestration Platform

This project hasn't been onboarded to LFX Insights.

Instill Core

🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications

This project hasn't been onboarded to LFX Insights.

Kestra

Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...

This project hasn't been onboarded to LFX Insights.

Linkis

Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.

This project hasn't been onboarded to LFX Insights.

Mage-AI Data Pipeline Platform by Mage-ai

🧙 Build, run, and manage data pipelines for integrating and transforming data.

This project hasn't been onboarded to LFX Insights.

Metaflow

Open Source AI/ML Platform

This project hasn't been onboarded to LFX Insights.

Nextflow

A DSL for data-driven computational pipelines

This project hasn't been onboarded to LFX Insights.

QUACC

quacc is a flexible platform for computational materials science and quantum chemistry that is built for the big data era.

This project hasn't been onboarded to LFX Insights.

Single-cell RNA-seq Pipeline

Single-cell RNA-Seq pipeline for barcode-based protocols such as 10x, DropSeq or SmartSeq, offering a variety of aligners and empty-droplet detection

This project hasn't been onboarded to LFX Insights.

Snakemake

This is the development home of the workflow management system Snakemake.

This project hasn't been onboarded to LFX Insights.

Spring Cloud Data Flow

Microservices-based streaming and batch data processing in Cloud Foundry and Kubernetes.

This project hasn't been onboarded to LFX Insights.

Texera

Collaborative Machine-Learning-Centric Data Analytics Using Workflows

This project hasn't been onboarded to LFX Insights.

VAST

Tenzir is the data pipeline engine for security teams.

This project hasn't been onboarded to LFX Insights.

ZenML

ZenML 🙏: The bridge between ML and Ops. https://zenml.io.

This project hasn't been onboarded to LFX Insights.

ampliseq

Amplicon sequencing analysis workflow using DADA2 and QIIME2

This project hasn't been onboarded to LFX Insights.

warp

WDL Analysis Research Pipelines

This project hasn't been onboarded to LFX Insights.