LFX Insights

Data Quality & Preparation

Tools for cleaning, validating, and standardizing data for analysis.

10 projects
6,307 contributors
$167M software value

Great Expectations

Always know what to expect from your data.

Contributors: 2,411
Organizations: 301
Software value: $46M

OpenRefine

OpenRefine is a powerful open source tool for working with messy data, cleaning it, transforming it from one format into another, and extending it with web services and external data. It allows users to explore large data sets, fix inconsistencies, reconcile and match data to databases like Wikidata, and transform data into different formats for further use.

Contributors: 1,656
Organizations: 311
Software value: $22M

Elasticsearch Curator

Elasticsearch Curator is a maintenance and management tool for Elasticsearch indices that helps users perform administrative tasks like creating, deleting, and managing indices based on time-based patterns, size thresholds, and other configurable criteria.

Contributors: 1,265
Organizations: 376
Software value: $1M

lakeFS

lakeFS provides data version control for your data lake, bringing Git-like operations to data.

Contributors: 582
Organizations: 124
Software value: $14M

Standard Energy Efficiency Data (SEED) Platform

The Standard Energy Efficiency Data (SEED) Platform is an open source software application that helps organizations easily manage and analyze building performance data. It enables users to import, clean, and analyze portfolios of building energy data, track building characteristics and energy performance metrics, and generate reports for compliance with building energy laws.

Contributors: 158
Organizations: 18
Software value: $16M

Tapdata

Tapdata Live Data Platform Project

Contributors: 151
Organizations: 9
Software value: $18M

Stroom

Stroom is a highly scalable data storage, processing and analysis platform.

Contributors: 69
Organizations: 8
Software value: $50M

Artigraph

Artigraph aims to enable data lifecycle management for AI workflow tooling and other applications.

Contributors: 15
Organizations: 7
Software value: $274K

Frictionless Data

Data Package is a standard consisting of a set of simple yet extensible specifications to describe datasets, data files and tabular data. It is a data definition language (DDL) and data API that facilitates findability, accessibility, interoperability, and reusability (FAIR) of data.

This project hasn't been onboarded to LFX Insights.

neosync

Open source data anonymization and synthetic data platform for developers. Anonymize your production data and sync it across your environments so that developers can safely use it.

This project hasn't been onboarded to LFX Insights.