LFX Insights

Data Catalogs & Lineage

Systems for documenting and tracking data assets and their relationships.

24 projects

10,013 contributors

$323M

Cube

Cube: Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics

Contributors

2,430

Organizations

433

Software value

$24M

Amundsen

Amundsen is a data discovery and metadata engine for improving the productivity of data analysts, data scientists, and engineers when interacting with data.

Contributors

1,030

Organizations

249

Software value

$8.8M

Atlas

Manage your database schema as code

Contributors

909

Organizations

300

Software value

$3.8M

Experience Data Model (XDM)

Experience Data Model (XDM) is Adobe's standardized framework for customer experience data, providing common structures and definitions for digital experiences across Adobe Experience Platform and other systems. It enables consistent data representation and interoperability between applications and services.

Contributors

709

Organizations

20

Software value

$6.7M

OpenLineage

The mission of the Project is to enable the industry at large to collect lineage metadata consistently and comprehensively across complex pipelines, creating a deeper understanding of data.

Contributors

668

Organizations

102

Software value

$10M

GeoNetwork

GeoNetwork is an open-source catalog application for managing spatially referenced resources. It provides powerful metadata editing and search functions, an embedded interactive web map viewer, and is designed to enable access to geo-referenced databases and cartographic products from a variety of data providers through descriptive metadata.

Contributors

643

Organizations

98

Software value

$50M

HDX: Humanitarian Data Exchange

A repo for HDX's configurations and extensions to CKAN

Contributors

577

Organizations

81

Software value

$22M

Eclipse Dataspace Connector

The Eclipse Dataspace Connector (EDC) is an open-source framework for creating enterprise-grade data spaces that enable secure and standardized data sharing between organizations. It implements the International Data Spaces (IDS) standard and provides core capabilities for data sovereignty, contract negotiation, and policy enforcement in distributed data ecosystems.

Contributors

500

Organizations

64

Software value

$5.7M

Marquez

Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem’s metadata. It maintains the provenance of how datasets are consumed and produced, provides global visibility into job runtime and frequency of dataset access, centralization of dataset lifecycle management, and much more. Marquez was released and open sourced by WeWork.

Contributors

496

Organizations

61

Software value

$3M

Rucio

Rucio - Scientific Data Management

Contributors

488

Organizations

93

Software value

$5.9M

Egeria

Egeria provides the Apache 2.0 licensed open metadata and governance type system, frameworks, APIs, event payloads and interchange protocols to enable tools, engines and platforms to exchange metadata in order to get the best value from data whilst ensuring it is properly governed.

Contributors

456

Organizations

51

Software value

$48M

Project Nessie

Nessie: Transactional Catalog for Data Lakes with Git-like semantics

Contributors

307

Organizations

57

Software value

$9.2M

Fides

The Privacy Engineering & Compliance Framework

Contributors

247

Organizations

22

Software value

$20M

FAIRDOM-SEEK

For finding, sharing and exchanging Data, Models, Simulations and Processes in Science.

Contributors

224

Organizations

50

Software value

$38M

Standard Energy Efficiency Data (SEED) Platform

The Standard Energy Efficiency Data (SEED) Platform is an open source software application that helps organizations easily manage and analyze building performance data. It enables users to import, clean, and analyze portfolios of building energy data, track building characteristics and energy performance metrics, and generate reports for compliance with building energy laws.

Contributors

158

Organizations

17

Software value

$16M

Stroom

Stroom is a highly scalable data storage, processing and analysis platform.

Contributors

68

Organizations

8

Software value

$50M

Project Alvarium

Alvarium will foster a community to collaborate on the baseline open source framework and related APIs that bind together the various ingredients that constitute trust fabrics, as well as to define the algorithms that drive confidence scores as data flows through any given implementation.

Contributors

45

Organizations

10

Software value

$1.3M

CKAN

CKAN, the world’s leading Open Source data portal platform, is a powerful data management system that makes data accessible – by providing tools to streamline publishing, sharing, finding and using data.

Contributors

23

Organizations

1

Archived

OpenBytes

The mission of the Project is to facilitate wider sharing of, and collaboration with, data in the AI community through the creation of data standards and formats and enabling contributions of data.

Contributors

20

Organizations

4

Software value

$305K

Artigraph

The mission of the Project is to enable data lifecycle management for AI workflow tooling and other applications.

Contributors

15

Organizations

8

Software value

$274K

OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

This project hasn't been onboarded to LFX Insights.