24 projects
Cube
📊 Cube — Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics
2,430
433
$24M
Amundsen
Amundsen is a metadata-driven data discovery and metadata engine that improves the productivity of data analysts, data scientists, and engineers when interacting with data.
1,030
249
$8.8M
Atlas
Manage your database schema as code
909
300
$3.8M
Experience Data Model (XDM)
Experience Data Model (XDM) is Adobe's standardized framework for customer experience data, providing common structures and definitions for digital experiences across Adobe Experience Platform and other systems. It enables consistent data representation and interoperability between applications and services.
709
20
$6.7M
OpenLineage
The mission of the Project is to enable the industry at large to collect lineage metadata consistently and comprehensively across complex pipelines, creating a deeper understanding of data.
668
102
$10M
GeoNetwork
GeoNetwork is an open-source catalog application for managing spatially referenced resources. It provides powerful metadata editing and search functions, an embedded interactive web map viewer, and is designed to enable access to geo-referenced databases and cartographic products from a variety of data providers through descriptive metadata.
643
98
$50M
HDX: Humanitarian Data Exchange
A repo for HDX's configurations and extensions to CKAN
577
81
$22M
Eclipse Dataspace Connector
The Eclipse Dataspace Connector (EDC) is an open-source framework for creating enterprise-grade data spaces that enable secure and standardized data sharing between organizations. It implements the International Data Spaces (IDS) standard and provides core capabilities for data sovereignty, contract negotiation, and policy enforcement in distributed data ecosystems.
500
64
$5.7M
Marquez
Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem’s metadata. It maintains the provenance of how datasets are consumed and produced, provides global visibility into job runtime and frequency of dataset access, centralization of dataset lifecycle management, and much more. Marquez was released and open sourced by WeWork.
496
61
$3M
Rucio
Rucio - Scientific Data Management
488
93
$5.9M
Egeria
Egeria provides the Apache 2.0 licensed open metadata and governance type system, frameworks, APIs, event payloads and interchange protocols to enable tools, engines and platforms to exchange metadata in order to get the best value from data whilst ensuring it is properly governed.
456
51
$48M
Project Nessie
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
307
57
$9.2M
Fides
The Privacy Engineering & Compliance Framework
247
22
$20M
FAIRDOM-SEEK
For finding, sharing and exchanging Data, Models, Simulations and Processes in Science.
224
50
$38M
Standard Energy Efficiency Data (SEED) Platform
The Standard Energy Efficiency Data (SEED) Platform is an open source software application that helps organizations easily manage and analyze building performance data. It enables users to import, clean, and analyze portfolios of building energy data, track building characteristics and energy performance metrics, and generate reports for compliance with building energy laws.
158
17
$16M
Stroom
Stroom is a highly scalable data storage, processing and analysis platform.
68
8
$50M
Project Alvarium
Alvarium will foster a community to collaborate on the baseline open source framework and related APIs that bind together the various ingredients that constitute trust fabrics, as well as to define the algorithms that drive confidence scores as data flows through any given implementation.
45
10
$1.3M
CKAN
CKAN, the world’s leading Open Source data portal platform, is a powerful data management system that makes data accessible – by providing tools to streamline publishing, sharing, finding and using data.
23
1
OpenBytes
The mission of the Project is to facilitate wider sharing of, and collaboration with, data in the AI community through the creation of data standards and formats and enabling contributions of data.
20
4
$305K
Artigraph
The mission of the Project is to enable data lifecycle management for AI workflow tooling and other applications.
15
8
$274K
OpenMetadata
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.