LFX Insights

Scientific Data Formats

Libraries and tools for storing, accessing, and managing large scientific datasets with hierarchical structures, supporting high-performance I/O operations and complex data models across multiple programming languages.

5 projects

3,860 contributors

$134M software value

h5py

h5py is a Pythonic interface to the HDF5 binary data format. It lets Python programs store large amounts of numerical data in a hierarchical layout and manipulate that data efficiently from NumPy.

Contributors: 1,540
Organizations: 476
Software value: $1.1M

Zarr

Zarr is a format and library for chunked, compressed N-dimensional arrays, designed for efficient storage and access of large scientific datasets. It provides a Python implementation with support for cloud storage, parallel computing, and hierarchical organization of arrays.

Contributors: 825
Organizations: 292
Software value: $1.5M

Unidata NetCDF

NetCDF (Network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This C implementation provides the reference library and tools for working with NetCDF files, enabling efficient storage and retrieval of multi-dimensional scientific data.

Contributors: 639
Organizations: 171
Software value: $20M

HDF5

The official repository for the HDF5® library.

Contributors: 636
Organizations: 180
Software value: $98M

TileDB

TileDB is a universal data engine for efficient storage, querying, and management of multi-dimensional array data. It provides its own storage format and APIs for dense and sparse arrays, with support for multiple data types, cloud storage integration, and parallel I/O.

Contributors: 220
Organizations: 63
Software value: $14M
