LFX Insights

ML Libraries & Toolkits

Specialized libraries and tools for specific machine learning tasks and algorithms.

88 projects

218,780 contributors

$1.8B software value

TensorFlow

TensorFlow is an open-source machine learning framework developed by Google that enables numerical computation and large-scale machine learning. It provides a flexible system for defining and executing computations involving tensors, which are multi-dimensional arrays. The framework supports deep learning and neural networks across multiple platforms and devices.

Contributors

47,270

Organizations

6,157

Software value

$199M

vLLM

The mission of the Project is to develop an open-source library for fast LLM inference and serving.

Contributors

22,809

Organizations

2,436

Software value

$75M

scikit-learn

scikit-learn: machine learning in Python

Contributors

15,327

Organizations

2,623

Software value

$15M

Ultralytics YOLO

Ultralytics YOLO11 🚀

Contributors

12,549

Organizations

816

Software value

$4M

ONNX

ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.

Contributors

8,030

Organizations

996

Software value

$51M

LLaMA Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Contributors

7,282

Organizations

430

Software value

$3M

Caffe

Caffe: a fast open framework for deep learning.

Contributors

7,098

Organizations

1,010

Software value

$3.5M

spaCy

spaCy is an industrial-strength natural language processing library for Python, designed for production use. It offers fast and accurate syntactic analysis, named entity recognition, text classification, and more. The library includes pre-trained statistical models and word vectors, and supports deep learning integration.

Contributors

6,491

Organizations

1,128

Software value

$7.8M

JAX

JAX is a high-performance numerical computing and machine learning library that combines NumPy's familiar API with GPU and TPU hardware acceleration. It features automatic differentiation and just-in-time compilation, and enables writing transformable numerical programs.

Contributors

6,189

Organizations

1,133

Software value

$20M

XGBoost

XGBoost is a scalable, distributed gradient boosting library that provides parallel tree boosting for machine learning tasks. It implements machine learning algorithms under the gradient boosting framework, offering high performance, flexibility and portability across multiple programming languages and platforms.

Contributors

5,922

Organizations

832

Software value

$6.4M

PyTorch Geometric

Graph Neural Network Library for PyTorch

Contributors

5,061

Organizations

719

Software value

$5.3M

Gradio

Gradio is an open-source Python library that enables developers to quickly create customizable web interfaces for machine learning models, data processing pipelines, and other Python functions. It allows for easy demo creation and sharing of ML models with drag-and-drop interfaces, requiring minimal code.

Contributors

4,512

Organizations

702

Software value

$10M

Detectron2

Detectron2 is a computer vision library developed by Facebook AI Research (FAIR) that implements state-of-the-art object detection algorithms. It provides a modular, flexible platform for implementing and training computer vision models, with support for tasks like object detection, instance segmentation, keypoint detection, and panoptic segmentation.

Contributors

4,298

Organizations

603

Software value

$2.3M

Llama Models

A collection of large language models (LLMs) developed by Meta AI, including the Llama family of models. These models are designed for natural language processing tasks and are made available for research and commercial use under specific licensing terms.

Contributors

4,004

Organizations

670

Software value

$425K

CatBoost

CatBoost is a high-performance, open-source gradient boosting library developed by Yandex that implements gradient boosting on decision trees. It provides fast, scalable, and accurate machine learning algorithms for classification, regression, and ranking tasks, with built-in support for categorical features.

Contributors

3,552

Organizations

345

Software value

$242M

ncnn

ncnn is a high-performance neural network inference framework optimized for mobile platforms.

Contributors

3,509

Organizations

311

Software value

$28M

LightGBM

LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is designed to be distributed and efficient, offering faster training, lower memory usage, better accuracy, support for parallel and GPU learning, and the ability to handle large-scale data.

Contributors

3,164

Organizations

483

Software value

$3.4M

Gensim

Topic Modelling for Humans

Contributors

2,811

Organizations

552

Software value

$8.4M

Faiss

Faiss is a library for efficient similarity search and clustering of dense vectors, developed by Facebook Research. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also includes support for different similarity metrics and various optimization methods for fast and accurate vector search.

Contributors

2,735

Organizations

497

Software value

$5.1M

DeepRec

The mission of the Project is to develop a high-performance recommendation deep learning framework.

Contributors

2,726

Organizations

283

Software value

$160M

Natural Language Toolkit (NLTK)

NLTK (Natural Language Toolkit) is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.

Contributors

2,700

Organizations

621

Software value

$4.1M

Sentence Transformers

Sentence Transformers is a Python framework for state-of-the-art sentence and text embeddings. It provides easy-to-use methods to compute dense vector representations for sentences, paragraphs and images, enabling semantic similarity comparisons and information retrieval tasks.

Contributors

2,505

Organizations

459

Software value

$2.4M

Flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

Contributors

2,180

Organizations

353

Software value

$2M

Whisper

Whisper is an automatic speech recognition (ASR) system developed by OpenAI that can transcribe and translate spoken language from audio into text. It is trained on a large dataset of multilingual speech data and can handle various languages, accents, and acoustic environments.

Contributors

2,166

Organizations

295

Software value

$606K

Kaldi Speech Recognition Toolkit

kaldi-asr/kaldi is the official location of the Kaldi project.

Contributors

2,044

Organizations

254

Software value

$27M

Optuna

A hyperparameter optimization framework

Contributors

1,902

Organizations

329

Software value

$1.8M

FATE Project

FATE is an open-source project initiated by WeBank’s AI Department to provide a secure computing framework to support the federated AI ecosystem.

Contributors

1,639

Organizations

105

Software value

$103M

Tokenizers

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Contributors

1,624

Organizations

426

Software value

$1.6M

GGML

GGML is a tensor library for machine learning that enables efficient neural network inference on CPU. It provides low-level primitives for implementing deep learning models with a focus on performance and memory efficiency, particularly for running large language models on consumer hardware.

Contributors

1,532

Organizations

274

Software value

$7.9M

Stable Baselines3 (SB3)

Stable Baselines3 (SB3) is a reliable implementation of reinforcement learning algorithms in PyTorch. It provides a set of high-quality implementations of state-of-the-art reinforcement learning algorithms, including PPO, A2C, DQN, and SAC. The library focuses on providing clean, documented, and reliable implementations for research and development in reinforcement learning.

Contributors

1,460

Organizations

219

Software value

$748K

TorchMetrics

Machine learning metrics for distributed, scalable PyTorch applications.

Contributors

1,424

Organizations

354

Software value

$3.2M

SpeechBrain

SpeechBrain is an open-source speech toolkit built on PyTorch that provides state-of-the-art speech technologies, including speech recognition, speaker recognition, speech enhancement, multi-microphone signal processing and speech separation. It features a unified, flexible interface for speech research and applications.

Contributors

1,414

Organizations

176

Software value

$9.4M

mlpack

mlpack is a fast, flexible machine learning library written in C++ that aims to provide fast, extensible implementations of cutting-edge machine learning algorithms. It offers bindings for multiple languages and emphasizes scalability, speed, and ease-of-use through both command-line and C++ interfaces.

Contributors

1,393

Organizations

234

Software value

$13M

UMAP

UMAP (Uniform Manifold Approximation and Projection) is a dimension reduction technique for machine learning and data visualization. It helps visualize high-dimensional data by finding a low-dimensional representation that preserves the essential structure of the data. The algorithm is particularly effective at preserving both local and global structure in the data while being computationally efficient.

Contributors

1,364

Organizations

288

Software value

$4.9M

NVIDIA DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

Contributors

1,260

Organizations

220

Software value

$14M

cuML

cuML - RAPIDS Machine Learning Library

Contributors

1,097

Organizations

172

Software value

$7.1M

Chainer

A flexible framework of neural networks for deep learning

Contributors

1,021

Organizations

210

Software value

$7.7M

Ludwig

Ludwig is an open-source, declarative machine learning framework that makes it easy to define deep learning pipelines with a simple and flexible data-driven configuration system. Ludwig is a low-code framework for building custom AI models like LLMs and other deep neural networks.

Contributors

938

Organizations

152

Software value

$14M

PyTensor

PyTensor allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.

Contributors

903

Organizations

126

Software value

$7.5M

Nilearn

Machine learning for NeuroImaging in Python

Contributors

900

Organizations

182

Software value

$5.7M

PennyLane

PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Train a quantum computer the same way as a neural network.

Contributors

828

Organizations

112

Software value

$23M

GPy

Gaussian processes framework in Python

Contributors

769

Organizations

141

Software value

$1.8M

Adversarial Robustness Toolbox

Adversarial Robustness Toolbox (ART) provides tools that enable developers and researchers to evaluate, defend, certify, and verify machine learning models and applications against adversarial threats.

Contributors

709

Organizations

66

Software value

$6.9M

Zygote

Zygote is a source-to-source automatic differentiation (AD) system for the Julia programming language. It enables efficient gradient computation for machine learning and scientific computing applications by directly transforming Julia source code.

Contributors

646

Organizations

228

Software value

$278K

Turing.jl

Bayesian inference with probabilistic programming.

Contributors

605

Organizations

154

Software value

$284K

gnina

A deep learning framework for molecular docking

Contributors

544

Organizations

74

Software value

$4.3M

Recommenders

The mission of the Project is to develop examples and best practices for building recommendation systems, provided as Jupyter notebooks.

Contributors

515

Organizations

109

Software value

$3M

ilastik

ilastik-shell, applets, and workflows to string them together.

Contributors

515

Organizations

76

Software value

$5.8M

Elyra

The mission of the Project is to create and maintain an open-source development workspace that simplifies the creation and orchestration of the AI model development lifecycle tasks.

Contributors

486

Organizations

94

Software value

$23M

PySR

PySR is a symbolic regression tool that uses genetic programming to discover mathematical expressions from data. It can find simple, interpretable equations from complex datasets by optimizing both accuracy and simplicity, making it useful for scientific discovery and machine learning applications.

Contributors

472

Organizations

66

Software value

$415K

Neural Network (NN) Streamer

Neural Network (NN) Streamer, Stream Processing Paradigm for Neural Network Apps/Devices.

Contributors

455

Organizations

50

Software value

$34M

XNNPACK

High-efficiency floating-point neural network inference operators for mobile, server, and Web

Contributors

424

Organizations

93

Software value

$105M

bugbug

Bugbug is a machine learning tool developed by Mozilla that helps automate bug management and classification in software development. It uses historical bug data to train models that can predict bug characteristics, severity, and component assignments, helping streamline the bug triage process.

Contributors

411

Organizations

89

Software value

$2.1M

FedML

FedML is an open-source library and MLOps platform for federated learning research and production deployment. It provides a unified framework for distributed/federated training, serving, and mobile/IoT deployment with consistent APIs and modular architecture.

Contributors

402

Organizations

57

Software value

$7.9M

Enzyme Automatic Differentiator

Enzyme is an automatic differentiation tool that performs reverse-mode AD by using the LLVM compiler infrastructure to differentiate programs in languages like Julia, C/C++, and Fortran. It enables efficient gradient computation for machine learning and scientific computing applications.

Contributors

368

Organizations

130

Software value

$1.8M

NetKet

Machine learning algorithms for many-body quantum systems

Contributors

316

Organizations

44

Software value

$2.9M

Quarkus LangChain4j

Quarkus LangChain4j is an integration project that enables the use of LangChain4j, a Java framework for building applications with large language models (LLMs), within Quarkus applications. It provides Quarkus extensions and configurations for working with AI/ML models and language processing capabilities.

Contributors

312

Organizations

64

Software value

$6.8M

DocArray

The mission of the DocArray project is to develop a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, and 3D mesh.

Contributors

309

Organizations

46

Software value

$439M

OpenFL

The mission of the OpenFL project is to build a flexible, secure, scalable, and easily learnable Federated Learning tool for data scientists and data owners.

Contributors

308

Organizations

42

Software value

$2.9M

Lux

Elegant and Performant Scientific Machine Learning in Julia

Contributors

230

Organizations

77

Software value

$1.7M
