ML Libraries & Toolkits

Specialized libraries and tools for specific machine learning tasks and algorithms.

84 projects

128,050 contributors

$844M

Most contributors

TensorFlow

TensorFlow is an open-source machine learning framework developed by Google that enables numerical computation and large-scale machine learning. It provides a flexible system for defining and executing computations involving tensors, which are multi-dimensional arrays. The framework supports deep learning and neural networks across multiple platforms and devices.

Contributors

47,080

Organizations

6,189

Software value

$196M

vLLM

The mission of the Project is to develop an open-source library for fast LLM inference and serving.

Contributors

18,859

Organizations

2,211

Software value

$24M

ONNX

ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.

Contributors

7,962

Organizations

1,000

Software value

$47M

spaCy

spaCy is an industrial-strength natural language processing library for Python, designed for production use. It offers fast and accurate syntactic analysis, named entity recognition, text classification, and more. The library includes pre-trained statistical models and word vectors, and supports deep learning integration.

Contributors

6,466

Organizations

1,130

Software value

$7.8M

JAX

JAX is a high-performance numerical computing and machine learning library that combines NumPy's familiar API with GPU and TPU hardware acceleration. It features automatic differentiation and just-in-time compilation, and enables writing transformable numerical programs.

Contributors

6,035

Organizations

1,109

Software value

$20M

XGBoost

XGBoost is a scalable, distributed gradient boosting library that provides parallel tree boosting for machine learning tasks. It implements machine learning algorithms under the gradient boosting framework, offering high performance, flexibility and portability across multiple programming languages and platforms.

Contributors

5,907

Organizations

828

Software value

$6.3M

Gradio

Gradio is an open-source Python library that enables developers to quickly create customizable web interfaces for machine learning models, data processing pipelines, and other Python functions. It allows for easy demo creation and sharing of ML models with drag-and-drop interfaces, requiring minimal code.

Contributors

4,414

Organizations

696

Software value

$9.4M

Detectron2

Detectron2 is a computer vision library developed by Facebook AI Research (FAIR) that implements state-of-the-art object detection algorithms. It provides a modular, flexible platform for implementing and training computer vision models, with support for tasks like object detection, instance segmentation, keypoint detection, and panoptic segmentation.

Contributors

4,292

Organizations

606

Software value

$2.3M

Llama Models

A collection of large language models (LLMs) developed by Meta AI, including the Llama family of models. These models are designed for natural language processing tasks and are made available for research and commercial use under specific licensing terms.

Contributors

3,952

Organizations

676

Software value

$425K

CatBoost

CatBoost is a high-performance, open-source gradient boosting library developed by Yandex that implements gradient boosting on decision trees. It provides fast, scalable, and accurate machine learning algorithms for classification, regression, and ranking tasks, with built-in support for categorical features.

Contributors

3,521

Organizations

348

Software value

$236M

LightGBM

LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is designed to be distributed and efficient, offering faster training, lower memory usage, better accuracy, support for parallel and GPU learning, and the ability to handle large-scale data.

Contributors

3,139

Organizations

482

Software value

$3.4M

DeepRec

The mission of the Project is to develop a high-performance recommendation deep learning framework.

Contributors

2,726

Organizations

250

Software value

$149M

Sentence Transformers

Sentence Transformers is a Python framework for state-of-the-art sentence and text embeddings. It provides easy-to-use methods to compute dense vector representations for sentences, paragraphs and images, enabling semantic similarity comparisons and information retrieval tasks.

Contributors

2,492

Organizations

458

Software value

$2.3M

Whisper

Whisper is an automatic speech recognition (ASR) system developed by OpenAI that can transcribe and translate spoken language from audio into text. It is trained on a large dataset of multilingual speech data and can handle various languages, accents, and acoustic environments.

Contributors

2,140

Organizations

294

Software value

$606K

FATE Project

FATE is an open-source project initiated by WeBank's AI Department to provide a secure computing framework to support the federated AI ecosystem.

Contributors

1,642

Organizations

108

Software value

$36M

Stable Baselines3 (SB3)

Stable Baselines3 (SB3) is a reliable implementation of reinforcement learning algorithms in PyTorch. It provides a set of high-quality implementations of state-of-the-art reinforcement learning algorithms, including PPO, A2C, DQN, and SAC. The library focuses on providing clean, documented, and reliable implementations for research and development in reinforcement learning.

Contributors

1,446

Organizations

227

Software value

$745K

GGML

GGML is a tensor library for machine learning that enables efficient neural network inference on CPU. It provides low-level primitives for implementing deep learning models with a focus on performance and memory efficiency, particularly for running large language models on consumer hardware.

Contributors

1,421

Organizations

268

Software value

$6.7M

Ludwig

Ludwig is an open-source, declarative machine learning framework that makes it easy to define deep learning pipelines with a simple and flexible data-driven configuration system. Ludwig is a low-code framework for building custom AI models like LLMs and other deep neural networks.

Contributors

942

Organizations

155

Software value

$8.7M

Adversarial Robustness Toolbox

Adversarial Robustness Toolbox (ART) provides tools that enable developers and researchers to evaluate, defend, certify, and verify machine learning models and applications against adversarial threats.

Contributors

708

Organizations

Software value

$5.3M

Recommenders

The mission of the Project is to develop examples and best practices for building recommendation systems, provided as Jupyter notebooks.

Contributors

525

Organizations

111

Software value

$3M

Natural Language Toolkit (NLTK)

NLTK (Natural Language Toolkit) is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.

Contributors

507

Organizations

Software value

$4.1M

Elyra

The mission of the Project is to create and maintain an open-source development workspace that simplifies the creation and orchestration of AI model development lifecycle tasks.

Contributors

488

Organizations

Software value

$23M

Neural Network (NN) Streamer

Neural Network (NN) Streamer: a stream processing paradigm for neural network apps and devices.

Contributors

445

Organizations

Software value

$30M

OpenFL

The mission of the OpenFL project is to build a flexible, secure, scalable, and easily learnable Federated Learning tool for data scientists and data owners.

Contributors

311

Organizations

Software value

$2.5M

DocArray

The mission of the DocArray project is to develop a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, and 3D mesh.

Contributors

308

Organizations

Software value

$1.6M

Faiss

Faiss is a library for efficient similarity search and clustering of dense vectors, developed by Facebook Research. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also includes support for different similarity metrics and various optimization methods for fast and accurate vector search.

Contributors

265

Organizations

Software value

$4.8M

SapientML

The mission of the Project is to help data scientists rapidly create and amend AI models.

Contributors

Organizations

Software value

$1.5M

BeyondML

The mission of the Project is to advance the state of artificial intelligence by designing and implementing an open-source framework for developing sparse, optimized neural networks capable of efficiently performing multiple tasks across multiple data domains.

Contributors

Organizations

Software value

$6M

Apache Mahout

Mirror of Apache Mahout

This project hasn't been onboarded to LFX Insights.

BugBug

Platform for Machine Learning projects on Software Engineering

This project hasn't been onboarded to LFX Insights.

Caffe

Caffe: a fast open framework for deep learning.

This project hasn't been onboarded to LFX Insights.

Chainer

A flexible framework of neural networks for deep learning

This project hasn't been onboarded to LFX Insights.

DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

This project hasn't been onboarded to LFX Insights.

FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

This project hasn't been onboarded to LFX Insights.

FEDML

FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.

This project hasn't been onboarded to LFX Insights.

Flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

This project hasn't been onboarded to LFX Insights.

GNINA

A deep learning framework for molecular docking

This project hasn't been onboarded to LFX Insights.

GPy

Gaussian processes framework in Python

This project hasn't been onboarded to LFX Insights.

GROBID

Machine learning software for extracting information from scholarly documents

This project hasn't been onboarded to LFX Insights.

Generative AI with Gemini on Vertex AI

Sample code and notebooks for Generative AI on Google Cloud, with Gemini on Vertex AI

This project hasn't been onboarded to LFX Insights.

Gensim

Topic Modelling for Humans

This project hasn't been onboarded to LFX Insights.

Gloo

Collective communications library with various primitives for multi-machine training.

This project hasn't been onboarded to LFX Insights.

Graph Data Science

Source code for the Neo4j Graph Data Science library of graph algorithms.

This project hasn't been onboarded to LFX Insights.

Hongbo Miao R&D Lab

A personal research and development (R&D) lab that facilitates the sharing of knowledge.

This project hasn't been onboarded to LFX Insights.

Hugging Face JS

Utilities to use the Hugging Face Hub API

This project hasn't been onboarded to LFX Insights.

Kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

This project hasn't been onboarded to LFX Insights.

Kornia

🐍 Geometric Computer Vision Library for Spatial AI

This project hasn't been onboarded to LFX Insights.

LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

This project hasn't been onboarded to LFX Insights.

Llama Cookbook

Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family and how to use the models on various provider services.

This project hasn't been onboarded to LFX Insights.

Lux.jl

Elegant and Performant Scientific Machine Learning in Julia

This project hasn't been onboarded to LFX Insights.

MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android app: MNN-LLM-Android (./apps/Android/MnnLlmChat/README.md)

This project hasn't been onboarded to LFX Insights.

NNlib

Neural Network primitives with multiple backends

This project hasn't been onboarded to LFX Insights.

NetKet

Machine learning algorithms for many-body quantum systems

This project hasn't been onboarded to LFX Insights.

Nilearn

Machine learning for NeuroImaging in Python

This project hasn't been onboarded to LFX Insights.

ONNX Runtime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

This project hasn't been onboarded to LFX Insights.

Optax

Optax is a gradient processing and optimization library for JAX.

This project hasn't been onboarded to LFX Insights.

Optuna

A hyperparameter optimization framework

This project hasn't been onboarded to LFX Insights.

PennyLane

PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Train a quantum computer the same way as a neural network.

This project hasn't been onboarded to LFX Insights.

PySR

High-Performance Symbolic Regression in Python and Julia

This project hasn't been onboarded to LFX Insights.

PyTorch Geometric

Graph Neural Network Library for PyTorch

This project hasn't been onboarded to LFX Insights.
