88 projects
TensorFlow
TensorFlow is an open-source machine learning framework developed by Google that enables numerical computation and large-scale machine learning. It provides a flexible system for defining and executing computations involving tensors, which are multi-dimensional arrays. The framework supports deep learning and neural networks across multiple platforms and devices.
47,270
6,157
$199M
vLLM
The mission of the Project is to develop an open-source library for fast LLM inference and serving.
22,809
2,436
$75M
scikit-learn
scikit-learn: machine learning in Python
15,327
2,623
$15M
Ultralytics YOLO
Ultralytics YOLO11 🚀
12,549
816
$4M
ONNX
ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.
8,030
996
$51M
LLaMA Factory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
7,282
430
$3M
Caffe
Caffe: a fast open framework for deep learning.
7,098
1,010
$3.5M
spaCy
spaCy is an industrial-strength natural language processing library for Python, designed for production use. It offers fast and accurate syntactic analysis, named entity recognition, text classification, and more. The library includes pre-trained statistical models and word vectors, and supports deep learning integration.
6,491
1,128
$7.8M
JAX
JAX is a high-performance numerical computing and machine learning library that combines NumPy's familiar API with GPU and TPU hardware acceleration. It features automatic differentiation, just-in-time compilation, and enables writing transformable numerical programs.
6,189
1,133
$20M
XGBoost
XGBoost is a scalable, distributed gradient boosting library that provides parallel tree boosting for machine learning tasks. It implements machine learning algorithms under the gradient boosting framework, offering high performance, flexibility and portability across multiple programming languages and platforms.
5,922
832
$6.4M
PyTorch Geometric
Graph Neural Network Library for PyTorch
5,061
719
$5.3M
Gradio
Gradio is an open-source Python library that enables developers to quickly create customizable web interfaces for machine learning models, data processing pipelines, and other Python functions. It allows for easy demo creation and sharing of ML models with drag-and-drop interfaces, requiring minimal code.
4,512
702
$10M
Detectron2
Detectron2 is a computer vision library developed by Facebook AI Research (FAIR) that implements state-of-the-art object detection algorithms. It provides a modular, flexible platform for implementing and training computer vision models, with support for tasks like object detection, instance segmentation, keypoint detection, and panoptic segmentation.
4,298
603
$2.3M
Llama Models
A collection of large language models (LLMs) developed by Meta AI, including the Llama family of models. These models are designed for natural language processing tasks and are made available for research and commercial use under specific licensing terms.
4,004
670
$425K
CatBoost
CatBoost is a high-performance, open-source gradient boosting library developed by Yandex that implements gradient boosting on decision trees. It provides fast, scalable, and accurate machine learning algorithms for classification, regression, and ranking tasks, with built-in support for categorical features.
3,552
345
$242M
ncnn
ncnn is a high-performance neural network inference framework optimized for the mobile platform
3,509
311
$28M
LightGBM
LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is designed to be distributed and efficient, with faster training speed, lower memory usage, better accuracy, support for parallel and GPU learning, and the ability to handle large-scale data.
3,164
483
$3.4M
Gensim
Topic Modelling for Humans
2,811
552
$8.4M
Faiss
Faiss is a library for efficient similarity search and clustering of dense vectors, developed by Facebook Research. It contains algorithms that search in sets of vectors of any size, up to ones that possibly do not fit in RAM. It also includes support for different similarity metrics and various optimization methods for fast and accurate vector search.
2,735
497
$5.1M
DeepRec
The mission of the Project is to develop a high-performance recommendation deep learning framework.
2,726
283
$160M
Natural Language Toolkit (NLTK)
NLTK (Natural Language Toolkit) is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.
2,700
621
$4.1M
Sentence Transformers
Sentence Transformers is a Python framework for state-of-the-art sentence and text embeddings. It provides easy-to-use methods to compute dense vector representations for sentences, paragraphs and images, enabling semantic similarity comparisons and information retrieval tasks.
2,505
459
$2.4M
Flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
2,180
353
$2M
Whisper
Whisper is an automatic speech recognition (ASR) system developed by OpenAI that can transcribe and translate spoken language from audio into text. It is trained on a large dataset of multilingual speech data and can handle various languages, accents, and acoustic environments.
2,166
295
$606K
Kaldi Speech Recognition Toolkit
kaldi-asr/kaldi is the official location of the Kaldi project.
2,044
254
$27M
Optuna
A hyperparameter optimization framework
1,902
329
$1.8M
FATE Project
FATE is an open-source project initiated by WeBank’s AI Department to provide a secure computing framework to support the federated AI ecosystem.
1,639
105
$103M
Tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
1,624
426
$1.6M
GGML
GGML is a tensor library for machine learning that enables efficient neural network inference on CPU. It provides low-level primitives for implementing deep learning models with a focus on performance and memory efficiency, particularly for running large language models on consumer hardware.
1,532
274
$7.9M
Stable Baselines3 (SB3)
Stable Baselines3 (SB3) is a reliable implementation of reinforcement learning algorithms in PyTorch. It provides a set of high-quality implementations of state-of-the-art reinforcement learning algorithms, including PPO, A2C, DQN, and SAC. The library focuses on providing clean, documented, and reliable implementations for research and development in reinforcement learning.
1,460
219
$748K
TorchMetrics
Machine learning metrics for distributed, scalable PyTorch applications.
1,424
354
$3.2M
SpeechBrain
SpeechBrain is an open-source speech toolkit built on PyTorch that provides state-of-the-art speech technologies, including speech recognition, speaker recognition, speech enhancement, multi-microphone signal processing and speech separation. It features a unified, flexible interface for speech research and applications.
1,414
176
$9.4M
mlpack
mlpack is a fast, flexible machine learning library written in C++ that aims to provide fast, extensible implementations of cutting-edge machine learning algorithms. It offers bindings for multiple languages and emphasizes scalability, speed, and ease-of-use through both command-line and C++ interfaces.
1,393
234
$13M
UMAP
UMAP (Uniform Manifold Approximation and Projection) is a dimension reduction technique for machine learning and data visualization. It helps visualize high-dimensional data by finding a low-dimensional representation that preserves the essential structure of the data. The algorithm is particularly effective at preserving both local and global structure in the data while being computationally efficient.
1,364
288
$4.9M
NVIDIA DALI
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
1,260
220
$14M
cuML
cuML - RAPIDS Machine Learning Library
1,097
172
$7.1M
Chainer
A flexible framework of neural networks for deep learning
1,021
210
$7.7M
Ludwig
Ludwig is an open-source, declarative machine learning framework that makes it easy to define deep learning pipelines with a simple and flexible data-driven configuration system. Ludwig is a low-code framework for building custom AI models like LLMs and other deep neural networks.
938
152
$14M
PyTensor
PyTensor allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays.
903
126
$7.5M
Nilearn
Machine learning for NeuroImaging in Python
900
182
$5.7M
PennyLane
PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Train a quantum computer the same way as a neural network.
828
112
$23M
GPy
Gaussian processes framework in Python
769
141
$1.8M
Adversarial Robustness Toolbox
Adversarial Robustness Toolbox (ART) provides tools that enable developers and researchers to evaluate, defend, certify and verify machine learning models and applications against adversarial threats.
709
66
$6.9M
Zygote
Zygote is a source-to-source automatic differentiation (AD) system for the Julia programming language. It enables efficient gradient computation for machine learning and scientific computing applications by directly transforming Julia source code.
646
228
$278K
Turing.jl
Bayesian inference with probabilistic programming.
605
154
$284K
gnina
A deep learning framework for molecular docking
544
74
$4.3M
Recommenders
The mission of the Project is to develop examples and best practices for building recommendation systems, provided as Jupyter notebooks.
515
109
$3M
ilastik
ilastik-shell, applets, and workflows to string them together.
515
76
$5.8M
Elyra
The mission of the Project is to create and maintain an open-source development workspace that simplifies the creation and orchestration of the AI model development lifecycle tasks.
486
94
$23M
PySR
PySR is a symbolic regression tool that uses genetic programming to discover mathematical expressions from data. It can find simple, interpretable equations from complex datasets by optimizing both accuracy and simplicity, making it useful for scientific discovery and machine learning applications.
472
66
$415K
Neural Network (NN) Streamer
Neural Network (NN) Streamer, Stream Processing Paradigm for Neural Network Apps/Devices.
455
50
$34M
XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
424
93
$105M
bugbug
Bugbug is a machine learning tool developed by Mozilla that helps automate bug management and classification in software development. It uses historical bug data to train models that can predict bug characteristics, severity, and component assignments, helping streamline the bug triage process.
411
89
$2.1M
FedML
FedML is an open-source library and MLOps platform for federated learning research and production deployment. It provides a unified framework for distributed/federated training, serving, and mobile/IoT deployment with consistent APIs and modular architecture.
402
57
$7.9M
Enzyme Automatic Differentiator
Enzyme is an automatic differentiation tool that performs reverse-mode AD by using LLVM compiler infrastructure to differentiate programs in languages like Julia, C/C++, and Fortran. It enables efficient gradient computation for machine learning and scientific computing applications.
368
130
$1.8M
NetKet
Machine learning algorithms for many-body quantum systems
316
44
$2.9M
Quarkus LangChain4j
Quarkus LangChain4j is an integration project that enables the use of LangChain4j, a Java framework for building applications with large language models (LLMs), within Quarkus applications. It provides Quarkus extensions and configurations for working with AI/ML models and language processing capabilities.
312
64
$6.8M
DocArray
The mission of the DocArray project is to develop a library for nested, unstructured, multimodal data in transit, including text, image, audio, video, 3D mesh.
309
46
$439M
OpenFL
The mission of the OpenFL project is to build a flexible, secure, scalable and easily learnable Federated Learning tool for data scientists and data owners.
308
42
$2.9M
Lux
Elegant and Performant Scientific Machine Learning in Julia
230
77
$1.7M