2 projects
Hugging Face Datasets
Hugging Face Datasets is a library and ecosystem for accessing and sharing machine learning datasets. It offers tools for downloading, preparing, and efficiently loading datasets in various formats, with support for streaming, caching, and version control. The library integrates with popular ML frameworks and includes a large community-driven repository of public datasets.
4,221
844
$2.5M
Common Voice
Common Voice is an open-source initiative to create a public dataset of diverse voice recordings for training speech recognition systems. Users can donate their voices by recording phrases and can validate others' recordings, making speech technology more accessible across languages.
2,785
398
$812M