9 projects
Coqui TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
2,610
356
$11M
Whisper
Whisper is an automatic speech recognition (ASR) system developed by OpenAI that can transcribe and translate spoken language from audio into text. It is trained on a large multilingual speech dataset and can handle a variety of languages, accents, and acoustic environments.
2,166
295
$606K
Kaldi Speech Recognition Toolkit
kaldi-asr/kaldi is the official location of the Kaldi project.
2,044
254
$27M
SpeechBrain
SpeechBrain is an open-source speech toolkit built on PyTorch that provides state-of-the-art speech technologies, including speech recognition, speaker recognition, speech enhancement, multi-microphone signal processing and speech separation. It features a unified, flexible interface for speech research and applications.
1,414
176
$9.4M
eSpeak NG
eSpeak NG is an open-source speech synthesizer that supports multiple languages and can convert text to speech. It is a fork and continuation of the original eSpeak project, offering improved voice quality, additional language support, and various phonetic improvements.
1,059
172
$2.1M
VOICEVOX
The editor for VOICEVOX, a free, medium-quality text-to-speech software.
298
35
$2.4M
DELTA
DELTA is a deep-learning-based, end-to-end natural language and speech processing platform. It aims to make using, deploying, and developing natural language processing and speech models easy and fast for both academic and industry use cases. DELTA is mainly implemented in TensorFlow and Python 3.
167
19
$2M
Lhotse
Tools for handling speech data in machine learning projects.
torchaudio
Data manipulation and transformation for audio signal processing, powered by PyTorch