🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Updated May 29, 2024 - Python
Elevate your web applications with the power of JavaScript speech synthesis.
TTS for Arabic (FastPitch) in the ONNX format
ViSpeR: Multilingual Audio-Visual Speech Recognition
Free "public domain" resources for human spoken languages (English and others)
Data manipulation and transformation for audio signal processing, powered by PyTorch
SALMONN: Speech Audio Language Music Open Neural Network
Machine learning speaker characteristics
Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
ModelScope: bringing the notion of Model-as-a-Service to life.
A ggml (C++) re-implementation of tortoise-tts. Under construction and seeking contributors.
A suite of speech signal processing tools
VITS-based Voice Conversion focused on simplicity, quality and performance.
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)