Top 6 Open Source TTS Projects: Find Your Ideal Voice Model

July 12, 2024

by kevin

The world of open source AI projects can be dizzying, with countless options vying for your attention. As developers who help clients build AI software, we understand the challenge of sifting through myriad projects to find the gems: those with real-world applicability and commercial value.

Many of you have shared your struggles in navigating this landscape, unsure which projects are usable or how to choose between similar ones. That’s why we’ve committed to curating lists of the best open source AI projects, so you can make informed decisions quickly.

We welcome your input! Please share any additions or corrections in the comments. And if you’d like to dive deeper, scan the QR code to join our AI discussion group (be sure to include your occupation).

Note: The order here does not imply ranking. Choose what suits your needs best.

1. Fish Speech

GitHub Stars: 5.1k
Last Updated: July 11
Function: Voice cloning
Features: Trained primarily on Chinese, where it gives good results. Struggles with long audio, and English support is weak.
Link: https://github.com/fishaudio/fish-speech

2. ChatTTS

GitHub Stars: 27.5k
Last Updated: July 8
Function: Text-to-speech with multi-speaker support and fine-grained control over prosody such as laughter, pauses, and intonation.
Features: Supports Chinese and English only. Control over laughter, pauses, and interjections is somewhat inaccurate. Parameters are adjustable, but the same settings may yield different output from run to run.
Link: https://github.com/2noise/ChatTTS

3. MARS5-TTS

GitHub Stars: 2.2k
Last Updated: July 5
Function: Voice cloning
Features: Needs only a short reference sample (2 to 12 seconds). Offers deep and shallow cloning options. Emotions sound more realistic. Clones a single speaker only; no dialogue support.
Link: https://github.com/camb-ai/mars5-tts

4. GPT-SoVITS

GitHub Stars: 29k
Last Updated: July 11
Function: Voice cloning
Features: Fine-tunes the model on as little as 1 minute of training data for better similarity and realism. A 5-second sample is enough for instant text-to-speech. Supports Chinese, English, and Japanese. Models trained on a Mac GPU perform significantly worse than those trained on other devices. Runs locally, with no internet connection needed.
Link: https://github.com/RVC-Boss/GPT-SoVITS

5. IMS-Toucan

GitHub Stars: 1.3k
Last Updated: July 9
Function: Text-to-speech
Features: Covers about 7,000 languages, including dialects. Written in pure Python and PyTorch. Human-in-the-loop editing lets you fine-tune the speech to taste. Setup can be complex, especially on systems other than Linux.
Link: https://github.com/DigitalPhonetics/IMS-Toucan

6. OpenVoice

GitHub Stars: 27.2k
Last Updated: July 6
Function: Voice cloning
Features: Adjusts emotion, speaking style, pauses, and similar parameters. Supports cross-lingual voice cloning. Chinese support is weak. Allows commercial use.
Link: https://github.com/myshell-ai/OpenVoice
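
Since the order above does not imply ranking, one way to narrow the list is to treat it as data and filter by what you need. The following is a minimal Python sketch of that idea, not a tool that ships with any of these projects. The `Project` class, the `PROJECTS` list, and the `shortlist` function are hypothetical names introduced for illustration; the star counts and functions are copied from the entries above, and labeling ChatTTS as "text-to-speech" is our assumption, since its entry describes features rather than a category.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Project:
    name: str
    stars_k: float  # GitHub stars in thousands, as listed in this article
    function: str   # "voice cloning" or "text-to-speech"


# Values transcribed from the list above (snapshot as of the article date).
PROJECTS = [
    Project("Fish Speech", 5.1, "voice cloning"),
    Project("ChatTTS", 27.5, "text-to-speech"),  # category is an assumption
    Project("MARS5-TTS", 2.2, "voice cloning"),
    Project("GPT-SoVITS", 29.0, "voice cloning"),
    Project("IMS-Toucan", 1.3, "text-to-speech"),
    Project("OpenVoice", 27.2, "voice cloning"),
]


def shortlist(function: Optional[str] = None,
              min_stars_k: float = 0.0) -> List[Project]:
    """Return matching projects, most-starred first."""
    matches = [
        p for p in PROJECTS
        if (function is None or p.function == function)
        and p.stars_k >= min_stars_k
    ]
    return sorted(matches, key=lambda p: p.stars_k, reverse=True)


if __name__ == "__main__":
    # Example: voice cloning projects with at least 10k stars.
    for p in shortlist(function="voice cloning", min_stars_k=10):
        print(f"{p.name}: {p.stars_k}k stars")
```

Star count is only a rough popularity signal, so a filter like this is a starting point; the language support and limitations noted under each entry still need to be checked against your own use case.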

This is just a sampling of the many open source TTS projects out there. Please share your own finds in the comments or join our group chat. For those who prioritize quality and stability over running models locally, there are also excellent closed-source options such as ElevenLabs and HeyGen’s voice cloning service (which uses ElevenLabs).

We’ll continue curating project lists for various AI functions. We’re also considering roundups of deployable projects or teardowns of successful ones. Let us know if you have suggestions!
