LightBlue TTS 🇮🇱
Model Description
LightBlue is a state-of-the-art, lightning-fast Text-to-Speech (TTS) model built from scratch specifically for Hebrew (with English support). It is designed to produce native Israeli-sounding speech, with accurate handling of Nikud (vowel diacritics) and complex homographs, without compromising on inference speed.
It is fast enough to generate an entire 1-hour audiobook in about 3 seconds on an NVIDIA RTX 3090 (1260x real-time).
- Developer: LightBlue TTS
- Language(s): Hebrew (Primary), English
- Model Type: Text-to-Speech (TTS)
- Demo & Website: https://lightbluetts.com/
- GitHub Repository: https://github.com/maxmelichov/Light-BlueTTS
Key Features
- Blazing Fast Inference:
  - 1260x real-time on an NVIDIA RTX 3090 (21 minutes of audio generated per second).
  - 35x real-time on standard CPUs.
  - 20x real-time on Apple M1 chips.
- Native Hebrew Quality: Features a real Israeli accent, correct stress placements, and native-level flow.
- Advanced Contextual Understanding: Passes the "Homograph Test" (e.g., correctly distinguishing between צפה as "watched" vs "floated", or תרד as "spinach" vs "go down").
- Multiple Voices: Includes high-quality voices like Yonatan (Hebrew only) and Rotem.
Uses
Direct Use
- Generating high-quality Hebrew audio from text.
- Real-time TTS applications running on standard CPUs or edge devices.
- Audiobooks, accessibility tools, virtual assistants, and automated broadcasting.
Speed Benchmarks
LightBlue is optimized for extreme speed without sacrificing naturalness:
| Hardware | Speed | Time for 1 Hour of Audio |
|---|---|---|
| NVIDIA RTX 3090 | 1260x real-time | ~3 seconds |
| Standard CPU | 35x real-time | ~1.7 minutes |
| Apple M1 | 20x real-time | ~3 minutes |
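
The last column follows directly from the real-time factor: generation time is the audio duration divided by the speed multiplier. A minimal Python sketch of that arithmetic, using the figures from the table (the function name `seconds_to_generate` is illustrative and is not part of the LightBlue codebase):

```python
def seconds_to_generate(audio_seconds: float, realtime_factor: float) -> float:
    """Wall-clock seconds needed to synthesize `audio_seconds` of speech
    at a given real-time factor (e.g. 35 means 35x faster than playback)."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


# Real-time factors copied from the benchmark table.
BENCHMARKS = {
    "NVIDIA RTX 3090": 1260,
    "Standard CPU": 35,
    "Apple M1": 20,
}

ONE_HOUR = 3600  # seconds of audio

for hardware, factor in BENCHMARKS.items():
    t = seconds_to_generate(ONE_HOUR, factor)
    print(f"{hardware}: {t:.1f} s ({t / 60:.2f} min)")

# NVIDIA RTX 3090: 2.9 s (0.05 min)
# Standard CPU: 102.9 s (1.71 min)
# Apple M1: 180.0 s (3.00 min)
```

The same function can be used to estimate latency for shorter clips, for example `seconds_to_generate(10, 35)` for a 10-second sentence on a CPU. Actual throughput will depend on the specific hardware and batch settings.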
How to Get Started
To use this model, you can clone the official GitHub repository and install the requirements: