metadata
title: Pocket TTS ONNX Web Demo
emoji: π
colorFrom: yellow
colorTo: pink
sdk: static
app_file: index.html
pinned: false
license: cc-by-4.0
short_description: Multilingual Pocket TTS voice cloning in the browser (CPU)
models:
- KevinAHM/pocket-tts-onnx
custom_headers:
  cross-origin-embedder-policy: require-corp
  cross-origin-opener-policy: same-origin
  cross-origin-resource-policy: cross-origin
Pocket TTS Web Demo
Browser-only Pocket TTS inference with multilingual INT8 ONNX bundles and voice cloning.
Supported Bundles
- english_2026-04
- german
- italian
- portuguese
- spanish
The web demo intentionally skips the 24l variants and only ships the current April 2026 English checkpoint.
Features
- Multilingual bundle selector in the UI
- Built-in voices for every shipped language bundle
- Custom voice cloning from uploaded audio
- INT8 ONNX inference in the browser
- Streaming playback with low latency
Bundle Layout
Each language lives under onnx/<language>/ and includes:
- bundle.json
- tokenizer.model
- bos_before_voice.npy
- voices.bin
- mimi_encoder_int8.onnx
- text_conditioner_int8.onnx
- flow_lm_main_int8.onnx
- flow_lm_flow_int8.onnx
- mimi_decoder_int8.onnx
voices.bin is a local browser asset containing the compact built-in voice states for that language bundle.
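As an illustration of the layout above, the sketch below builds the URL of every file in one language bundle. The helper name `bundleFileUrls` and the `baseUrl` parameter are hypothetical and are not taken from the demo's source; only the file names and the onnx/<language>/ path come from this README.

```javascript
// File names copied from the bundle layout listed in this README.
const BUNDLE_FILES = [
  "bundle.json",
  "tokenizer.model",
  "bos_before_voice.npy",
  "voices.bin",
  "mimi_encoder_int8.onnx",
  "text_conditioner_int8.onnx",
  "flow_lm_main_int8.onnx",
  "flow_lm_flow_int8.onnx",
  "mimi_decoder_int8.onnx",
];

// Hypothetical helper: map each bundle file name to its URL under onnx/<language>/.
function bundleFileUrls(language, baseUrl = "./") {
  // Make sure the base ends with exactly one slash before appending the path.
  const root = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
  const urls = {};
  for (const name of BUNDLE_FILES) {
    urls[name] = `${root}onnx/${language}/${name}`;
  }
  return urls;
}
```

For example, `bundleFileUrls("german")["voices.bin"]` yields `./onnx/german/voices.bin`.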
Browser Requirements
- Modern browser with WebAssembly support
- Chrome, Edge, Firefox, or Safari
- Secure context (https:// or localhost)
- Cross-origin isolation headers for threaded ONNX Runtime Web
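The isolation requirement exists because WebAssembly threads rely on `SharedArrayBuffer`, which browsers expose only on cross-origin isolated pages. A minimal sketch of how a page could choose a thread count follows. The helper `pickWasmThreads` and its `maxThreads` cap are assumptions for illustration, not the demo's actual logic.

```javascript
// Hypothetical helper: decide how many WASM threads to request.
// Without cross-origin isolation, fall back to a single thread.
function pickWasmThreads(isolated, hardwareConcurrency, maxThreads = 4) {
  if (!isolated) return 1;
  // Treat a missing or invalid core count as one core.
  const cores =
    Number.isInteger(hardwareConcurrency) && hardwareConcurrency > 0
      ? hardwareConcurrency
      : 1;
  return Math.max(1, Math.min(cores, maxThreads));
}

// In a browser with onnxruntime-web loaded as `ort`, this could be applied as:
//   ort.env.wasm.numThreads = pickWasmThreads(
//     self.crossOriginIsolated,
//     navigator.hardwareConcurrency
//   );
```

If `self.crossOriginIsolated` is `false`, the custom headers are probably not reaching the browser, and inference would run single-threaded under this sketch.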
Voice Cloning
- Select a language bundle.
- Choose a built-in voice or upload your own sample.
- Use a short clean reference clip for best results.
- Generate directly in the browser.
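One way to keep an uploaded sample short is to downmix it to mono and trim it before it is used as a reference. The sketch below assumes the audio is already decoded into per-channel `Float32Array`s of equal length. The helper name `prepareReferenceClip` and the 10-second default are illustrative assumptions; the demo's real preprocessing is not documented here.

```javascript
// Hypothetical helper: average all channels into mono and keep at most maxSeconds.
function prepareReferenceClip(channels, sampleRate, maxSeconds = 10) {
  if (channels.length === 0) throw new Error("no audio channels");
  // Number of frames to keep: the clip length or the cap, whichever is smaller.
  const frames = Math.min(
    channels[0].length,
    Math.floor(sampleRate * maxSeconds)
  );
  const mono = new Float32Array(frames);
  for (const channel of channels) {
    for (let i = 0; i < frames; i++) {
      mono[i] += channel[i] / channels.length;
    }
  }
  return mono;
}

// In a browser, the channels could come from the Web Audio API:
//   const buf = await audioContext.decodeAudioData(arrayBuffer);
//   const channels = Array.from({ length: buf.numberOfChannels },
//     (_, c) => buf.getChannelData(c));
//   const clip = prepareReferenceClip(channels, buf.sampleRate);
```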
File Structure
pocket-tts-web/
├── index.html
├── onnx-streaming.js
├── inference-worker.js
├── PCMPlayerWorklet.js
├── EventEmitter.js
├── sentencepiece.js
├── style.css
└── onnx/
    ├── english_2026-04/
    ├── german/
    ├── italian/
    ├── portuguese/
    └── spanish/
License
- Models and bundled voice assets inherit the Pocket TTS licensing terms from kyutai/pocket-tts.
- Code is Apache 2.0.