VieNeu-TTS-v2
VieNeu-TTS-v2 is an advanced on-device Vietnamese Text-to-Speech (TTS) model with instant voice cloning and English-Vietnamese bilingual support.
VieNeu-Codec is the high-performance audio engine built specifically for the upcoming VieNeu-TTS v2. It is a neural audio codec trained on over 20,000 hours of diverse Vietnamese and English speech data, ensuring state-of-the-art robustness, natural prosody, and crystal-clear audio reconstruction.
This repository provides optimized ONNX exports of VieNeu-Codec for production use.
vieneu_decoder.onnx: (FP32) High-fidelity audio decoder for maximum quality.
vieneu_decoder_int8.onnx: (INT8) Quantized decoder for fast CPU inference.

Combine the speaker embedding with content tokens from your LLM (VieNeu-TTS v2):
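import onnxruntime as ort

# ids: content tokens from the LLM; embedding: speaker embedding (both prepared beforehand)
sess_dec = ort.InferenceSession("vieneu_decoder.onnx")
audio = sess_dec.run(None, {
    "content_ids": ids,
    "voice": embedding
})[0]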
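The decoded audio can then be written to disk. The snippet below is a minimal sketch using only the Python standard library, not part of VieNeu-Codec itself: `save_wav` is a hypothetical helper, and it assumes the decoder output is a mono waveform of floats in [-1.0, 1.0]. The sample rate is not stated in this document, so it is a required argument that you must take from the model's own documentation.

```python
import struct
import wave


def save_wav(path, samples, sample_rate):
    """Write mono float samples as a 16-bit PCM WAV file (hypothetical helper).

    Assumes samples are floats nominally in [-1.0, 1.0]; values outside
    that range are clipped. Returns the number of frames written.
    """
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip to the valid range
        frames += struct.pack("<h", int(round(s * 32767)))  # little-endian int16
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)            # mono (assumption)
        wf.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        wf.setframerate(sample_rate)  # must match the codec's output rate
        wf.writeframes(bytes(frames))
    return len(frames) // 2
```

With the `audio` array from the call above, usage might look like `save_wav("out.wav", audio.reshape(-1).tolist(), sample_rate)`. Flattening with `reshape(-1)` assumes a single mono utterance, so check the decoder's output shape before relying on it.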
Author: Pham Nguyen Ngoc Bao
Project: VieNeu-Codec (for VieNeu-TTS v2)
Version: 2.0