Qwen3 TTS 12Hz 1.7B Base — MLX 4-bit

MLX 4-bit quantized conversion of Qwen/Qwen3-TTS-12Hz-1.7B-Base for Apple Silicon inference.

Usage

Used by the speech-swift Qwen3TTS module:

```swift
let model = try await Qwen3TTSModel.fromPretrained(
    modelId: "aufklarer/Qwen3-TTS-12Hz-1.7B-Base-MLX-4bit"
)
let audio = try model.synthesize("Hello, world!")
```

Or from the command line:

```
audio speak "Hello, world!" --model 1.7b -o output.wav
```

Model Details

  • Architecture: Qwen3-TTS (Talker transformer + Code Predictor + Mimi speech tokenizer decoder)
  • Parameters: 1.7B
  • Quantization: 4-bit (MLX, talker + code predictor)
  • Size: ~1.7 GB
  • Sample rate: 24 kHz
  • Codec rate: 12.5 Hz
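
The sample rate and codec rate above imply a fixed upsampling factor: each 12.5 Hz codec frame decodes to 24000 / 12.5 = 1920 audio samples. A minimal sketch of that bookkeeping (plain arithmetic from the numbers on this card, not the model's API):

```python
import math

# Figures from the model card above.
SAMPLE_RATE = 24_000   # output sample rate, Hz
CODEC_RATE = 12.5      # codec frames per second of audio

# Each codec frame the Mimi decoder emits covers a fixed span of samples.
samples_per_frame = int(SAMPLE_RATE / CODEC_RATE)  # 1920

def frames_needed(seconds: float) -> int:
    """Codec frames required to cover a given audio duration."""
    return math.ceil(seconds * CODEC_RATE)

print(samples_per_frame)    # 1920
print(frames_needed(10.0))  # 125 frames for 10 s of audio
```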

Variants

| Variant | Quantization | Size | Model ID |
|---|---|---|---|
| 0.6B 4-bit | 4-bit | ~981 MB | aufklarer/Qwen3-TTS-12Hz-0.6B-Base-MLX-4bit |
| 0.6B 8-bit | 8-bit | ~1.3 GB | aufklarer/Qwen3-TTS-12Hz-0.6B-Base-MLX-8bit |
| 1.7B 4-bit | 4-bit | ~1.7 GB | aufklarer/Qwen3-TTS-12Hz-1.7B-Base-MLX-4bit |
| 1.7B 8-bit | 8-bit | ~2.8 GB | aufklarer/Qwen3-TTS-12Hz-1.7B-Base-MLX-8bit |
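
All four variants follow the same repo naming scheme, so selecting one programmatically is a simple lookup. A hedged sketch (the `pick_repo` helper is illustrative, not part of any released tooling; fetching the weights uses `huggingface_hub.snapshot_download`):

```python
# Map (parameter count, quantization) to the published MLX repo IDs.
# pick_repo is a hypothetical convenience helper for this card's variants.
VARIANTS = {
    ("0.6B", "4bit"): "aufklarer/Qwen3-TTS-12Hz-0.6B-Base-MLX-4bit",
    ("0.6B", "8bit"): "aufklarer/Qwen3-TTS-12Hz-0.6B-Base-MLX-8bit",
    ("1.7B", "4bit"): "aufklarer/Qwen3-TTS-12Hz-1.7B-Base-MLX-4bit",
    ("1.7B", "8bit"): "aufklarer/Qwen3-TTS-12Hz-1.7B-Base-MLX-8bit",
}

def pick_repo(params: str = "1.7B", quant: str = "4bit") -> str:
    """Return the Hub repo ID for the requested size and quantization."""
    return VARIANTS[(params, quant)]

if __name__ == "__main__":
    repo = pick_repo("1.7B", "4bit")
    print(repo)
    # To download the weights locally (requires the huggingface_hub package):
    # from huggingface_hub import snapshot_download
    # snapshot_download(repo_id=repo)
```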

