Qwen3-ASR Arabic - UAE Emirati Dialect
Fine-tuned Qwen/Qwen3-ASR-1.7B for UAE Emirati Arabic dialect speech recognition.
Results
| Metric | Zero-shot (base) | Fine-tuned | Relative change |
|---|---|---|---|
| WER | 13.53% | 9.98% | -26% |
| CER | 3.33% | 2.55% | -23% |
Evaluated on 2,497 UAE Arabic validation samples.
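
The last column of the table is the relative change between the two error rates, not the absolute difference in percentage points. The sketch below reproduces that arithmetic and shows one common way to compute WER and CER from edit distance. The card does not state which scoring tool was used, so `edit_distance`, `error_rate`, and `relative_change` are illustrative names, and corpus-level pooling (total edits over total reference units) is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate: total edits divided by total reference units.

    unit="word" gives WER; any other value scores per character (CER).
    Assumption: spaces count as characters for CER.
    """
    edits = total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        edits += edit_distance(ref_units, hyp_units)
        total += len(ref_units)
    return edits / total


def relative_change(base, tuned):
    """Relative change in percent; negative means the error rate went down."""
    return (tuned - base) / base * 100


print(round(relative_change(13.53, 9.98)))  # -26 (WER)
print(round(relative_change(3.33, 2.55)))   # -23 (CER)
```

In absolute terms the fine-tuned model lowers WER by 3.55 percentage points and CER by 0.78 percentage points.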
What improved
- Matches informal Emirati dialect style (e.g. شي vs شيء, and a bare alef ا vs alef with hamza أ)
- Removes spurious punctuation that the base model adds
- Better handling of dialect-specific words and expressions
Training Details
- Base model: Qwen/Qwen3-ASR-1.7B (2B params, audio encoder + 1.7B LLM decoder)
- Training data: ~22,500 UAE Emirati Arabic dialect samples from vadimbelsky/UAE_Arabic_English_Bilingual_Dataset_40k
- Strategy: Audio encoder frozen, only LLM decoder fine-tuned (84.4% of params)
- Precision: bfloat16
- Epochs: 3
- Effective batch size: 32 (batch 2 × gradient accumulation 16)
- Learning rate: 2e-5 with linear schedule
- Gradient checkpointing: enabled
- Text normalization: Diacritics removed, alef/teh marbuta normalized, punctuation stripped
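
The text normalization step can be sketched as follows. The card names the three operations but not the exact mappings, so the choices here are assumptions: alef variants (أ, إ, آ) are folded to bare alef, teh marbuta (ة) is mapped to heh (ه), and punctuation is replaced by a space before whitespace is collapsed. The function name `normalize_arabic` is hypothetical.

```python
import re
import unicodedata

# Arabic diacritics: fathatan through sukun (U+064B-U+0652) plus superscript alef (U+0670).
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")

# Assumed letter folding; the original training script may use a different mapping.
LETTER_MAP = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0622": "\u0627",  # alef with madda -> bare alef
    "\u0629": "\u0647",  # teh marbuta -> heh
})


def normalize_arabic(text):
    """Remove diacritics, fold alef/teh marbuta variants, and strip punctuation."""
    text = DIACRITICS.sub("", text)
    text = text.translate(LETTER_MAP)
    # Unicode categories starting with "P" cover Latin and Arabic punctuation alike.
    text = "".join(" " if unicodedata.category(ch).startswith("P") else ch
                   for ch in text)
    return " ".join(text.split())  # collapse runs of whitespace


print(normalize_arabic("أَهْلاً، مَدْرَسَة!"))
```

Applying the same function to both references and model output before scoring keeps WER and CER from penalizing purely orthographic differences.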
Usage
```python
from qwen_asr import Qwen3ASRModel

model = Qwen3ASRModel.from_pretrained("vadimbelsky/qwen3-asr-arabic-uae")
result = model.transcribe("audio.wav", language="Arabic")
print(result)
```
Or with transformers directly:
```python
from transformers import AutoModelForCausalLM, AutoProcessor

# Load the fine-tuned weights and the matching processor
model = AutoModelForCausalLM.from_pretrained("vadimbelsky/qwen3-asr-arabic-uae")
processor = AutoProcessor.from_pretrained("vadimbelsky/qwen3-asr-arabic-uae")
```
Limitations
- Trained on synthetic/generated Arabic speech data
- Optimized for UAE Emirati dialect; may not generalize to other Arabic dialects
- Short utterances only (training data mostly < 20s)
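
Because the training clips were mostly under 20 seconds, it may help to check a recording's length before transcribing it and to plan shorter segments for long files. The following is a minimal sketch using only the standard library; it is not part of the model's tooling. `wav_duration_seconds` and `chunk_spans` are hypothetical helpers, the file is assumed to be PCM WAV, and fixed-length cuts are a simplification that can split a word in the middle.

```python
import wave

MAX_SECONDS = 20.0  # threshold taken from the limitation above


def wav_duration_seconds(path):
    """Duration of a PCM WAV file: frame count divided by sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def chunk_spans(duration, max_seconds=MAX_SECONDS):
    """Split [0, duration) into consecutive (start, end) spans of at most max_seconds."""
    spans = []
    start = 0.0
    while start < duration:
        end = min(start + max_seconds, duration)
        spans.append((start, end))
        start = end
    return spans


print(chunk_spans(45.0))  # [(0.0, 20.0), (20.0, 40.0), (40.0, 45.0)]
```

Cutting at pauses, for example with a voice activity detector, would likely give better results than fixed-length spans.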
License
Apache 2.0 (same as base model)