
CDLI Parakeet TDT 1.1B English Fine-Tune (lr=5e-5)

This repository contains a NeMo ASR model fine-tuned from nvidia/parakeet-tdt-1.1b on the gated cdli/ugandan_english_nonstandard_speech_v1.0 dataset.

This card documents the stronger 1.1B recovery run using a lower learning rate (5e-5) after the earlier 1e-4 run plateaued early.

Model Details

  • Base model: nvidia/parakeet-tdt-1.1b
  • Fine-tuning framework: NVIDIA NeMo
  • Language: English
  • Acoustic model family: FastConformer-TDT / RNNT-BPE
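
Assuming the exported `.nemo` checkpoint has been downloaded locally, loading it with NeMo's generic ASR restore API might look like the sketch below (the helper name is illustrative; `nemo_toolkit[asr]` must be installed):

```python
def load_finetuned_model(checkpoint_path="EN-PARAKEET-TDT-F1tdt-1-1b.nemo"):
    """Restore the fine-tuned Parakeet TDT model from a local .nemo file.

    The import is deferred so this sketch only requires NeMo at call time.
    """
    from nemo.collections.asr.models import ASRModel
    return ASRModel.restore_from(checkpoint_path)

# Example usage (requires the checkpoint file and NeMo installed):
# model = load_finetuned_model()
# print(model.transcribe(["sample.wav"]))
```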

Dataset

  • Dataset: cdli/ugandan_english_nonstandard_speech_v1.0
  • License: cc-by-sa-4.0
  • Split sizes reported on the source dataset card:
    • train: 5176
    • validation: 638
    • test: 1017

The evaluation artifacts from this run contain 1016 scored rows, one fewer than the full test split.

Training Configuration

  • Work root: /jupyter_kernel/parakeet_cdli_en_5e5
  • Base checkpoint: nvidia/parakeet-tdt-1.1b
  • Max manifest audio length: 40.0 s
  • Max training audio length: 30.0 s
  • Min audio length: 0.2 s
  • Train batch size: 4
  • Eval batch size: 8
  • Gradient accumulation steps: 8
  • Effective train batch size: 32
  • Learning rate: 5e-5
  • Weight decay: 1e-3
  • Warmup steps: 100
  • Scheduler: CosineAnnealing
  • Max steps configured: 20000
  • Early stopping patience: 10
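
The effective train batch size listed above is simply the product of the per-device batch size and the gradient accumulation steps:

```python
# Effective batch size = per-device train batch size × gradient accumulation steps.
train_batch_size = 4
grad_accum_steps = 8
effective_batch_size = train_batch_size * grad_accum_steps  # 32, as listed above
```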

Evaluation

Evaluation was run on the held-out test split using both raw and normalized transcript comparison.
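
The exact normalization scheme used for this run is not reproduced here; a typical ASR text normalizer (lowercasing, punctuation stripping, whitespace collapsing) might look like the following sketch, which is an assumption rather than the scoring script's actual procedure:

```python
import re
import string

def normalize_transcript(text: str) -> str:
    """Illustrative normalizer: lowercase, drop punctuation, collapse spaces.

    This approximates common ASR normalization; it is not necessarily the
    exact normalization behind the reported "normalized" metrics.
    """
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()
```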

Corpus Metrics

  • Raw WER: 31.57%
  • Raw CER: 15.09%
  • Normalized WER: 21.20%
  • Normalized CER: 12.56%

Average Utterance Metrics

  • Average normalized utterance WER (capped at 1.0): 20.70%
  • Average normalized utterance CER (capped at 1.0): 12.58%
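
As a sketch of how a capped average utterance WER can be computed (using a word-level edit distance; this mirrors the capping described above, not necessarily the exact scoring script):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (single-row DP)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            # dp[j] is the previous row's value; dp[j-1] is the current row's.
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (r != h))
    return dp[-1]

def capped_avg_wer(pairs):
    """Average per-utterance WER, with each utterance's WER capped at 1.0."""
    scores = []
    for ref, hyp in pairs:
        ref_words, hyp_words = ref.split(), hyp.split()
        wer = edit_distance(ref_words, hyp_words) / max(len(ref_words), 1)
        scores.append(min(wer, 1.0))
    return sum(scores) / len(scores)
```

Capping keeps a single badly mis-recognized short utterance (WER > 100%) from dominating the average.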

Files

  • EN-PARAKEET-TDT-F1tdt-1-1b.nemo: exported NeMo checkpoint
  • checkpoints/: intermediate training checkpoints
  • test_predictions.csv
  • test_predictions.jsonl
  • test_predictions_scored.csv
  • test_predictions_scored.jsonl
  • test_predictions_grouped_analysis.csv

Notes

  • This 5e-5 run improved substantially over the earlier 1.1B 1e-4 run.
  • Access to the source dataset is gated. Review the dataset terms before requesting access.