Model Card for whisper-small-indo-transcription

Model Details

Model Description

This model is a fine-tuned version of openai/whisper-small, trained on the FLEURS dataset.

Uses

This model transcribes Indonesian audio to text.

How to Get Started with the Model

Use the code below to get started with the model.

Convert to ct2 first

!ct2-transformers-converter --model cobrayyxx/whisper-small-indo-transcription --output_dir cobrayyxx/whisper-small-indo-transcription-ct2 --copy_files tokenizer.json preprocessor_config.json --quantization float16
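The leading `!` only works inside a notebook cell. Outside a notebook, the same conversion could be launched from Python roughly as sketched below. The helper names `build_convert_command` and `convert` are hypothetical; the flags are copied from the command above, and the sketch assumes the `ct2-transformers-converter` executable is on the PATH.

```python
import subprocess
from typing import List


def build_convert_command(model_id: str, output_dir: str,
                          quantization: str = "float16") -> List[str]:
    """Build the ct2-transformers-converter argument list used in this card."""
    return [
        "ct2-transformers-converter",
        "--model", model_id,
        "--output_dir", output_dir,
        # Extra files copied next to the converted weights.
        "--copy_files", "tokenizer.json", "preprocessor_config.json",
        "--quantization", quantization,
    ]


def convert(model_id: str, output_dir: str) -> None:
    """Run the converter and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_convert_command(model_id, output_dir), check=True)
```

Calling `convert("cobrayyxx/whisper-small-indo-transcription", "cobrayyxx/whisper-small-indo-transcription-ct2")` would then reproduce the notebook command.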

Load the ct2 model

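from faster_whisper import WhisperModel
# Path to the converted model directory (the --output_dir from the conversion step)
model_transcribe = WhisperModel("cobrayyxx/whisper-small-indo-transcription-ct2", device="cpu", compute_type="float32")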
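Once the converted model loads, transcribing a file could look roughly like the sketch below. The helpers `join_segments` and `transcribe_file` are hypothetical names, and `language="id"` and `beam_size=5` are assumed settings rather than values taken from this card.

```python
from typing import Iterable

# Directory written by the conversion step (its --output_dir value).
CT2_MODEL_DIR = "cobrayyxx/whisper-small-indo-transcription-ct2"


def join_segments(segments: Iterable) -> str:
    """Concatenate the text of each decoded segment into one transcript."""
    return " ".join(seg.text.strip() for seg in segments).strip()


def transcribe_file(audio_path: str, model_dir: str = CT2_MODEL_DIR) -> str:
    """Transcribe one audio file with the converted model on CPU."""
    # Imported lazily so join_segments can be used without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_dir, device="cpu", compute_type="float32")
    # language="id" pins decoding to Indonesian instead of auto-detecting it (assumption).
    segments, _info = model.transcribe(audio_path, language="id", beam_size=5)
    # segments is a generator; decoding happens while it is consumed here.
    return join_segments(segments)
```

For example, `print(transcribe_file("sample.wav"))` would print the transcript of a local file, where `sample.wav` is a placeholder path.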

Training Details

Model Details

Model Overview

Framework: Hugging Face Transformers
Training Steps: 100 steps
Epochs: Approximately 0.56
Training Loss: 0.3916
Model Purpose: Automatic speech recognition (transcription of Indonesian audio)

Performance Metrics

Train Runtime: 458.31 seconds
Train Samples per Second: 3.491
Train Steps per Second: 0.218
Total Floating Point Operations (FLOPs): 4.62 × 10^17
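The reported figures can be cross-checked against each other with a little arithmetic. The sketch below uses only the numbers listed in this card. The derived per-step sample count and training-set size are estimates implied by those numbers, not values reported by the training run, and the epoch count is itself approximate.

```python
# Figures as reported in this card.
train_steps = 100
train_runtime_s = 458.31
samples_per_second = 3.491
epochs = 0.56  # approximate, as stated in the card

# Steps per second should match the reported 0.218.
steps_per_second = train_steps / train_runtime_s

# Total samples processed, and the implied samples per optimizer step
# (the effective batch size, assuming it was constant across steps).
samples_seen = samples_per_second * train_runtime_s
samples_per_step = samples_seen / train_steps

# Rough training-set size implied by the fraction of an epoch covered.
implied_train_set_size = samples_seen / epochs

print(f"steps/s: {steps_per_second:.3f}")
print(f"samples seen: {samples_seen:.0f}")
print(f"samples per step: {samples_per_step:.1f}")
print(f"implied training-set size: {implied_train_set_size:.0f}")
```

The recomputed steps per second agrees with the reported value, and the throughput implies about 16 samples per step.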

Next Steps

Evaluate the model; no evaluation results are available yet.
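This card does not name an evaluation metric. Word error rate (WER) is a common choice for speech recognition, so a minimal sketch of it is given here as one possible starting point. The function name `word_error_rate` and the lowercase, whitespace-based tokenization are assumptions; a real evaluation would likely apply a more careful text normalizer.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Averaging over a test set is usually done by summing edit distances and reference lengths across utterances rather than averaging per-utterance rates.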

Citation

@misc{radford2022whisper,
  doi = {10.48550/ARXIV.2212.04356},
  url = {https://arxiv.org/abs/2212.04356},
  author = {Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
  title = {Robust Speech Recognition via Large-Scale Weak Supervision},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}