whisper.cpp Models for Mobilint NPU

This repository provides all model files needed to run whisper.cpp-mblt, the Mobilint NPU-accelerated fork of whisper.cpp.

Available Files

| Model | File | Size | Description |
|---|---|---|---|
| whisper-small | ggml-small.bin | 466 MB | GGML model (tokenizer + weights for CPU fallback) |
| whisper-small | ggml-small-encoder.mxq | 93 MB | Mobilint NPU encoder |
| whisper-small | ggml-small-decoder.mxq | 159 MB | Mobilint NPU decoder |
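After downloading, a quick sanity check that all three files landed in the model directory can be sketched as follows (filenames are taken from the table above; the sizes are approximate and listed only for reference, so this checks presence, not byte counts):

```python
from pathlib import Path

# Files listed in the table above; sizes are approximate MB values
# and are kept only for display -- presence is what we check.
EXPECTED = {
    "ggml-small.bin": 466,
    "ggml-small-encoder.mxq": 93,
    "ggml-small-decoder.mxq": 159,
}

def missing_files(model_dir: str) -> list[str]:
    """Return the expected model files that are absent from model_dir."""
    d = Path(model_dir)
    return sorted(name for name in EXPECTED if not (d / name).is_file())

print(missing_files("."))
```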

Usage

NPU Inference (Mobilint)

```sh
# Download all files and run
whisper-cli-mblt \
  -m ggml-small.bin \
  --mxq-encoder ggml-small-encoder.mxq \
  --mxq-decoder ggml-small-decoder.mxq \
  -f audio.wav

# Or auto-download from Hugging Face
whisper-cli-mblt -hf mobilint/whisper-small -f audio.wav
```

CPU Inference (standard whisper.cpp)

The ggml-small.bin file is also compatible with standard whisper.cpp for CPU-only inference:

```sh
whisper-cli -m ggml-small.bin -f audio.wav
```

Model Details

  • Base model: openai/whisper-small (244M parameters)
  • Languages: 99 languages supported (English, Chinese, German, Spanish, Russian, Korean, French, Japanese, Portuguese, Turkish, Polish, and more)
  • Tasks: Transcription and translation (to English)
  • NPU pipeline: Audio → mel spectrogram (CPU) → encoder (NPU, global4) → decoder (NPU, single core, greedy) → text
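The pipeline bullet above can be sketched in Python to make the data flow concrete. Every function here is an illustrative stand-in, not a whisper.cpp-mblt API: the stages are stubbed with placeholder math so only the ordering (mel on CPU, encoder then greedy decoder on NPU) is shown.

```python
import numpy as np

# Illustrative stand-ins for the pipeline stages described above.
# These are NOT whisper.cpp-mblt APIs -- just a sketch of the data flow.

def log_mel_spectrogram(audio: np.ndarray, n_mels: int = 80) -> np.ndarray:
    # CPU stage: Whisper uses an 80-bin log-mel spectrogram
    # with a 160-sample hop at 16 kHz. Stubbed as zeros here.
    return np.zeros((n_mels, len(audio) // 160), dtype=np.float32)

def npu_encode(mel: np.ndarray) -> np.ndarray:
    # Encoder stage (runs on the NPU, global4, in the real pipeline).
    return mel.mean(axis=0, keepdims=True)  # placeholder "features"

def npu_decode_greedy(features: np.ndarray, max_tokens: int = 8) -> list[int]:
    # Decoder stage (single NPU core) with greedy sampling:
    # at each step, emit the highest-probability token.
    tokens = [50258]  # Whisper's <|startoftranscript|> token id
    for _ in range(max_tokens):
        # Placeholder logits over Whisper's ~51865-token vocabulary,
        # seeded so the sketch is deterministic.
        logits = np.random.default_rng(len(tokens)).normal(size=51865)
        tokens.append(int(np.argmax(logits)))
    return tokens

audio = np.zeros(16000, dtype=np.float32)  # 1 s of silence at 16 kHz
mel = log_mel_spectrogram(audio)           # CPU
tokens = npu_decode_greedy(npu_encode(mel))  # NPU encoder -> NPU decoder
print(len(tokens))  # start token + max_tokens greedy steps
```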

Related Repositories

License

Apache 2.0 (same as the original OpenAI Whisper model)
