Sinama Audio Classifier

A CNN-based audio classification model trained to recognise spoken Cebuano / Sinama words from short audio clips.

Usage

Via Inference API

import requests

API_URL = "https://api-inference.huggingface.co/models/YOUR_USERNAME/sinama-translator"
headers = {"Authorization": "Bearer hf_YOUR_TOKEN"}

with open("audio.wav", "rb") as f:
    response = requests.post(API_URL, headers=headers, data=f.read())

print(response.json())
# [{"label": "ako", "score": 0.95}, ...]
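On success the response is a list of label/score entries, as in the comment above. On failure (for example while the model is still loading) the Inference API typically returns a JSON object with an "error" field instead. The following is a minimal sketch of reading the top label defensively. `top_prediction` is a hypothetical helper, not part of any library, and the second label in the sample is made up for illustration.

```python
def top_prediction(payload):
    """Return (label, score) for the highest-scoring entry.

    `payload` is the parsed JSON from the request above, i.e. response.json().
    """
    # Assumption: failures come back as a dict carrying an "error" key
    if isinstance(payload, dict) and "error" in payload:
        raise RuntimeError(f"Inference API error: {payload['error']}")
    if not payload:
        raise ValueError("empty prediction list")
    best = max(payload, key=lambda item: item["score"])
    return best["label"], best["score"]


# Example with a response shaped like the one shown above
# ("ikaw" is a hypothetical second label)
sample = [{"label": "ako", "score": 0.95}, {"label": "ikaw", "score": 0.03}]
print(top_prediction(sample))  # ('ako', 0.95)
```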

Local inference

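import json

import librosa
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("best_model.keras")
with open("label_map.json") as f:
    # JSON keys are strings; convert them back to integer class indices
    label_map = {int(k): v for k, v in json.load(f).items()}

# preprocess your audio the same way as training …
# `features` must be a batch shaped like the model's input
pred = model.predict(features)
# pred has shape (1, num_classes); take the best class of the first clip
print(label_map[int(np.argmax(pred[0]))])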
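The preprocessing step is not spelled out here, so the following is only a sketch of what it might look like, based on the values listed under Training details (128 mel bins, 4 s clips, 22 050 Hz). The FFT size and hop length (librosa defaults), the dB scaling, and the added batch and channel axes are all assumptions and may not match what was used in training. `pad_or_trim` and `extract_features` are hypothetical helper names.

```python
import numpy as np

SAMPLE_RATE = 22050   # from the training details
CLIP_SECONDS = 4      # from the training details
N_MELS = 128          # from the training details
CLIP_SAMPLES = SAMPLE_RATE * CLIP_SECONDS


def pad_or_trim(y, length=CLIP_SAMPLES):
    """Zero-pad or truncate a 1-D waveform to exactly `length` samples."""
    y = np.asarray(y, dtype=np.float32)
    if len(y) >= length:
        return y[:length]
    return np.pad(y, (0, length - len(y)))


def extract_features(path):
    """Load a clip and return a (1, n_mels, frames, 1) log-mel batch."""
    import librosa  # imported lazily; only needed when reading audio files

    y, _ = librosa.load(path, sr=SAMPLE_RATE, mono=True)
    y = pad_or_trim(y)
    mel = librosa.feature.melspectrogram(y=y, sr=SAMPLE_RATE, n_mels=N_MELS)
    # Assumption: dB scaling relative to the clip maximum; training may differ
    log_mel = librosa.power_to_db(mel, ref=np.max)
    # Add batch and channel axes, as a Conv2D model would expect
    return log_mel[np.newaxis, ..., np.newaxis].astype(np.float32)
```

With this in place, `features = extract_features("audio.wav")` would produce the input for `model.predict`, provided the shapes and scaling agree with the training pipeline.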

Training details

  • Architecture: 3-block CNN (Conv2D → BN → ReLU → MaxPool → Dropout)
  • Features: 128-bin mel spectrogram, 4 s clips, 22 050 Hz sample rate
  • Optimiser: Adam
  • Loss: Categorical cross-entropy
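
The list above can be sketched in Keras roughly as follows. Only the block order, the optimiser, and the loss come from this card. The filter counts, kernel size, dropout rate, and the pooling plus dense softmax head are assumptions chosen for illustration, and `build_model` is a hypothetical name.

```python
BLOCK_FILTERS = (32, 64, 128)  # assumption: filter counts are not given


def build_model(input_shape, num_classes, dropout=0.25):
    """Three Conv2D -> BN -> ReLU -> MaxPool -> Dropout blocks plus a softmax head."""
    import tensorflow as tf  # deferred so the constant above imports cheaply

    layers = tf.keras.layers
    inputs = tf.keras.Input(shape=input_shape)
    x = inputs
    for filters in BLOCK_FILTERS:
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
        x = layers.MaxPooling2D(2)(x)
        x = layers.Dropout(dropout)(x)
    # Assumption: global average pooling followed by a dense softmax classifier
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",  # expects one-hot encoded labels
        metrics=["accuracy"],
    )
    return model
```

A call such as `build_model((128, 173, 1), num_classes=len(label_map))` would fit 4 s clips at 22 050 Hz if the spectrogram uses librosa's default hop length of 512, which is itself an assumption.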