mmBERT-32K Intent Classifier (Merged Model)

Full merged model for intent classification, based on mmBERT-32K-YaRN (32K context, multilingual). The LoRA adapter has been merged into the base model, so it can be loaded directly with transformers for inference without PEFT.

Model Details

  • Base Model: llm-semantic-router/mmbert-32k-yarn
  • Training Method: LoRA (rank 32), merged into the base model
  • Model Size: ~1.2 GB
  • Use Case: Production deployment, Rust/Go inference

Training Data

Categories (14 classes)

biology, business, chemistry, computer science, economics, engineering, health, history, law, math, other, philosophy, physics, psychology
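
The exact index-to-category mapping is stored in the model config rather than in this card. A minimal sketch for printing it, assuming only the standard transformers AutoConfig API:

from transformers import AutoConfig

# Print the id -> category mapping baked into the merged model's config.
config = AutoConfig.from_pretrained("llm-semantic-router/mmbert32k-intent-classifier-merged")
for idx, name in sorted(config.id2label.items()):
    print(idx, name)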

Performance

Metric          Score
Test Accuracy   80.0%

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("llm-semantic-router/mmbert32k-intent-classifier-merged")
tokenizer = AutoTokenizer.from_pretrained("llm-semantic-router/mmbert32k-intent-classifier-merged")

# Inference
inputs = tokenizer("How do neural networks learn?", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    predicted_class = probs.argmax().item()
    confidence = probs[0][predicted_class].item()

# Map the predicted index to its label name (id2label keys are integer ids)
print(f"Category: {model.config.id2label[predicted_class]}, Confidence: {confidence:.2%}")

For Rust/Candle Inference

This merged model is compatible with the candle-binding Rust library for high-performance inference in production systems.
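
How the model is wired into candle-binding is documented in that library. As a minimal sketch, the standard Hugging Face artifacts (config.json, tokenizer.json, model.safetensors) can be fetched locally with huggingface_hub and the resulting directory handed to a Candle-based loader:

from huggingface_hub import snapshot_download

# Download the merged model's files to a local directory for a Candle-based loader.
local_dir = snapshot_download("llm-semantic-router/mmbert32k-intent-classifier-merged")
print(local_dir)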
