mmBERT-32K Intent Classifier (Merged Model)

Full merged model for intent classification, based on mmBERT-32K-YaRN (32K context, multilingual). The LoRA adapter has been merged into the base model, so it can be loaded directly with transformers for inference without PEFT.

Model Details

  • Base Model: llm-semantic-router/mmbert-32k-yarn
  • Training Method: LoRA (rank 32), merged into the base model
  • Model Size: ~1.2 GB
  • Use Case: Production deployment, Rust/Go inference

Training Data

Categories (14 classes)

biology, business, chemistry, computer science, economics, engineering, health, history, law, math, other, philosophy, physics, psychology
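
The exact index-to-category mapping is stored in the model config rather than in this card. A minimal sketch for printing it, assuming only the standard transformers AutoConfig API:

from transformers import AutoConfig

# Print the id -> category mapping baked into the merged model's config.
config = AutoConfig.from_pretrained("llm-semantic-router/mmbert32k-intent-classifier-merged")
for idx, name in sorted(config.id2label.items()):
    print(idx, name)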

Performance

Metric          Score
Test Accuracy   80.0%

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("llm-semantic-router/mmbert32k-intent-classifier-merged")
tokenizer = AutoTokenizer.from_pretrained("llm-semantic-router/mmbert32k-intent-classifier-merged")

# Inference
inputs = tokenizer("How do neural networks learn?", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    predicted_class = probs.argmax().item()
    confidence = probs[0][predicted_class].item()

# Map the predicted index to its label name (id2label keys are integer ids)
print(f"Category: {model.config.id2label[predicted_class]}, Confidence: {confidence:.2%}")

For Rust/Candle Inference

This merged model is compatible with the candle-binding Rust library for high-performance inference in production systems.
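
How the model is wired into candle-binding is documented in that library. As a minimal sketch, the standard Hugging Face artifacts (config.json, tokenizer.json, model.safetensors) can be fetched locally with huggingface_hub and the resulting directory handed to a Candle-based loader:

from huggingface_hub import snapshot_download

# Download the merged model's files to a local directory for a Candle-based loader.
local_dir = snapshot_download("llm-semantic-router/mmbert32k-intent-classifier-merged")
print(local_dir)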
