# IntentGuard: Financial Services
Production-ready vertical intent classifier for LLM chatbot guardrails. Classifies user messages as allow, deny, or abstain to keep financial services chatbots on-topic and secure.
## IntentGuard Model Family
IntentGuard provides specialized intent classifiers for high-stakes verticals where chatbot misuse carries regulatory, legal, or safety risk:
| Model | Vertical | Accuracy | Off-Topic Pass Rate | Link |
|---|---|---|---|---|
| intentguard-finance | Financial Services | 99.6% | 0.00% | This model |
| intentguard-healthcare | Healthcare & Clinical | 98.9% | 0.98% | perfecXion/intentguard-healthcare |
| intentguard-legal | Legal & Compliance | 97.9% | 0.50% | perfecXion/intentguard-legal |
## Overview

### The Problem
Enterprise chatbots in regulated industries face a critical challenge: users inevitably ask off-topic questions (sports, entertainment, relationship advice) that the underlying LLM will happily answer, exposing the organization to compliance risk, brand damage, and potential liability.
Traditional keyword filters miss nuanced off-topic queries, while LLM-based guardrails are too slow and expensive for real-time inference.
### The Solution
IntentGuard uses a tiny, purpose-trained DeBERTa-v3-xsmall model (22M parameters, 2.5MB quantized) to classify user intent in <30ms on CPU. The three-way classification (allow/deny/abstain) enables precise control:
- Allow → On-topic for the vertical, pass to the LLM
- Deny → Clearly off-topic, block with a polite redirect
- Abstain → Ambiguous, escalate to a secondary classifier or human review
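The three-way decision maps naturally onto a small dispatcher in the application layer. The sketch below is illustrative, not part of the IntentGuard API: `classify`, `llm_respond`, and `escalate` are placeholder callables you would supply.

```python
def route(message, classify, llm_respond, escalate):
    """Dispatch a user message based on the guardrail's three-way label."""
    label = classify(message)  # expected: "allow" | "deny" | "abstain"
    if label == "allow":
        return llm_respond(message)  # on-topic: pass through to the LLM
    if label == "deny":
        # off-topic: block with a polite redirect instead of an LLM answer
        return "I can only help with financial questions."
    return escalate(message)  # ambiguous: secondary classifier or human review

# Tiny stand-ins to exercise the dispatcher
reply = route(
    "Who won the Super Bowl?",
    classify=lambda m: "deny",
    llm_respond=lambda m: f"LLM: {m}",
    escalate=lambda m: f"escalated: {m}",
)
```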
## Performance
| Metric | Value |
|---|---|
| Overall Accuracy | 99.6% |
| Legitimate Block Rate | 0.00% (no false positives) |
| Off-Topic Pass Rate | 0.00% (no false negatives) |
| p99 Latency (CPU) | <30ms |
| Model Size (ONNX INT8) | 2.5MB |
| Base Parameters | 22M (DeBERTa-v3-xsmall) |
| Expected Calibration Error | <0.03 |
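The Expected Calibration Error row measures how closely the model's reported confidence tracks its actual accuracy. As a reference for how such a number is typically computed, here is a standard binned-ECE sketch (this is a generic illustration, not perfecXion's evaluation code):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean |accuracy - mean confidence| per bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = correct[mask].mean()    # empirical accuracy in this bin
            conf = confidences[mask].mean()  # average reported confidence
            ece += mask.mean() * abs(acc - conf)  # weight by bin population
    return ece

# Toy example: two confidence levels, accuracy slightly off confidence
ece = expected_calibration_error([0.95, 0.95, 0.55, 0.55], [1, 1, 1, 0])
```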
## Classification Decision Framework
```
User Message → Tokenize → DeBERTa Inference → Softmax
                              │
              ┌───────────────┼───────────────┐
              │               │               │
            ALLOW           DENY           ABSTAIN
         (on-topic)     (off-topic)     (uncertain)
              │               │               │
        Pass to LLM   Block + Redirect    Escalate
```
## Model Details
| Property | Value |
|---|---|
| Architecture | DeBERTa-v3-xsmall (fine-tuned for 3-way classification) |
| Format | ONNX (INT8 quantized) |
| Version | 1.0 |
| Vertical | Finance (Financial Services) |
| Training | Supervised fine-tuning on curated intent datasets |
| Quantization | INT8 via ONNX Runtime |
| GPU Required | No (CPU-only inference) |
| Publisher | perfecXion.ai |
### Core Topics (Allow)
Banking, lending, credit, payments, investing, insurance, tax, personal finance, retirement, mortgages, financial planning, budgeting
### Hard Exclusions (Deny)
Sports, entertainment, cooking, gaming, celebrity gossip, fashion, travel/leisure, fiction writing, relationship advice
## Usage

### Python (ONNX Runtime)
```python
import onnxruntime as ort
from transformers import AutoTokenizer
import numpy as np

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("perfecXion/intentguard-finance")
session = ort.InferenceSession("model.onnx")

# Classify a user message
text = "What are the current mortgage rates for a 30-year fixed loan?"
inputs = tokenizer(text, return_tensors="np", max_length=128, truncation=True, padding="max_length")
logits = session.run(None, {
    "input_ids": inputs["input_ids"],
    "attention_mask": inputs["attention_mask"]
})[0]

labels = ["allow", "deny", "abstain"]
prediction = labels[np.argmax(logits)]
confidence = float(np.max(np.exp(logits) / np.sum(np.exp(logits))))
print(f"Intent: {prediction} (confidence: {confidence:.3f})")
# Output: Intent: allow (confidence: 0.998)
```
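A common hardening step on top of the raw argmax (not part of the example above) is a numerically stable softmax plus a confidence floor, so that low-confidence allow/deny calls are demoted to abstain. The threshold value here is illustrative, not a recommendation from the model card:

```python
import numpy as np

LABELS = ["allow", "deny", "abstain"]

def decide(logits, min_confidence=0.90):
    """Map raw logits to a label, demoting low-confidence calls to abstain."""
    z = np.asarray(logits, dtype=float).ravel()
    z = z - z.max()                      # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum()  # softmax over the 3 classes
    label = LABELS[int(np.argmax(probs))]
    confidence = float(probs.max())
    if confidence < min_confidence:
        label = "abstain"                # too uncertain to act on
    return label, confidence

label, conf = decide([5.0, -2.0, -3.0])
```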
### Docker
```bash
# Pull and run the container
docker pull ghcr.io/perfecxion/intentguard:finance-1.0
docker run -p 8080:8080 ghcr.io/perfecxion/intentguard:finance-1.0

# Classify a message
curl -X POST http://localhost:8080/v1/classify \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What are the current mortgage rates?"}]}'
# Response: {"intent": "allow", "confidence": 0.998}
```
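The same endpoint can be called from Python with only the standard library. The request path and payload shape mirror the curl example above; the response fields (`intent`, `confidence`) are assumed from the sample response, so treat this as a sketch rather than an official client:

```python
import json
import urllib.request

def classify_remote(text, url="http://localhost:8080/v1/classify"):
    """POST a single user message to the container's /v1/classify endpoint."""
    payload = {"messages": [{"role": "user", "content": text}]}
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Expected shape (from the sample above): {"intent": ..., "confidence": ...}
        return json.loads(resp.read())

# Request body identical to the curl example
payload = {"messages": [{"role": "user", "content": "What are the current mortgage rates?"}]}
```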
### pip
```bash
pip install intentguard
```

```python
from intentguard import IntentGuard

guard = IntentGuard.load("finance")
result = guard.classify("What are the current mortgage rates?")
print(result)  # Intent(label='allow', confidence=0.998)
```
## Example Classifications
| User Message | Predicted | Confidence | Correct? |
|---|---|---|---|
| "What are mortgage rates for a 30-year fixed?" | allow | 0.998 | ✓ |
| "How do I open a Roth IRA?" | allow | 0.997 | ✓ |
| "Who won the Super Bowl?" | deny | 0.999 | ✓ |
| "Tell me a joke" | deny | 0.996 | ✓ |
| "Is my health insurance FSA-eligible?" | allow | 0.942 | ✓ (financial context) |
| "What's the weather today?" | deny | 0.998 | ✓ |
## Citation
```bibtex
@misc{thornton2025intentguard,
  title={IntentGuard: A Production-Grade Vertical Intent Classifier for LLM Guardrails},
  author={Thornton, Scott},
  year={2025},
  publisher={perfecXion.ai},
  url={https://perfecxion.ai/articles/intentguard-vertical-intent-classifier-llm-guardrails.html},
  note={Model: https://huggingface.co/perfecXion/intentguard-finance}
}
```
## Quality Metrics
| Metric | Result |
|---|---|
| Accuracy (Finance vertical) | 99.6% |
| Legitimate Block Rate | 0.00% |
| Off-Topic Pass Rate | 0.00% |
| Expected Calibration Error | <0.03 |
| ONNX INT8 Quantization | Validated |
| CPU Inference (p99) | <30ms |
| Docker Container | Available |
## License
Apache 2.0
## Links
- Research Article: IntentGuard: A Production-Grade Vertical Intent Classifier for LLM Guardrails
- Publisher: perfecXion.ai
- Healthcare Model: perfecXion/intentguard-healthcare
- Legal Model: perfecXion/intentguard-legal
- Docker Image: ghcr.io/perfecxion/intentguard:finance-1.0