IntentGuard: Financial Services


Production-ready vertical intent classifier for LLM chatbot guardrails. Classifies user messages as allow, deny, or abstain to keep financial services chatbots on-topic and secure.

Research Article | perfecXion.ai | Finance Model | Healthcare Model | Legal Model


IntentGuard Model Family

IntentGuard provides specialized intent classifiers for high-stakes verticals where chatbot misuse carries regulatory, legal, or safety risk:

| Model | Vertical | Accuracy | Off-Topic Pass Rate | Link |
|-------|----------|----------|---------------------|------|
| intentguard-finance | Financial Services | 99.6% | 0.00% | This model |
| intentguard-healthcare | Healthcare & Clinical | 98.9% | 0.98% | perfecXion/intentguard-healthcare |
| intentguard-legal | Legal & Compliance | 97.9% | 0.50% | perfecXion/intentguard-legal |

Overview

The Problem

Enterprise chatbots in regulated industries face a critical challenge: users inevitably ask off-topic questions (sports, entertainment, relationship advice) that the underlying LLM will happily answer, exposing the organization to compliance risk, brand damage, and potential liability.

Traditional keyword filters miss nuanced off-topic queries, while LLM-based guardrails are too slow and expensive for real-time inference.

The Solution

IntentGuard uses a tiny, purpose-trained DeBERTa-v3-xsmall model (22M parameters, 2.5MB quantized) to classify user intent in <30ms on CPU. The three-way classification (allow/deny/abstain) enables precise control:

  • Allow: on-topic for the vertical; pass to the LLM
  • Deny: clearly off-topic; block with a polite redirect
  • Abstain: ambiguous; escalate to a secondary classifier or human review

Performance

| Metric | Value |
|--------|-------|
| Overall Accuracy | 99.6% |
| Legitimate Block Rate | 0.00% (no false positives) |
| Off-Topic Pass Rate | 0.00% (no false negatives) |
| p99 Latency (CPU) | <30ms |
| Model Size (ONNX INT8) | 2.5MB |
| Base Parameters | 22M (DeBERTa-v3-xsmall) |
| Expected Calibration Error | <0.03 |
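
The Expected Calibration Error reported above can be reproduced with the standard binned estimator: bucket predictions by confidence, then take the weighted mean of |accuracy − confidence| per bucket. A minimal sketch (the bin count and toy data below are illustrative, not from the model's evaluation set):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean |accuracy - confidence| across confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = correct[mask].mean()     # empirical accuracy in this bin
            conf = confidences[mask].mean()  # mean predicted confidence
            ece += mask.mean() * abs(acc - conf)
    return ece

# Toy example: four predictions at 0.95 confidence, all correct
# -> one bin with |1.0 - 0.95| = 0.05, full weight
print(round(expected_calibration_error([0.95] * 4, [1, 1, 1, 1]), 3))  # 0.05
```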

Classification Decision Framework

User Message → Tokenize → DeBERTa Inference → Softmax
                                                 ↓
                                   ┌─────────────┼─────────────┐
                                   │             │             │
                                 ALLOW         DENY        ABSTAIN
                              (on-topic)   (off-topic)  (uncertain)
                                   │             │             │
                              Pass to LLM  Block + Redirect  Escalate
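
The decision flow above can be sketched as a small routing function. The confidence threshold and action names here are illustrative assumptions, not part of the published model:

```python
ABSTAIN_THRESHOLD = 0.85  # assumed value; tune against your own calibration data

def route(label: str, confidence: float) -> str:
    """Map a classifier decision to a guardrail action, per the diagram above."""
    if confidence < ABSTAIN_THRESHOLD:
        return "escalate"            # low confidence: treat like abstain
    if label == "allow":
        return "pass_to_llm"         # on-topic: forward to the LLM
    if label == "deny":
        return "block_and_redirect"  # off-topic: polite redirect
    return "escalate"                # abstain: secondary classifier / human review

print(route("allow", 0.998))   # pass_to_llm
print(route("deny", 0.996))    # block_and_redirect
print(route("abstain", 0.93))  # escalate
```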

Model Details

| Property | Value |
|----------|-------|
| Architecture | DeBERTa-v3-xsmall (fine-tuned for 3-way classification) |
| Format | ONNX (INT8 quantized) |
| Version | 1.0 |
| Vertical | Finance (Financial Services) |
| Training | Supervised fine-tuning on curated intent datasets |
| Quantization | INT8 via ONNX Runtime |
| GPU Required | No (runs on CPU) |
| Publisher | perfecXion.ai |

Core Topics (Allow)

Banking, lending, credit, payments, investing, insurance, tax, personal finance, retirement, mortgages, financial planning, budgeting

Hard Exclusions (Deny)

Sports, entertainment, cooking, gaming, celebrity gossip, fashion, travel/leisure, fiction writing, relationship advice


Usage

Python (ONNX Runtime)

```python
import onnxruntime as ort
from transformers import AutoTokenizer
import numpy as np

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("perfecXion/intentguard-finance")
session = ort.InferenceSession("model.onnx")

# Classify a user message
text = "What are the current mortgage rates for a 30-year fixed loan?"
inputs = tokenizer(text, return_tensors="np", max_length=128, truncation=True, padding="max_length")

logits = session.run(None, {
    "input_ids": inputs["input_ids"],
    "attention_mask": inputs["attention_mask"]
})[0]

labels = ["allow", "deny", "abstain"]
prediction = labels[int(np.argmax(logits))]
probs = np.exp(logits - np.max(logits))  # numerically stable softmax
confidence = float(np.max(probs / np.sum(probs)))

print(f"Intent: {prediction} (confidence: {confidence:.3f})")
# Output: Intent: allow (confidence: 0.998)
```
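
Beyond taking the raw argmax, a common guardrail pattern is to override any low-confidence prediction to abstain. A minimal sketch; the 0.90 cutoff is an assumed starting point, not a published threshold:

```python
import numpy as np

LABELS = ["allow", "deny", "abstain"]

def decide(logits, min_confidence=0.90):
    """Softmax the logits; fall back to abstain below the confidence cutoff."""
    logits = np.asarray(logits, dtype=float).reshape(-1)
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    label = LABELS[int(np.argmax(probs))]
    conf = float(probs.max())
    return ("abstain", conf) if conf < min_confidence else (label, conf)

print(decide([6.0, -2.0, -3.0]))  # high-margin case: stays "allow"
print(decide([0.2, 0.1, 0.0]))    # near-uniform case: overridden to "abstain"
```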

Docker

```bash
# Pull and run the container
docker pull ghcr.io/perfecxion/intentguard:finance-1.0
docker run -p 8080:8080 ghcr.io/perfecxion/intentguard:finance-1.0

# Classify a message
curl -X POST http://localhost:8080/v1/classify \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What are the current mortgage rates?"}]}'

# Response: {"intent": "allow", "confidence": 0.998}
```

pip

```bash
pip install intentguard
```

```python
from intentguard import IntentGuard

guard = IntentGuard.load("finance")
result = guard.classify("What are the current mortgage rates?")
print(result)  # Intent(label='allow', confidence=0.998)
```

Example Classifications

| User Message | Predicted | Confidence | Correct? |
|--------------|-----------|------------|----------|
| "What are mortgage rates for a 30-year fixed?" | allow | 0.998 | ✅ |
| "How do I open a Roth IRA?" | allow | 0.997 | ✅ |
| "Who won the Super Bowl?" | deny | 0.999 | ✅ |
| "Tell me a joke" | deny | 0.996 | ✅ |
| "Is my health insurance FSA-eligible?" | allow | 0.942 | ✅ (financial context) |
| "What's the weather today?" | deny | 0.998 | ✅ |

Citation

@misc{thornton2025intentguard,
  title={IntentGuard: A Production-Grade Vertical Intent Classifier for LLM Guardrails},
  author={Thornton, Scott},
  year={2025},
  publisher={perfecXion.ai},
  url={https://perfecxion.ai/articles/intentguard-vertical-intent-classifier-llm-guardrails.html},
  note={Model: https://huggingface.co/perfecXion/intentguard-finance}
}

Quality Metrics

| Metric | Result |
|--------|--------|
| Accuracy (Finance vertical) | 99.6% |
| Legitimate Block Rate | 0.00% |
| Off-Topic Pass Rate | 0.00% |
| Expected Calibration Error | <0.03 |
| ONNX INT8 Quantization | Validated |
| CPU Inference (p99) | <30ms |
| Docker Container | Available |

License

Apache 2.0

