---
license: mit
base_model:
- microsoft/deberta-v3-small
datasets:
- tgupj/tiny-router-data
---

# tiny-router

`tiny-router` is a compact, experimental multi-head routing classifier for short, domain-neutral messages with optional interaction context. It predicts four separate signals that downstream systems or agents can use for update handling, action routing, memory policy, and prioritization.

## What it predicts

```
relation_to_previous: new | follow_up | correction | confirmation | cancellation | closure
actionability: none | review | act
retention: ephemeral | useful | remember
urgency: low | medium | high
```

The model emits these heads independently at inference time, plus calibrated per-head confidences and an `overall_confidence`.

## Intended use

- Route short user messages into lightweight automation tiers.
- Detect whether a message updates prior context or starts something new.
- Decide whether action is required, review is safer, or no action is needed.
- Separate disposable details from short-term useful context and longer-term memory candidates.
- Prioritize items by urgency.
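The automation tiers above can be sketched as a simple policy over the model's output. This is a minimal, illustrative sketch: the prediction dict mirrors the example inference output shown later in this card, but the `route` helper, tier names, and thresholds are hypothetical and not part of this repo.

```python
# Illustrative routing policy over tiny-router's four heads.
# The dict shape mirrors this card's example inference output;
# the thresholds and tier names are hypothetical, not part of the repo.

def route(prediction: dict, act_threshold: float = 0.9) -> str:
    """Map a tiny-router prediction to a lightweight automation tier."""
    action = prediction["actionability"]["label"]
    conf = prediction["actionability"]["confidence"]
    if (action == "act" and conf >= act_threshold
            and prediction["overall_confidence"] >= act_threshold):
        return "automate"      # high-confidence action: safe to automate
    if action in ("act", "review"):
        return "human_review"  # action likely needed, but route to a person
    return "no_action"         # nothing to do


prediction = {
    "relation_to_previous": {"label": "correction", "confidence": 0.94},
    "actionability": {"label": "act", "confidence": 0.97},
    "retention": {"label": "useful", "confidence": 0.76},
    "urgency": {"label": "medium", "confidence": 0.81},
    "overall_confidence": 0.87,
}
print(route(prediction))  # overall_confidence 0.87 < 0.9, so "human_review"
```

Gating on `overall_confidence` as well as the head confidence is one way to implement the conservative-thresholds caveat from the results section.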
Good use cases:

- routing message-like requests in assistants or productivity tools
- triaging follow-ups, corrections, confirmations, and closures
- conservative automation with review fallback

Not good use cases:

- fully autonomous high-stakes action without guardrails
- domains that need expert reasoning or regulated decisions

## Training data

This checkpoint was trained on the synthetic dataset split in:

- `data/synthetic/train.jsonl`
- `data/synthetic/validation.jsonl`
- `data/synthetic/test.jsonl`

The data follows a structured JSONL schema with:

- `current_text`
- optional `interaction.previous_text`
- optional `interaction.previous_action`
- optional `interaction.previous_outcome`
- optional `interaction.recency_seconds`
- four label heads under `labels`

## Model details

- Base encoder: `microsoft/deberta-v3-small`
- Architecture: encoder-only multitask classifier
- Pooling: learned attention pooling
- Structured features:
  - canonicalized `previous_action` embedding
  - `previous_outcome` embedding
  - learned projection of `log1p(recency_seconds)`
- Head structure:
  - dependency-aware multitask heads
  - later heads condition on learned summaries of earlier head predictions
- Calibration:
  - post-hoc per-head temperature scaling fit on validation logits

This checkpoint was trained with:

- `batch_size = 32`
- `epochs = 20`
- `max_length = 128`
- `encoder_lr = 2e-5`
- `head_lr = 1e-4`
- `dropout = 0.1`
- `pooling_type = attention`
- `use_head_dependencies = true`

## Current results

Held-out test results from `artifacts/tiny-router/eval.json`:

- `macro_average_f1 = 0.7848`
- `exact_match = 0.4570`
- `automation_safe_accuracy = 0.6230`
- `automation_safe_coverage = 0.5430`
- `ECE = 0.3440`

Per-head macro F1:

- `relation_to_previous = 0.8415`
- `actionability = 0.7982`
- `retention = 0.7809`
- `urgency = 0.7187`

Ablations (macro F1):

- `current_text_only = 0.7058`
- `current_plus_previous_text = 0.7478`
- `full_interaction = 0.7848`

Interpretation:

- interaction context helps
- actionability and urgency are usable but still imperfect
- high-confidence automation is possible only with conservative thresholds

## Limitations

- The benchmark is task-specific and internal to this repo.
- The dataset is synthetic, so distribution shift to real product traffic is likely.
- Label quality on subtle boundaries still matters a lot.
- Confidence calibration is improved but not strong enough to justify broad unattended automation.

## Example inference

```json
{
  "relation_to_previous": { "label": "correction", "confidence": 0.94 },
  "actionability": { "label": "act", "confidence": 0.97 },
  "retention": { "label": "useful", "confidence": 0.76 },
  "urgency": { "label": "medium", "confidence": 0.81 },
  "overall_confidence": 0.87
}
```

## How to load

This repo uses a custom checkpoint format. Load it with this project:

```python
from tiny_router.io import load_checkpoint
from tiny_router.runtime import get_device

device = get_device(requested_device="cpu")
model, tokenizer, config = load_checkpoint("artifacts/tiny-router", device=device)
```

Or run inference with:

```bash
uv run python predict.py \
  --model-dir artifacts/tiny-router \
  --input-json '{"current_text":"Actually next Monday","interaction":{"previous_text":"Set a reminder for Friday","previous_action":"created_reminder","previous_outcome":"success","recency_seconds":45}}' \
  --pretty
```
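A record in the JSONL schema described under "Training data" can be assembled as follows. This is a sketch: the field names come from this card, but the assumption that `labels` maps each head name to a label string is illustrative and should be checked against the actual dataset files.

```python
import json

# One JSONL record matching the schema described in this card.
# Field names are from the card; the exact shape of `labels`
# (head name -> label string) is an assumption.
record = {
    "current_text": "Actually next Monday",
    "interaction": {
        "previous_text": "Set a reminder for Friday",
        "previous_action": "created_reminder",
        "previous_outcome": "success",
        "recency_seconds": 45,
    },
    "labels": {
        "relation_to_previous": "correction",
        "actionability": "act",
        "retention": "useful",
        "urgency": "medium",
    },
}

# JSONL: one compact JSON object per line.
line = json.dumps(record)
```

The `interaction` block here matches the `--input-json` payload in the CLI example above, with the four gold heads attached under `labels`.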