tomg-group-umd's Collections
MTP-LM
Updated 4 days ago
Models to accompany "Multi-Token Prediction via Self-Distillation" (arxiv:2602.06019)
Multi-Token Prediction via Self-Distillation
Paper • 2602.06019 • Published Feb 5 • 1
jwkirchenbauer/L3-1-8B-Magpie-MTP
8B • Updated Feb 10 • 8
jwkirchenbauer/Qwen3-4B-Inst-2507-MTP
4B • Updated Feb 10 • 5 • 1
jwkirchenbauer/metamathqa-grouped-split
Viewer • Updated Feb 9 • 395k • 81