Embarrassingly Simple Self-Distillation Improves Code Generation Paper • 2604.01193 • Published 3 days ago • 18
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain Paper • 2509.26507 • Published Sep 30, 2025 • 550
Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 30 items • Updated Feb 25 • 134
ICONN 1 GenAI Collection Video and Image generation ICONN 1 models • 2 items • Updated Jun 18, 2025 • 2