Shaping capabilities with token-level data filtering Paper • 2601.21571 • Published Jan 29 • 27
DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs Paper • 2601.03559 • Published Jan 7 • 14
The Smol Training Playbook • Featured • 3.06k • The secrets to building world-class LLMs
Video models are zero-shot learners and reasoners Paper • 2509.20328 • Published Sep 24, 2025 • 100
Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face Article • Jul 29, 2025 • 219
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7, 2025 • 154