MolHIT: Advancing Molecular-Graph Generation with Hierarchical Discrete Diffusion Models Paper • 2602.17602 • Published 9 days ago • 52
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30, 2025 • 144
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published Nov 26, 2025 • 125
Self-Hinting Language Models Enhance Reinforcement Learning Paper • 2602.03143 • Published 26 days ago • 29
Rethinking the Trust Region in LLM Reinforcement Learning Paper • 2602.04879 • Published 24 days ago • 35
Music Flamingo 🎵 Answer music questions from uploaded audio or YouTube tracks • Running on Zero • 145
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published 30 days ago • 107
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models Paper • 2601.23143 • Published 30 days ago • 38
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published Jan 2 • 56