SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization Paper • 2604.02268 • Published 12 days ago • 93
V_{0.5}: Generalist Value Model as a Prior for Sparse RL Rollouts Paper • 2603.10848 • Published Mar 11 • 14
CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs Paper • 2602.03048 • Published Feb 3 • 32
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning Paper • 2601.21468 • Published Jan 29 • 25
Examining False Positives under Inference Scaling for Mathematical Reasoning Paper • 2502.06217 • Published Feb 10, 2025 • 1