Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models Paper • 2603.13985 • Published 6 days ago • 9
PABU: Progress-Aware Belief Update for Efficient LLM Agents Paper • 2602.09138 • Published Feb 9 • 1
PABU-Implementation Collection Artifacts related to the PABU implementation. • 3 items • Updated Feb 11
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22, 2025 • 443