Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards
Paper
• 2603.09117 • Published
Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards
DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation