The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping Paper • 2604.11297 • Published 2 days ago
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective Paper • 2412.14135 • Published Dec 18, 2024
In-Memory Learning: A Declarative Learning Framework for Large Language Models Paper • 2403.02757 • Published Mar 5, 2024