SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks Paper • 2602.12670 • Published 5 days ago • 24
WORLDMEM: Long-term Consistent World Simulation with Memory Paper • 2504.12369 • Published Apr 16, 2025 • 35