Beyond Length Scaling: Synergizing Breadth and Depth for Generative Reward Models Paper • 2603.01571 • Published 11 days ago • 33
Anatomy of Agentic Memory: Taxonomy and Empirical Analysis of Evaluation and System Limitations Paper • 2602.19320 • Published 18 days ago • 9
The Art of Efficient Reasoning: Data, Reward, and Optimization Paper • 2602.20945 • Published 16 days ago • 7
Optimizing Few-Step Generation with Adaptive Matching Distillation Paper • 2602.07345 • Published Feb 7 • 9
Reinforced Fast Weights with Next-Sequence Prediction Paper • 2602.16704 • Published 22 days ago • 13
Multi-agent cooperation through in-context co-player inference Paper • 2602.16301 • Published 22 days ago • 24
Hardware Co-Design Scaling Laws via Roofline Modelling for On-Device LLMs Paper • 2602.10377 • Published 30 days ago • 3
RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation Paper • 2601.08430 • Published Jan 13 • 60
Toward Ultra-Long-Horizon Agentic Science: Cognitive Accumulation for Machine Learning Engineering Paper • 2601.10402 • Published Jan 15 • 37
Urban Socio-Semantic Segmentation with Vision-Language Reasoning Paper • 2601.10477 • Published Jan 15 • 155
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI Paper • 2512.16676 • Published Dec 18, 2025 • 220
Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs Paper • 2512.17008 • Published Dec 18, 2025 • 11
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published Dec 19, 2025 • 28
Are We on the Right Way to Assessing LLM-as-a-Judge? Paper • 2512.16041 • Published Dec 17, 2025 • 34