Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published 2 days ago • 153
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published 13 days ago • 176
daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently Paper • 2602.02619 • Published 11 days ago • 50
HalluCitation Matters: Revealing the Impact of Hallucinated References with 300 Hallucinated Papers in ACL Conferences Paper • 2601.18724 • Published 17 days ago • 7
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation Paper • 2601.20614 • Published 16 days ago • 118
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery Paper • 2601.19325 • Published 17 days ago • 79
daVinci-Dev: Agent-native Mid-training for Software Engineering Paper • 2601.18418 • Published 18 days ago • 124
Toward Efficient Agents: Memory, Tool learning, and Planning Paper • 2601.14192 • Published 23 days ago • 54
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 225
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization Paper • 2601.05432 • Published Jan 8 • 166
TourPlanner: A Competitive Consensus Framework with Constraint-Gated Reinforcement Learning for Travel Planning Paper • 2601.04698 • Published Jan 8 • 10
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 256
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 296
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers Paper • 2508.21148 • Published Aug 28, 2025 • 140
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published Sep 2, 2025 • 229
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10, 2025 • 190