LLaDA2.1: Speeding Up Text Diffusion via Token Editing Paper • 2602.08676 • Published 6 days ago • 62
Self-Hinting Language Models Enhance Reinforcement Learning Paper • 2602.03143 • Published 13 days ago • 29
Rethinking the Trust Region in LLM Reinforcement Learning Paper • 2602.04879 • Published 11 days ago • 31
TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents Paper • 2602.02196 • Published 13 days ago • 33
MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents Paper • 2602.02474 • Published 13 days ago • 54
Chain of Mindset: Reasoning with Adaptive Cognitive Modes Paper • 2602.10063 • Published 5 days ago • 70
CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty Paper • 2601.22027 • Published 17 days ago • 80
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning Paper • 2602.04634 • Published 11 days ago • 93
VidVec: Unlocking Video MLLM Embeddings for Video-Text Retrieval Paper • 2602.08099 • Published 7 days ago • 119
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published 17 days ago • 152
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published 5 days ago • 174
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published 7 days ago • 252
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published 11 days ago • 311