Beyond Language Modeling: An Exploration of Multimodal Pretraining Paper • 2603.03276 • Published 17 days ago • 97
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders Paper • 2603.06569 • Published 14 days ago • 113
Grounding World Simulation Models in a Real-World Metropolis Paper • 2603.15583 • Published 4 days ago • 133
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 8 days ago • 62
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale Paper • 2602.23866 • Published 21 days ago • 88
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? Paper • 2603.03241 • Published 17 days ago • 85
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published 6 days ago • 138
Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding Paper • 2603.13366 • Published 11 days ago • 89