HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 5 days ago • 98
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published 12 days ago • 112
QuanBench+: A Unified Multi-Framework Benchmark for LLM-Based Quantum Code Generation Paper • 2604.08570 • Published 26 days ago • 123
WildDet3D: Scaling Promptable 3D Detection in the Wild Paper • 2604.08626 • Published 11 days ago • 238
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models Paper • 2604.04707 • Published 14 days ago • 200
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published 21 days ago • 340
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published Feb 11 • 195
Code2World: A GUI World Model via Renderable Code Generation Paper • 2602.09856 • Published Feb 10 • 202
Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning Paper • 2602.07845 • Published Feb 8 • 71
MOVA: Towards Scalable and Synchronized Video-Audio Generation Paper • 2602.08794 • Published Feb 9 • 159
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos Paper • 2602.06949 • Published Feb 6 • 37
On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models Paper • 2602.03392 • Published Feb 3 • 59
RISE-Video: Can Video Generators Decode Implicit World Rules? Paper • 2602.05986 • Published Feb 5 • 27
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning Paper • 2602.04634 • Published Feb 4 • 99
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation Paper • 2602.03796 • Published Feb 3 • 64
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published Jan 31 • 323