LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation • Paper • 2603.20192 • Published 4 days ago • 22
HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning • Paper • 2603.17024 • Published 7 days ago • 100
EvoClaw: Evaluating AI Agents on Continuous Software Evolution • Paper • 2603.13428 • Published 12 days ago • 20
HSImul3R: Physics-in-the-Loop Reconstruction of Simulation-Ready Human-Scene Interactions • Paper • 2603.15612 • Published 8 days ago • 149
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild • Paper • 2603.17187 • Published 7 days ago • 127
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation • Paper • 2603.19220 • Published 5 days ago • 56
3DreamBooth: High-Fidelity 3D Subject-Driven Video Generation Model • Paper • 2603.18524 • Published 6 days ago • 54
Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding • Paper • 2603.19235 • Published 5 days ago • 90
Visual-ERM: Reward Modeling for Visual Equivalence • Paper • 2603.13224 • Published 11 days ago • 21
VQQA: An Agentic Approach for Video Evaluation and Quality Improvement • Paper • 2603.12310 • Published 12 days ago • 7
Can Vision-Language Models Solve the Shell Game? • Paper • 2603.08436 • Published 15 days ago • 39
Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation • Paper • 2603.12793 • Published 12 days ago • 37
MM-CondChain: A Programmatically Verified Benchmark for Visually Grounded Deep Compositional Reasoning • Paper • 2603.12266 • Published 12 days ago • 19
Training Language Models via Neural Cellular Automata • Paper • 2603.10055 • Published 15 days ago • 7