CityRAG: Stepping Into a City via Spatially-Grounded Video Generation • Paper • 2604.19741 • Published 4 days ago • 16
CutClaw: Agentic Hours-Long Video Editing via Music Synchronization • Paper • 2603.29664 • Published 25 days ago • 48
Generated Reality: Human-centric World Simulation using Interactive Video Generation with Hand and Camera Control • Paper • 2602.18422 • Published Feb 20 • 30
EgoPush: Learning End-to-End Egocentric Multi-Object Rearrangement for Mobile Robots • Paper • 2602.18071 • Published Feb 20 • 22
VideoWorld 2: Learning Transferable Knowledge from Real-world Videos • Paper • 2602.10102 • Published Feb 10 • 14
NitroGen: An Open Foundation Model for Generalist Gaming Agents • Paper • 2601.02427 • Published Jan 4 • 46
Block Cascading: Training Free Acceleration of Block-Causal Video Models • Paper • 2511.20426 • Published Nov 25, 2025 • 10