Video-CoE: Reinforcing Video Event Prediction via Chain of Events Paper • 2603.14935 • Published 6 days ago • 90
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published 4 days ago • 116
OpenMonoGS-SLAM: Monocular Gaussian Splatting SLAM with Open-set Semantics Paper • 2512.08625 • Published Dec 9, 2025 • 1
MosaicMem: Hybrid Spatial Memory for Controllable Video World Models Paper • 2603.17117 • Published 5 days ago • 82
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion Paper • 2512.17504 • Published Dec 19, 2025 • 99
V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning Paper • 2603.14482 • Published 7 days ago • 13
EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing Paper • 2603.19224 • Published 3 days ago • 16
Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models Paper • 2603.18002 • Published 4 days ago • 6
Grounding World Simulation Models in a Real-World Metropolis Paper • 2603.15583 • Published 6 days ago • 140
3DreamBooth: High-Fidelity 3D Subject-Driven Video Generation Model Paper • 2603.18524 • Published 3 days ago • 48
Memory-V2V: Augmenting Video-to-Video Diffusion Models with Memory Paper • 2601.16296 • Published Jan 22 • 28
Multi-view Pyramid Transformer: Look Coarser to See Broader Paper • 2512.07806 • Published Dec 8, 2025 • 21
Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published Oct 17, 2025 • 50
Hybrid 3D-4D Gaussian Splatting for Fast Dynamic Scene Representation Paper • 2505.13215 • Published May 19, 2025 • 29