The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook Paper • 2604.02029 • Published 2 days ago • 97
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis Paper • 2603.29620 • Published 3 days ago • 41
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling Paper • 2603.25746 • Published 8 days ago • 151
Gen-Searcher: Reinforcing Agentic Search for Image Generation Paper • 2603.28767 • Published 4 days ago • 53
MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data Paper • 2603.25319 • Published 8 days ago • 32
RISE: Self-Improving Robot Policy with Compositional World Model Paper • 2602.11075 • Published Feb 11 • 29
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models Paper • 2602.02185 • Published Feb 2 • 117
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published Jan 29 • 155
AdaTooler-V: Adaptive Tool-Use for Images and Videos Paper • 2512.16918 • Published Dec 18, 2025 • 14
JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization Paper • 2511.23002 • Published Nov 28, 2025 • 26
From Macro to Micro: Benchmarking Microscopic Spatial Intelligence on Molecules via Vision-Language Models Paper • 2512.10867 • Published Dec 11, 2025 • 16
OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation Paper • 2512.08294 • Published Dec 9, 2025 • 18
EditThinker: Unlocking Iterative Reasoning for Any Image Editor Paper • 2512.05965 • Published Dec 5, 2025 • 38
OneThinker: All-in-one Reasoning Model for Image and Video Paper • 2512.03043 • Published Dec 2, 2025 • 34
Architecture Decoupling Is Not All You Need For Unified Multimodal Model Paper • 2511.22663 • Published Nov 27, 2025 • 29
Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views Paper • 2510.18632 • Published Oct 21, 2025 • 23
SpaceVista: All-Scale Visual Spatial Reasoning from mm to km Paper • 2510.09606 • Published Oct 10, 2025 • 18
SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines Paper • 2509.21320 • Published Sep 25, 2025 • 101