Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis Paper • 2603.29620 • Published 2 days ago • 37
GEMS: Agent-Native Multimodal Generation with Memory and Skills Paper • 2603.28088 • Published 3 days ago • 66
AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning Paper • 2601.18631 • Published Jan 26 • 48
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models Paper • 2512.24165 • Published Dec 30, 2025 • 52
VideoSSR: Video Self-Supervised Reinforcement Learning Paper • 2511.06281 • Published Nov 9, 2025 • 25
P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published Nov 17, 2025 • 134
ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning Paper • 2510.27492 • Published Oct 30, 2025 • 87
FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting Paper • 2509.24304 • Published Sep 29, 2025 • 5
Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Delibration Paper • 2509.14760 • Published Sep 18, 2025 • 53
Interleaving Reasoning for Better Text-to-Image Generation Paper • 2509.06945 • Published Sep 8, 2025 • 15
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers Paper • 2506.23918 • Published Jun 30, 2025 • 90
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy Paper • 2507.01352 • Published Jul 2, 2025 • 60
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction Paper • 2507.02025 • Published Jul 2, 2025 • 35
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models Paper • 2505.23656 • Published May 29, 2025 • 25
Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning Paper • 2506.04207 • Published Jun 4, 2025 • 48
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28, 2025 • 132
FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow Paper • 2505.17399 • Published May 23, 2025 • 14
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models Paper • 2505.14810 • Published May 20, 2025 • 62
OpenThinkIMG Collection OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images. • 7 items • Updated 27 days ago • 4