DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing Paper • 2602.12205 • Published Feb 12, 2026 • 80
Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition Paper • 2602.08439 • Published Feb 9, 2026 • 28
GenArena: How Can We Achieve Human-Aligned Evaluation for Visual Generation Tasks? Paper • 2602.06013 • Published Feb 5, 2026
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing Paper • 2602.02437 • Published Feb 2, 2026 • 79
SS4D: Native 4D Generative Model via Structured Spacetime Latents Paper • 2512.14284 • Published Dec 16, 2025 • 14
EtCon: Edit-then-Consolidate for Reliable Knowledge Editing Paper • 2512.04753 • Published Dec 4, 2025 • 8
ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning Paper • 2512.05111 • Published Dec 4, 2025 • 50
Think Visually, Reason Textually: Vision-Language Synergy in ARC Paper • 2511.15703 • Published Nov 19, 2025 • 9
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation Paper • 2112.02244 • Published Dec 4, 2021
UniREditBench: A Unified Reasoning-based Image Editing Benchmark Paper • 2511.01295 • Published Nov 3, 2025 • 39
Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning Paper • 2510.27606 • Published Oct 31, 2025 • 31
RLFR: Extending Reinforcement Learning for LLMs with Flow Environment Paper • 2510.10201 • Published Oct 11, 2025 • 36
Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models Paper • 2508.00819 • Published Aug 1, 2025 • 63
Adaptive Fast-and-Slow Visual Program Reasoning for Long-Form VideoQA Paper • 2509.17743 • Published Sep 22, 2025
$\text{G}^2$RPO: Granular GRPO for Precise Reward in Flow Models Paper • 2510.01982 • Published Oct 2, 2025 • 7
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers Paper • 2305.17455 • Published May 27, 2023
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation Paper • 2510.18701 • Published Oct 21, 2025 • 68
STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence Paper • 2510.24693 • Published Oct 28, 2025 • 19