FlowScene: Style-Consistent Indoor Scene Generation with Multimodal Graph Rectified Flow • Paper • 2603.19598 • Published 5 days ago • 30
InCoder-32B: Code Foundation Model for Industrial Scenarios • Paper • 2603.16790 • Published 7 days ago • 296
Grounding World Simulation Models in a Real-World Metropolis • Paper • 2603.15583 • Published 8 days ago • 145
OmniForcing: Unleashing Real-time Joint Audio-Visual Generation • Paper • 2603.11647 • Published 12 days ago • 31
Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs • Paper • 2603.09095 • Published 15 days ago • 28
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion • Paper • 2603.06577 • Published 18 days ago • 48
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence • Paper • 2603.07660 • Published 16 days ago • 83
Lost in Stories: Consistency Bugs in Long Story Generation by LLMs • Paper • 2603.05890 • Published 19 days ago • 91
WildActor: Unconstrained Identity-Preserving Video Generation • Paper • 2603.00586 • Published 24 days ago • 37
Reasoning Models Struggle to Control their Chains of Thought • Paper • 2603.05706 • Published 19 days ago • 34
Helios • Collection • Helios: 14B Real-Time Long Video Generation Model can be Cheaper, Faster but Keep Stronger than 1.3B ones • 7 items • Updated 9 days ago • 24
Mode Seeking meets Mean Seeking for Fast Long Video Generation • Paper • 2602.24289 • Published 25 days ago • 41
JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation • Paper • 2602.19163 • Published about 1 month ago • 14
Solaris: Building a Multiplayer Video World Model in Minecraft • Paper • 2602.22208 • Published 27 days ago • 28