Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models Paper • 2603.25750 • Published 10 days ago • 7
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference Paper • 2603.25730 • Published 4 days ago • 32
RealMaster: Lifting Rendered Scenes into Photorealistic Video Paper • 2603.23462 • Published 6 days ago • 29
WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG Paper • 2603.23497 • Published 6 days ago • 86