MIDAS: Multimodal Interactive Digital-human Synthesis via Real-time Autoregressive Video Generation Paper • 2508.19320 • Published Aug 26, 2025 • 29
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 10 items • Updated 26 days ago • 559
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control • Feb 4, 2025 • 192
Physical AI Collection Collection of open, commercial-grade datasets for physical AI developers • 29 items • Updated 3 days ago • 136