Unlocking On-Policy Distillation for Any Model Family • Visualize on-policy distillation for any model family • Running • 89
Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously Paper • 2603.12262 • Published 6 days ago • 28
Shuffle-R1: Efficient RL framework for Multimodal Large Language Models via Data-centric Dynamic Shuffle Paper • 2508.05612 • Published Aug 7, 2025 • 2
Shuffle-R1 Collection Shuffle-R1 checkpoints and training/evaluation datasets. • 5 items • Updated 16 days ago • 1