Autoregressive Video Generation without Vector Quantization Paper • 2412.14169 • Published Dec 18, 2024 • 15
EVEv2: Improved Baselines for Encoder-Free Vision-Language Models Paper • 2502.06788 • Published Feb 10, 2025 • 14
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data Paper • 2410.18558 • Published Oct 24, 2024 • 19
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval Paper • 2412.14475 • Published Dec 19, 2024 • 59
CCI4.0: A Bilingual Pretraining Dataset for Enhancing Reasoning in Large Language Models Paper • 2506.07463 • Published Jun 9, 2025 • 12
UDM-GRPO: Stable and Efficient Group Relative Policy Optimization for Uniform Discrete Diffusion Models Paper • 2604.18518 • Published 3 days ago • 3
Uniform Discrete Diffusion with Metric Path for Video Generation Paper • 2510.24717 • Published Oct 28, 2025 • 44
Emu3.5: Native Multimodal Models are World Learners Paper • 2510.26583 • Published Oct 30, 2025 • 115
YaRN: Efficient Context Window Extension of Large Language Models Paper • 2309.00071 • Published Aug 31, 2023 • 83
Dolma Collection allenai's Dolma dataset as native Hugging Face datasets • 12 items • Updated Sep 8, 2025 • 8