BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published 20 days ago • 52
P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads Paper • 2602.09443 • Published 25 days ago • 57
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 260
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding Paper • 2412.10302 • Published Dec 13, 2024 • 22
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning Paper • 2511.22570 • Published Nov 27, 2025 • 91
P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published Nov 17, 2025 • 134
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18, 2025 • 144
Article From PyTorch DDP to Accelerate to Trainer, mastery of distributed training with ease Oct 21, 2022 • 43
Emu3 Collection • Emu3: Next-Token Prediction is All You Need • 7 items • Updated about 1 month ago • 80
Article How to generate text: using different decoding methods for language generation with Transformers Mar 1, 2020 • 292
MiniCPM-o & MiniCPM-V Collection • Multimodal models with leading performance. • 29 items • Updated 5 days ago • 71