Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 2 days ago • 110
FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale Paper • 2601.22146 • Published Jan 29 • 10
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens • Explore synthetic data experiments on a virtual bookshelf • 220
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning Paper • 2602.10560 • Published Feb 11 • 31
Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience Paper • 2512.17260 • Published Dec 19, 2025 • 52
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12, 2025 • 215
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science Paper • 2510.16872 • Published Oct 19, 2025 • 112
RePro: Training Language Models to Faithfully Recycle the Web for Pretraining Paper • 2510.10681 • Published Oct 12, 2025 • 6
Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24, 2025 • 84
TiKMiX: Take Data Influence into Dynamic Mixture for Language Model Pre-training Paper • 2508.17677 • Published Aug 25, 2025 • 14