The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • Running • 3.75k
The Smol Training Playbook 📚 The secrets to building world-class LLMs • Running on CPU Upgrade • Featured • 3.05k
deepseek-ai/DeepSeek-V2-Chat-0628 Text Generation • 236B • Updated Jul 18, 2024 • 3.51k • 177
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 25 items • Updated 20 days ago • 577
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning Paper • 2301.13688 • Published Jan 31, 2023 • 10
Flan-T5 release Collection The Flan-T5 covers 4 checkpoints of different sizes each time. It also includes upgraded versions trained using Universal sampling • 7 items • Updated 10 days ago • 33