GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent Paper • 2603.13875 • Published 30 days ago • 35
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 Running on CPU Upgrade • 219 • Explore synthetic data experiments on a virtual bookshelf
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale Paper • 2602.23866 • Published Feb 27 • 88
Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek Article • Jan 27 • 45
From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels Article • Aug 18, 2025 • 95
unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF Text Generation • 31B • Updated Jan 30 • 141k • 583
The Ultra-Scale Playbook 🌌 Running • 3.77k • The ultimate guide to training LLM on large GPU Clusters