KTO: Model Alignment as Prospect Theoretic Optimization Paper • 2402.01306 • Published Feb 2, 2024 • 22
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens • 147 • Explore synthetic data experiments in a bookshelf view
QED-Nano: Teaching a Tiny Model to Prove Hard Theorems • 65 • Who needs 1T parameters? Olympiad proofs with a 4B model
MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration Paper • 2602.01734 • Published Feb 2 • 32
tabularisai/multilingual-sentiment-analysis Text Classification • 0.1B • Updated Feb 6 • 98.9k • 361
🇩🇪 German SFT and DPO datasets Collection • Datasets that can be used for LLM training with axolotl, trl or llama_factory. • 32 items • Updated 10 days ago • 13