Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective • Jan 27 • 69
Ulysses Sequence Parallelism: Training with Million-Token Contexts • 24 days ago • 25
Pre-Train BERT with Hugging Face Transformers and Habana Gaudi • Aug 22, 2022 • 10
How to generate text: using different decoding methods for language generation with Transformers • Mar 1, 2020 • 294
Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA • May 24, 2023 • 176