Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models Paper • 2602.04649 • Published 15 days ago • 12
TransMLA: Multi-head Latent Attention Is All You Need Paper • 2502.07864 • Published Feb 11, 2025 • 57
Fine-tuning LLMs to 1.58bit: extreme quantization made easy Article • Published Sep 18, 2024 • 275
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published Apr 8, 2025 • 110
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks Paper • 2504.05118 • Published Apr 7, 2025 • 26
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated Dec 31, 2025 • 356