Mixture of Universal Experts: Scaling Virtual Width via Depth-Width Transformation • Paper • 2603.04971 • Published 2 days ago • 3
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! • Paper • 2502.07374 • Published Feb 11, 2025 • 40