SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025 • 124
Systran/faster-whisper-large-v3 Automatic Speech Recognition • Updated Nov 23, 2023 • 627k • 524
deepseek-ai/DeepSeek-R1-0528 Text Generation • 685B • Updated May 29, 2025 • 1.05M • 2.4k