evaluation-datasets
edinburgh-dawg/mmlu-redux-2.0 Viewer • Updated Feb 25, 2025 • 5.7k • 7.63k • 36
TIGER-Lab/MMLU-Pro Benchmark • Updated 2 days ago • 12.1k • 106k • 447
CohereLabs/Global-MMLU Viewer • Updated Aug 14, 2025 • 602k • 9.1k • 150
Idavidrein/gpqa Benchmark • Updated 7 days ago • 1.25k • 94.3k • 379
smoltalk
Contains the smoltalk dataset in multiple minority languages. The dataset is useful for post-training a base model.
rao254/smoltalk-kik Updated Nov 25, 2025 • 6
rao254/smoltalk-ja Viewer • Updated Nov 25, 2025 • 2.05k • 8
medical-datasets
ruslanmv/ai-medical-chatbot Viewer • Updated Mar 23, 2024 • 257k • 1.23k • 246
michsethowusu/Code-170k-luo Viewer • Updated Oct 30, 2025 • 169k • 34
edinburgh-dawg/mmlu-redux-2.0 Viewer • Updated Feb 25, 2025 • 5.7k • 7.63k • 36