FeatureBench: Benchmarking Agentic Coding for Complex Feature Development Paper • 2602.10975 • Published 10 days ago • 18
WizardLMTeam/WizardLM_evol_instruct_V2_196k Viewer • Updated Mar 10, 2024 • 143k • 1.9k • 246
Running • Featured • 584 • LLM-Perf Leaderboard • Explore LLM performance across hardware configurations
Running on CPU Upgrade • 13.9k • Open LLM Leaderboard • Track, rank and evaluate open LLMs and chatbots
Running • 1.49k • Big Code Models Leaderboard • Explore and submit code model evaluations on a leaderboard