AutoBench Goes Scientific: Rigorous Validation for a Dynamic, Open-Source LLM Benchmark
AutoBench Third Run: Revolutionizing LLM Evaluation with Record-Breaking Scale, Accuracy, and a New Home at autobench.org