Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering Paper • 2604.08224 • Published Apr 9 • 51
Article Compute and Competition in AI: Different FlOPs for Different Folks sasha • Feb 12 • 15
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published Jan 30 • 111
Article There is no such thing as a tokenizer-free lunch catherinearnett • Sep 25, 2025 • 98
Article An Analysis of Multilingual Models on Hugging Face catherinearnett • Sep 18, 2025 • 5
Article 🇵🇭 FilBench - Can LLMs Understand and Generate Filipino? ljvmiranda921, acocodes, connermanuel, jcblaise, josephimperial, davanstrien, SaylorTwift, clefourrier • Aug 12, 2025 • 23
Reward Bench 2 Collection Datasets, spaces, and models for Reward Bench 2 benchmark and paper! • 11 items • Updated Dec 23, 2025 • 16
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29, 2025 • 99
The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks Paper • 2504.15521 • Published Apr 22, 2025 • 64
SEA-VL: Multicultural VL Dataset for Southeast Asia Collection Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia • 3 items • Updated Apr 12, 2025 • 21
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published Mar 10, 2025 • 101
Bridging the Data Provenance Gap Across Text, Speech and Video Paper • 2412.17847 • Published Dec 19, 2024 • 13
Multilingual LLM Evaluation Collection Multilingual Evaluation Benchmarks • 8 items • Updated Jul 31, 2025 • 34
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark S Collection SEACrowd is a community movement project aimed at centralizing and standardizing AI resources for Southeast Asian languages, cultures, and/or regions. • 3 items • Updated Jun 18, 2024 • 8
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 68