ContextBench: A Benchmark for Context Retrieval in Coding Agents Paper • 2602.05892 • Published Feb 5 • 4
Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence? Paper • 2604.03016 • Published 5 days ago • 26
T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation Paper • 2512.21094 • Published Dec 24, 2025 • 25
Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning Paper • 2509.23873 • Published Sep 28, 2025 • 68