Do Audio-Visual Large Language Models Really See and Hear? Paper • 2604.02605 • Published 8 days ago • 5
PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning Paper • 2603.26653 • Published 14 days ago • 18
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published 21 days ago • 330