GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL Paper • 2602.22190 • Published 15 days ago • 15
PETS: A Principled Framework Towards Optimal Trajectory Allocation for Efficient Test-Time Self-Consistency Paper • 2602.16745 • Published 22 days ago • 8
Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning Paper • 2602.10090 • Published 30 days ago • 51
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning Paper • 2602.08234 • Published Feb 9 • 69
GroupRank: A Groupwise Reranking Paradigm Driven by Reinforcement Learning Paper • 2511.11653 • Published Nov 10, 2025 • 58
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought Paper • 2511.02779 • Published Nov 4, 2025 • 59
ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness Paper • 2504.10514 • Published Apr 10, 2025 • 48
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models Paper • 2410.10139 • Published Oct 14, 2024 • 51
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases Paper • 2407.12784 • Published Jul 17, 2024 • 51
RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models Paper • 2407.05131 • Published Jul 6, 2024 • 26
Democratizing Reasoning Ability: Tailored Learning from Large Language Model Paper • 2310.13332 • Published Oct 20, 2023 • 16
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback Paper • 2309.00267 • Published Sep 1, 2023 • 53