When Benign Inputs Lead to Severe Harms: Eliciting Unsafe Unintended Behaviors of Computer-Use Agents Paper • 2602.08235 • Published 26 days ago
When Actions Go Off-Task: Detecting and Correcting Misaligned Actions in Computer-Use Agents Paper • 2602.08995 • Published 26 days ago • 2
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model Paper • 2509.00676 • Published Aug 31, 2025 • 85
Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge Paper • 2506.21506 • Published Jun 26, 2025 • 52
WebDreamer Collection Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents • 6 items • Updated Apr 14, 2025 • 6