Spatial-DISE: A Unified Benchmark for Evaluating Spatial Reasoning in Vision-Language Models Paper • 2510.13394 • Published Oct 15, 2025
Lying with Truths: Open-Channel Multi-Agent Collusion for Belief Manipulation via Generative Montage Paper • 2601.01685 • Published Jan 4
PrefixGuard: From LLM-Agent Traces to Online Failure-Warning Monitors Paper • 2605.06455 • Published 7 days ago • 3
So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection Paper • 2505.18660 • Published May 24, 2025 • 2