Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration? Paper • 2603.03202 • Published Mar 3 • 17
Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5 Paper • 2602.14457 • Published Feb 16 • 29
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security Paper • 2601.18491 • Published Jan 26 • 125
IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks Paper • 2506.16402 • Published Jun 19, 2025 • 1
Geometrically-Constrained Agent for Spatial Reasoning Paper • 2511.22659 • Published Nov 27, 2025 • 41
Geometrically-Constrained Agent for Spatial Reasoning Paper • 2511.22659 • Published Nov 27, 2025 • 41