SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents Paper • 2602.12984 • Published 11 days ago