arxiv:2603.04639

RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies

Published on Mar 4 · Submitted by Dai on Mar 9
Abstract

RoboMME presents a large-scale standardized benchmark for evaluating vision-language-action models in long-horizon robotic manipulation tasks requiring memory mechanisms.

AI-generated summary

Memory is critical for long-horizon and history-dependent robotic manipulation. Such tasks often involve counting repeated actions or manipulating objects that become temporarily occluded. Recent vision-language-action (VLA) models have begun to incorporate memory mechanisms; however, their evaluations remain confined to narrow, non-standardized settings. This limits their systematic understanding, comparison, and progress measurement. To address these challenges, we introduce RoboMME: a large-scale standardized benchmark for evaluating and advancing VLA models in long-horizon, history-dependent scenarios. Our benchmark comprises 16 manipulation tasks constructed under a carefully designed taxonomy that evaluates temporal, spatial, object, and procedural memory. We further develop a suite of 14 memory-augmented VLA variants built on the π0.5 backbone to systematically explore different memory representations across multiple integration strategies. Experimental results show that the effectiveness of memory representations is highly task-dependent, with each design offering distinct advantages and limitations across different tasks. Videos and code can be found at our website https://robomme.github.io.

Community

Paper author · Paper submitter

TL;DR: RoboMME is a new benchmark for memory-augmented robotic manipulation, evaluating how well models remember, reason, and act across temporal, spatial, object, and procedural memory.

Paper author · Paper submitter

We have also prepared a Gradio online demo to play with: https://huggingface.co/spaces/HongzeFu/RoboMME
Let's test your memory!



