Abstract
A planning-based inverse reinforcement learning algorithm enables real-world robot manipulation learning from observations alone, achieving efficient online transfer learning without prior knowledge or pre-training.
Observational learning requires an agent to learn to perform a task by referencing only observations of the performed task. This work investigates the equivalent setting in real-world robot learning, where access to hand-designed rewards and demonstrator actions is not assumed. To address this data-constrained setting, this work presents a planning-based Inverse Reinforcement Learning (IRL) algorithm for world modeling from observation and interaction alone. Experiments conducted entirely in the real world demonstrate that this paradigm is effective for learning image-based manipulation tasks from scratch in under an hour, without assuming prior knowledge, pre-training, or data of any kind beyond task observations. Moreover, this work demonstrates that the learned world model representation supports online transfer learning in the real world from scratch. Compared to existing approaches, including IRL, RL, and Behavior Cloning (BC), which rely on more restrictive assumptions, the proposed approach achieves significantly greater sample efficiency and higher success rates, enabling a practical path forward for online world modeling and planning from observation and interaction. Videos and more at: https://uwrobotlearning.github.io/mpail2/.
Community

This work investigates real-world observational learning, or Inverse Reinforcement Learning from Observation (IRLfO), in which neither demonstration actions nor hand-designed rewards are assumed available. Previously too sample-inefficient for real-world training, IRLfO is made practical for visual manipulation by MPAIL2, a planning-based algorithm that learns a world model from task observations and interaction alone.
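To make the IRLfO setting concrete, the following is a highly simplified conceptual sketch, not the authors' MPAIL2 implementation: a reward is inferred purely from expert *state* observations (here, a nearest-observation distance standing in for a learned discriminator), a world model is fit from the agent's own random interaction, and a random-shooting planner acts against the learned model and inferred reward. All names, the 1-D toy dynamics, and the goal value are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D world: state s, action a in [-1, 1], true dynamics s' = s + 0.1 * a.
# "Expert" data is observation-only: states clustered near the goal s* = 1.0,
# with no demonstrator actions and no hand-designed reward (the IRLfO setting).
GOAL = 1.0
expert_states = GOAL + 0.05 * rng.standard_normal(200)

def true_step(s, a):
    """Ground-truth environment dynamics (unknown to the agent)."""
    return s + 0.1 * a

def fit_world_model(transitions):
    """Fit a linear world model s' = w0*s + w1*a from interaction data."""
    X = np.array([[s, a] for s, a, _ in transitions])
    y = np.array([sp for _, _, sp in transitions])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w  # expect w[0] ~ 1.0, w[1] ~ 0.1

def reward(s):
    """Observation-only reward: closeness of a state to the expert
    observations (a crude stand-in for a learned discriminator)."""
    return -np.min(np.abs(expert_states - s))

def plan(s0, w, horizon=5, n_samples=128):
    """Random-shooting MPC in the learned model under the inferred reward."""
    actions = rng.uniform(-1, 1, size=(n_samples, horizon))
    returns = np.zeros(n_samples)
    for i in range(n_samples):
        s = s0
        for a in actions[i]:
            s = w[0] * s + w[1] * a  # roll out in the *learned* model
            returns[i] += reward(s)
    return actions[np.argmax(returns), 0]  # execute first action of best plan

# 1) Collect random interaction, 2) fit the world model, 3) plan online.
transitions = []
s = 0.0
for _ in range(100):
    a = rng.uniform(-1, 1)
    sp = true_step(s, a)
    transitions.append((s, a, sp))
    s = sp
w = fit_world_model(transitions)

s = 0.0
for _ in range(30):
    s = true_step(s, plan(s, w))
print(f"final state: {s:.2f} (goal {GOAL})")
```

The design point this sketch mirrors is the division of labor described above: the reward comes only from observations of the task being performed, while the dynamics come only from the agent's own interaction, so neither demonstrator actions nor a hand-designed reward is ever required.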
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Beyond Imitation: Reinforcement Learning-Based Sim-Real Co-Training for VLA Models (2026)
- TwinRL-VLA: Digital Twin-Driven Reinforcement Learning for Real-World Robotic Manipulation (2026)
- Squint: Fast Visual Reinforcement Learning for Sim-to-Real Robotics (2026)
- Learning from Demonstrations via Capability-Aware Goal Sampling (2026)
- RISE: Self-Improving Robot Policy with Compositional World Model (2026)
- SPARR: Simulation-based Policies with Asymmetric Real-world Residuals for Assembly (2026)
- Failure-Aware RL: Reliable Offline-to-Online Reinforcement Learning with Self-Recovery for Real-World Manipulation (2026)