At a Glance
System Overview
Contributions
Real-World Autonomous Execution
Continuous Success Achieved by RISE
Autonomous execution on three complex tasks.
Comparison: RISE vs. Baselines
Left: RISE (Ours). Right: Baselines (composite video including RECAP, π0.5, π0.5+DSRL).
Robustness & Generalization
Methodology
Key Component #1: Compositional World Model
Key Component #2: Self-Improving Loop
Bottom: Training stage. The behavior policy is then trained to generate appropriate actions under an advantage-conditioning scheme.
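The advantage-conditioning scheme can be illustrated with a minimal sketch: each transition's advantage is discretized into a conditioning token appended to the observation, the policy is fit on these conditioned inputs, and at deployment the policy is queried with the "high-advantage" token. All names, the binary token design, and the linear least-squares policy below are illustrative assumptions, not the paper's actual architecture or loss.

```python
# Hypothetical sketch of advantage-conditioned policy training.
# The binary token, linear policy, and least-squares fit are all
# illustrative stand-ins, not RISE's actual implementation.
import numpy as np

rng = np.random.default_rng(0)

def advantage_token(adv, threshold=0.0):
    """Map a scalar advantage to a binary conditioning token."""
    return 1.0 if adv > threshold else 0.0

# Toy dataset of (observation, action, advantage) tuples from rollouts.
obs = rng.normal(size=(256, 4))
true_w = rng.normal(size=(5, 2))      # 4 obs dims + 1 token -> 2 action dims
adv = rng.normal(size=256)
tokens = np.array([advantage_token(a) for a in adv])
x = np.concatenate([obs, tokens[:, None]], axis=1)
actions = x @ true_w + 0.01 * rng.normal(size=(256, 2))

# Linear policy fit by least squares on advantage-conditioned inputs.
w, *_ = np.linalg.lstsq(x, actions, rcond=None)

# At deployment, always condition on the high-advantage token (1.0),
# steering the policy toward behavior associated with positive advantage.
test_obs = rng.normal(size=(1, 4))
test_x = np.concatenate([test_obs, np.ones((1, 1))], axis=1)
predicted_action = test_x @ w
```

The design choice here is that conditioning turns policy improvement into supervised learning: no policy-gradient estimator is needed, only labeled rollouts with advantage tokens.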
Results
Main Results
Through world-model imagination, RISE outperforms all baselines, with absolute success-rate gains of +35% on dynamic brick sorting, +45% on backpack packing, and +35% on box closing.
Dynamic Brick Sorting: Success (%)
Backpack Packing: Success (%)
Box Closing: Success (%)
Ablations
Ablation studies conducted on the Dynamic Brick Sorting task.
Left: An offline data ratio of 0.6 best balances demonstrations with online rollouts.
Right: Both online actions and states are essential; the full RISE with all components achieves the best performance.
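One plausible reading of the offline-ratio ablation is batch mixing: each training batch draws a fixed fraction of samples from offline demonstrations and the remainder from online rollouts. The sketch below encodes that interpretation; the function name, the batch size, and the assumption that the ratio is a per-batch sampling fraction are all hypothetical.

```python
# Hypothetical sketch of mixing offline demonstrations with online
# rollouts at a fixed ratio (assumed interpretation of the ablation).
import numpy as np

rng = np.random.default_rng(0)

def sample_batch(offline_data, online_data, batch_size=64, offline_ratio=0.6):
    """Draw a mixed batch: `offline_ratio` of samples come from
    demonstrations, the rest from online rollouts."""
    n_off = int(round(batch_size * offline_ratio))
    off_idx = rng.integers(len(offline_data), size=n_off)
    on_idx = rng.integers(len(online_data), size=batch_size - n_off)
    return np.concatenate([offline_data[off_idx], online_data[on_idx]])

demos = np.zeros((100, 3))     # placeholder demonstration transitions
rollouts = np.ones((100, 3))   # placeholder online-rollout transitions
batch = sample_batch(demos, rollouts)
```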
Ablation: Offline data ratio
Ablation: Online action/state integration
Dynamics Model Comparisons
Abstract
Despite sustained scaling of model capacity and data acquisition, Vision–Language–Action (VLA) models remain brittle in contact-rich and dynamic manipulation tasks, where minor execution deviations can compound into failures. While reinforcement learning (RL) offers a principled path to robustness, on-policy RL in the physical world is constrained by safety risks, hardware cost, and environment resets. To bridge this gap, we present RISE, a scalable framework for robotic reinforcement learning via imagination. At its core is a Compositional World Model that (i) predicts multi-view futures via a controllable dynamics model, and (ii) evaluates imagined outcomes with a progress value model, producing informative advantages for policy improvement. This compositional design allows state prediction and value estimation to be handled by distinct, best-suited architectures and objectives. These components are integrated into a closed-loop self-improving pipeline that continuously generates imagined rollouts, estimates advantages, and updates the policy in imagination, without costly physical interaction. Across three challenging real-world tasks, RISE yields significant improvements over prior art, with absolute performance gains of more than +35% on dynamic brick sorting, +45% on backpack packing, and +35% on box closing.
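The imagine–evaluate–improve cycle from the abstract can be sketched end to end in a toy form: a dynamics model rolls the policy forward in imagination, a progress value model scores the imagined states, and the policy is updated whenever the estimated advantage over the current baseline is positive. Every model below is a deliberately simple stand-in (a linear dynamics rule, a distance-based value, a scalar-gain policy), not the paper's actual networks.

```python
# Minimal sketch of the closed-loop "imagine -> evaluate -> improve"
# cycle. All models are toy stand-ins, not RISE's actual components.
import numpy as np

rng = np.random.default_rng(0)

def dynamics_model(state, action):
    """Toy stand-in for the controllable dynamics model."""
    return 0.9 * state + 0.1 * action

def progress_value(state, goal):
    """Toy progress value model: negative distance to the goal."""
    return -np.linalg.norm(state - goal)

def imagine_rollout(policy_gain, state, goal, horizon=10):
    """Roll the policy forward entirely inside the world model and
    accumulate the progress value of the imagined states."""
    total = 0.0
    for _ in range(horizon):
        action = policy_gain * (goal - state)   # simple proportional policy
        state = dynamics_model(state, action)
        total += progress_value(state, goal)
    return total

goal = np.ones(3)
start = np.zeros(3)

# Self-improving loop: perturb the policy, score it in imagination,
# and keep any candidate with positive advantage over the baseline --
# no physical interaction is required at any point.
policy_gain = 0.1
baseline = imagine_rollout(policy_gain, start, goal)
for _ in range(50):
    candidate = policy_gain + rng.normal(scale=0.1)
    value = imagine_rollout(candidate, start, goal)
    advantage = value - baseline
    if advantage > 0:
        policy_gain, baseline = candidate, value
```

Because rollouts are deterministic here, the loop can only accept strict improvements, so the final policy is at least as good (in imagination) as the initial one.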
BibTeX
@article{rise2026,
title={RISE: Self-Improving Robot Policy with Compositional World Model},
author={Yang, Jiazhi and Lin, Kunyang and Li, Jinwei and Zhang, Wencong and Lin, Tianwei and Wu, Longyan and Su, Zhizhong and Zhao, Hao and Zhang, Ya-Qin and Chen, Li and Luo, Ping and Yue, Xiangyu and Li, Hongyang},
year={2026}
}