Pro-HOI: Perceptive Root-guided Humanoid-Object Interaction

Published in arXiv (In Submission), 2026

Reliable Humanoid-Object Interaction (HOI) is hindered by the lack of generalized control interfaces and robust closed-loop perception mechanisms.

We introduce Perceptive Root-guided Humanoid-Object Interaction (Pro-HOI), a generalizable framework for robust humanoid loco-manipulation. Our approach consists of three key components:

  1. Motion Optimization: We collect box-carrying motions suitable for real-world deployment and reduce their penetration artifacts by optimizing them with a Signed Distance Field (SDF) loss.

  2. Training Framework: We propose a novel policy conditioning scheme that conditions on the desired root trajectory while using the reference motion exclusively as a reward. This design eliminates the need for intricate reward tuning and establishes the root trajectory as a universal interface for high-level planners, enabling simultaneous navigation and loco-manipulation.

  3. Perception Module: To ensure operational reliability, we incorporate a persistent object estimation module that fuses real-time detection with a digital twin. This allows the robot to autonomously detect slippage and trigger re-grasping maneuvers.
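
To make the Signed Distance Field loss in component 1 concrete, the following is a minimal sketch and not the authors' implementation. It assumes the carried object is an axis-aligned box centred at the origin of its own frame, that sampled body points are already expressed in that frame, and that penetration is penalized by mean squared depth. The names `box_sdf`, `penetration_loss`, and `margin` are hypothetical and chosen only for illustration.

```python
import math


def box_sdf(point, half_extents):
    """Signed distance from a point to an axis-aligned box centred at the origin.

    Negative inside the box, zero on the surface, positive outside.
    """
    # Per-axis offset past the box faces (negative when inside along that axis).
    q = [abs(p) - h for p, h in zip(point, half_extents)]
    # Euclidean distance to the surface for points outside the box.
    outside = math.sqrt(sum(max(c, 0.0) ** 2 for c in q))
    # Negative depth to the nearest face for points inside the box.
    inside = min(max(q), 0.0)
    return outside + inside


def penetration_loss(points, half_extents, margin=0.0):
    """Mean squared penetration depth of body points into the box.

    Points whose signed distance is at least `margin` contribute zero.
    `margin` is an assumed safety offset, not a parameter from the paper.
    """
    depths = [max(margin - box_sdf(p, half_extents), 0.0) for p in points]
    return sum(d * d for d in depths) / len(depths)


if __name__ == "__main__":
    half_extents = (0.2, 0.15, 0.1)  # assumed box half-sizes in metres
    body_points = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0)]  # one inside, one outside
    print(penetration_loss(body_points, half_extents))
```

In a motion optimization loop, a term of this kind would typically be added to a tracking objective so that hand and arm points are pushed out of the box while the motion stays close to the collected reference; how the terms are weighted is not specified here.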

Empirical validation on a Unitree G1 robot demonstrates that Pro-HOI significantly outperforms baselines in generalization and robustness, achieving reliable long-horizon execution in complex real-world scenarios.

Authors: Yuhang Lin, Jiyuan Shi, Dewei Wang, Jipeng Kong, Yong Liu, Chenjia Bai, Xuelong Li

arXiv: https://arxiv.org/abs/2603.01126

Recommended citation: Lin, Y., Shi, J., Wang, D., Kong, J., Liu, Y., Bai, C., & Li, X. (2026). "Pro-HOI: Perceptive Root-guided Humanoid-Object Interaction." arXiv preprint arXiv:2603.01126.