Learning Soccer Skills for Humanoid Robots: A Progressive Perception-Action Framework

Published in arXiv (In Submission), 2026

Soccer presents a significant challenge for humanoid robots, demanding tightly integrated perception-action capabilities for tasks like perception-guided kicking and whole-body balance control. Existing approaches suffer either from inter-module instability in modular pipelines or from conflicting training objectives in end-to-end frameworks.

We propose Perception-Action integrated Decision-making (PAiD), a progressive architecture that decomposes soccer skill acquisition into three stages:

  1. Motion-Skill Acquisition via human motion tracking
  2. Lightweight Perception-Action Integration for positional generalization
  3. Physics-Aware Sim-to-Real Transfer

This staged decomposition establishes stable foundational skills, avoids reward conflicts during perception integration, and minimizes sim-to-real gaps.

Experiments on the Unitree G1 demonstrate high-fidelity human-like kicking with robust performance under diverse conditions (static or rolling balls, various positions, and disturbances) and consistent execution across indoor and outdoor scenarios. Our divide-and-conquer strategy advances robust humanoid soccer capabilities and offers a scalable framework for complex embodied skill acquisition.

Authors: Jipeng Kong, Xinzhe Liu, Yuhang Lin, Jinrui Han, Sören Schwertfeger, Chenjia Bai, Xuelong Li

Project Page: https://soccer-humanoid.github.io/

arXiv: https://arxiv.org/abs/2602.05310

Recommended citation: Kong, J., Liu, X., Lin, Y., Han, J., Schwertfeger, S., Bai, C., & Li, X. (2026). "Learning Soccer Skills for Humanoid Robots: A Progressive Perception-Action Framework." arXiv preprint arXiv:2602.05310.