
Language Conditioned Diffusion Planner for Quadruped Mobile Manipulation

He Liang (University of Oxford)

Loco-manipulation planning skills are crucial for expanding the utility of robots in industrial
and everyday environments. These skills can be evaluated by a system’s ability to coordinate
complex whole-body movements and multiple contact interactions to solve diverse tasks. This
requires two key capabilities: 1) precise coordination between the quadruped’s locomotion and
the arm’s manipulation, and 2) autonomous decision-making based on task instructions, visual
inputs, and state observations. This report presents a hierarchical control and planning framework
for a quadruped mobile manipulation system that addresses both requirements. The framework
consists of three main components: a model-based low-level controller, a language-guided data
generation pipeline, and a high-level diffusion-based policy. The low-level controller, built on
Nonlinear Model Predictive Control (NMPC) and hierarchical optimisation, ensures precise and
stable robot movements by continuously adjusting control inputs based on the system’s dynamics.
The data generation pipeline leverages large language models to convert task instructions into
commands, creating a comprehensive dataset of labelled trajectories. The high-level diffusion-based
policy generates adaptive action sequences using visual inputs, language descriptions, and
proprioception. The low-level controller was validated through simulation and real-world experiments,
demonstrating both high control precision and efficient coordination between the quadruped base
and the arm. The high-level diffusion-based policy and data generation pipeline were evaluated
in a simplified environment, demonstrating robust generalisation across diverse tasks. The policy
seamlessly integrates visual, language, and proprioceptive inputs, enabling real-time replanning
in response to environmental changes, thereby ensuring adaptability and robust task execution.
The retry-enabled data generation process further enhances the policy’s ability to handle complex
scenarios and recover from failures, validating the framework’s effectiveness for real-world robotic
applications.

 

 
