Article

CPG-Based Hierarchical Locomotion Control for Modular Quadrupedal Robots Using Deep Reinforcement Learning

Journal

IEEE ROBOTICS AND AUTOMATION LETTERS
Volume 6, Issue 4, Pages 7193-7200

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/LRA.2021.3092647

Keywords

Modular robots; deep reinforcement learning; locomotion control

Funding

  1. National Natural Science Foundation of China [51922059, 51775305]
  2. Beijing Natural Science Foundation [JQ19010]
  3. State Key Laboratory of Tribology [SKLT2020D22]
  4. National Youth Talent Support Program
  5. State Key Laboratory of Mechanical System and Vibration [MSV202007]

This study proposes a two-level hierarchical locomotion framework for modular quadrupedal robots that combines a low-level CPG controller with a high-level neural network to learn a variety of locomotion tasks through deep reinforcement learning. Simulation results show that, with limited prior knowledge, the method learns multiple locomotion skills and achieves higher sample efficiency and robustness than the baseline methods.
Modular robots have the potential for an unmatched ability to perform versatile and robust locomotion. However, designing effective and adaptive locomotion controllers for modular robots is challenging, resulting in a number of model-based methods that typically require various forms of prior knowledge. Deep reinforcement learning (DRL) provides a promising model-free approach for locomotion control by trial and error. However, current DRL methods often require extensive interaction data, hindering many possible applications. In this letter, a novel two-level hierarchical locomotion framework for modular quadrupedal robots is proposed. The approach combines a low-level central pattern generator (CPG)-based controller with a high-level neural network to learn a variety of locomotion tasks using DRL. The low-level CPG controller is pre-optimized to generate stable rhythmic walking gaits, while the high-level network is trained to modulate the CPG parameters to achieve task goals based on high-dimensional inputs, including the robot states and user commands. The proposed approach is employed on a simulated modular quadruped. With a limited amount of prior knowledge, the proposed method is demonstrated to be capable of learning a variety of locomotion skills such as velocity tracking, path following, and navigating to a target. Simulation results show that the proposed method achieves higher sample efficiency than the model-free DRL method and is substantially more robust than the baseline methods to external disturbances and irregular terrain.
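The two-level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the phase-oscillator form of the CPG, the trot phase offsets, the choice of frequency and amplitude as the modulated parameters, and every name below (`PhaseCPG`, `high_level_policy`, `rollout`) are assumptions made for illustration. The linear rule in `high_level_policy` is only a stand-in for the neural network that the paper trains with DRL.

```python
import math

# Assumed trot gait: diagonal legs share a phase (offsets in fractions of a cycle).
TROT_OFFSETS = (0.0, 0.5, 0.5, 0.0)  # order: FL, FR, RL, RR (hypothetical)


class PhaseCPG:
    """Low-level rhythm generator: one shared phase plus fixed per-leg offsets."""

    def __init__(self, offsets=TROT_OFFSETS):
        self.offsets = offsets
        self.phase = 0.0  # measured in cycles, kept in [0, 1)

    def step(self, frequency_hz, amplitude_rad, dt):
        # Advance the shared phase, then emit one joint target per leg.
        self.phase = (self.phase + frequency_hz * dt) % 1.0
        return [
            amplitude_rad * math.sin(2.0 * math.pi * (self.phase + offset))
            for offset in self.offsets
        ]


def high_level_policy(observation, command_speed):
    """Placeholder for the trained network: maps inputs to CPG parameters.

    The observation is ignored here; a learned policy would use it.
    """
    frequency_hz = 1.0 + 0.5 * command_speed
    amplitude_rad = min(0.6, 0.2 + 0.2 * command_speed)
    return frequency_hz, amplitude_rad


def rollout(command_speed, steps=100, dt=0.01):
    """Run the hierarchy open loop and return the per-step joint targets."""
    cpg = PhaseCPG()
    targets = []
    for _ in range(steps):
        observation = targets[-1] if targets else [0.0] * 4
        frequency_hz, amplitude_rad = high_level_policy(observation, command_speed)
        targets.append(cpg.step(frequency_hz, amplitude_rad, dt))
    return targets
```

Calling `rollout(1.0)` returns 100 lists of four joint targets. The point of the split is that the high-level policy only picks a few CPG parameters per step rather than raw joint commands, which is one plausible reading of why such a hierarchy could need less interaction data.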
