Journal
JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS
Volume 80, Issue 3-4, Pages 625-640
Publisher
SPRINGER
DOI: 10.1007/s10846-015-0196-0
Keywords
Reinforcement learning; Biped locomotion; Movement adaptation
Funding
- FEDER - Operational Program Competitive Factors - COMPETE
- FCT - Portuguese Science Foundation [UMINHO/BI/40/2012, UMINHO/BI/69/2013, PTDC/EEA-CRO/100655/2008, PEst-OE/EEI/UI0319/2014]
- Fundação para a Ciência e a Tecnologia [PTDC/EEA-CRO/100655/2008, PEst-OE/EEI/UI0319/2014] Funding Source: FCT
Abstract
In this work, reinforcement learning techniques are implemented and compared to optimize biped locomotion. Central Pattern Generators (CPGs) and Dynamic Movement Primitives (DMPs) are combined to easily produce complex joint trajectories for a simulated DARwIn-OP humanoid robot. Two reinforcement learning algorithms, Policy Learning by Weighting Exploration with the Returns (PoWER) and Path Integral Policy Improvement with Covariance Matrix Adaptation (PI2-CMA), are applied to the simulated DARwIn-OP to find DMP parameters that maximize frontal velocity in situations that demand adaptation from the controller, such as walking on slopes of different inclinations. Additionally, elitism is introduced into PI2-CMA to improve the algorithm's performance. Results show that these approaches enable easy adaptation of the DARwIn-OP to new situations, and they demonstrate flexibility in generating and adapting trajectories for locomotion.
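The abstract names two building blocks but the record contains no code, so the following is a minimal, illustrative sketch only, not the authors' implementation: a one-dimensional discrete DMP (in the standard Ijspeert-style formulation) whose forcing-term weights are the learnable parameters, and a PoWER-style reward-weighted parameter update. All function names, gain values, and the basis-width heuristic are assumptions for illustration; the paper's actual controller couples many joints through CPGs and also uses PI2-CMA with elitism, which is not shown here.

```python
import numpy as np

def dmp_rollout(w, y0=0.0, g=1.0, tau=1.0, dt=0.005,
                alpha=25.0, beta=6.25, alpha_x=8.0):
    """Integrate one 1-D discrete DMP for tau seconds (Euler steps).

    w: weights of the Gaussian basis functions shaping the forcing term.
    Returns the position trajectory as a NumPy array. Gains are the
    common critically damped choice (beta = alpha / 4), assumed here.
    """
    n = len(w)
    # Basis centers spread in phase (canonical) space; widths from a
    # simple heuristic (an assumption, not taken from the paper).
    c = np.exp(-alpha_x * np.linspace(0.0, 1.0, n))
    h = n**1.5 / c
    x, y, v = 1.0, y0, 0.0       # phase, position, scaled velocity
    traj = [y]
    for _ in range(int(tau / dt)):
        psi = np.exp(-h * (x - c) ** 2)
        # Forcing term: weighted basis average, gated by the phase so it
        # vanishes as x -> 0 and the goal attractor takes over.
        f = (psi @ w) * x * (g - y0) / (psi.sum() + 1e-10)
        dv = (alpha * (beta * (g - y) - v) + f) / tau
        v += dv * dt
        y += (v / tau) * dt
        x += (-alpha_x * x / tau) * dt   # canonical system: phase 1 -> 0
        traj.append(y)
    return np.array(traj)

def power_update(w, rollouts):
    """Reward-weighted update in the spirit of PoWER.

    rollouts: list of (eps, R) pairs, where eps is the exploration noise
    that was added to w for that rollout and R >= 0 its return. The new
    parameters are w plus the return-weighted average of the noise.
    """
    num = sum(R * eps for eps, R in rollouts)
    den = sum(R for _, R in rollouts) + 1e-10
    return w + num / den
```

In a learning loop one would repeatedly perturb `w`, roll out the DMP on the robot (here the reward would come from the measured frontal velocity), and call `power_update` on the best-scoring rollouts; with zero weights the DMP above simply converges to the goal `g`, which is the baseline the learned forcing term then shapes.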