Hybrid MDP based integrated hierarchical Q-learning

SCIENCE CHINA-INFORMATION SCIENCES (2011)

Journal

SCIENCE CHINA-INFORMATION SCIENCES

Volume 54, Issue 11, Pages 2279-2294

Publisher

SCIENCE PRESS

DOI: 10.1007/s11432-011-4332-6

Keywords

reinforcement learning; hierarchical Q-learning; hybrid MDP; temporal abstraction

Funding

National Natural Science Foundation of China [60805029, 60703083]
National Creative Research Groups Science Foundation of China [60721062]
Fundamental Research Foundations for the Central Universities [2010QNA5014]
City University of Hong Kong [7008057, 9360131]
Australian Research Council [DP1095540]

Abstract

As a widely used reinforcement learning method, Q-learning is bedeviled by the curse of dimensionality: the computational complexity grows dramatically with the size of the state-action space. To combat this difficulty, an integrated hierarchical Q-learning framework is proposed, based on a hybrid Markov decision process (MDP) that uses temporal abstraction in place of a simple MDP. The learning process is naturally organized into multiple levels, e.g., a quantitative (lower) level and a qualitative (upper) level, which are modeled as an MDP and a semi-MDP (SMDP), respectively. This hierarchical control architecture constitutes a hybrid MDP that serves as the model for hierarchical Q-learning and bridges the two levels of learning. The proposed hierarchical Q-learning scales well, and the upper-level learning process speeds up learning. Hence the approach is an effective integrated learning and control scheme for complex problems. Experiments are carried out on a puzzle problem in a gridworld environment and a navigation control problem for a mobile robot. The experimental results demonstrate the effectiveness and efficiency of the proposed approach.
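To make the two levels concrete, the sketch below contrasts the standard textbook one-step Q-learning update (an MDP, as at the lower level) with the standard SMDP Q-learning update (as at the upper level), where a temporally extended action runs for several steps before the value is updated. This is an illustration under stated assumptions, not the paper's own algorithm: the abstract does not give the update rules, and the function names `q_update` and `smdp_q_update`, the dictionary-based table, and the values of `alpha` and `gamma` are all hypothetical.

```python
from collections import defaultdict


def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update (MDP; hypothetical lower level)."""
    # Greedy value of the successor state over all primitive actions.
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


def smdp_q_update(Q, s, o, rewards, s_next, options, alpha=0.1, gamma=0.9):
    """Standard SMDP Q-learning update (hypothetical upper level).

    `o` is a temporally extended action that ran for len(rewards) steps,
    collecting one reward per primitive step.
    """
    tau = len(rewards)
    # Discounted return accumulated while the extended action executed.
    ret = sum((gamma ** k) * r for k, r in enumerate(rewards))
    # The successor value is discounted by the full duration tau.
    best_next = max(Q[(s_next, p)] for p in options)
    Q[(s, o)] += alpha * (ret + (gamma ** tau) * best_next - Q[(s, o)])
    return Q[(s, o)]


# Usage: both tables start at zero (assumed initialization).
Q_low = defaultdict(float)
Q_high = defaultdict(float)
low_value = q_update(Q_low, s=0, a="left", r=1.0, s_next=1, actions=["left", "right"])
high_value = smdp_q_update(
    Q_high, s="room_A", o="go_to_door", rewards=[0.0, 0.0, 1.0],
    s_next="room_B", options=["go_to_door", "go_to_goal"],
)
```

The only structural difference between the two updates is the duration `tau`: the upper level is updated once per extended action rather than once per primitive step, which is one common way temporal abstraction reduces the number of decisions that must be learned.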
