☆ 4.1 Article

Approximate Robust Policy Iteration Using Multilayer Perceptron Neural Networks for Discounted Infinite-Horizon Markov Decision Processes With Uncertain Correlated Transition Matrices

IEEE TRANSACTIONS ON NEURAL NETWORKS (2010)

Journal

IEEE TRANSACTIONS ON NEURAL NETWORKS

Volume 21, Issue 8, Pages 1270-1280

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TNN.2010.2050334

Keywords

Approximate dynamic programming; Markov decision processes (MDP); multilayer perceptrons; uncertain transition matrix

Funding

National Science Foundation [ECS-0401405, ECS-0702057]
National Science Foundation of China [50 828 701]

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Abstract

We study finite-state, finite-action, discounted infinite-horizon Markov decision processes with uncertain correlated transition matrices in deterministic policy spaces. Existing robust dynamic programming methods cannot be extended to solving this class of general problems. In this paper, based on a robust optimality criterion, an approximate robust policy iteration using a multilayer perceptron neural network is proposed. It is proven that the proposed algorithm converges in finite iterations, and it converges to a stationary optimal or near-optimal policy in a probability sense. In addition, we point out that sometimes even a direct enumeration may not be applicable to addressing this class of problems. However, a direct enumeration based on our proposed maximum value approximation over the parameter space is a feasible approach. We provide further analysis to show that our proposed algorithm is more efficient than such an enumeration method for various scenarios.

Approximate Robust Policy Iteration Using Multilayer Perceptron Neural Networks for Discounted Infinite-Horizon Markov Decision Processes With Uncertain Correlated Transition Matrices

Journal

IEEE TRANSACTIONS ON NEURAL NETWORKS

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Approximate Robust Policy Iteration Using Multilayer Perceptron Neural Networks for Discounted Infinite-Horizon Markov Decision Processes With Uncertain Correlated Transition Matrices

Journal

IEEE TRANSACTIONS ON NEURAL NETWORKS

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper