☆ 4.6 Article

Actor-Critic Off-Policy Learning for Optimal Control of Multiple-Model Discrete-Time Systems

IEEE TRANSACTIONS ON CYBERNETICS (2018)

Journal

IEEE TRANSACTIONS ON CYBERNETICS

Volume 48, Issue 1, Pages 29-40

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TCYB.2016.2618926

Keywords

Actor-critic; adaptive self-organizing map (SOM); multiple-model; off-policy reinforcement learning (RL); optimal control

Categories

Automation & Control Systems; Computer Science, Artificial Intelligence; Computer Science, Cybernetics

Funding

NSF [ECCS-1405173, IIS-1208623]
ONR [N00014-13-1-0562, N000141410718]
Czech Ministry of Education, Youth and Sports [LO1506]
Czech Science Foundation [GA 15-12068S]
Fulbright Program

Abstract

In this paper, motivated by human neurocognitive experiments, a model-free off-policy reinforcement learning algorithm is developed to solve the optimal tracking control problem for multiple-model linear discrete-time systems. First, an adaptive self-organizing map neural network is used to determine the system behavior from measured data and to assign a responsibility signal to each of the system's possible behaviors. A new model is added if a sudden change in system behavior is detected from the measured data and that behavior has not been detected previously. A value function is represented by partially weighted value functions. Then, the off-policy iteration algorithm is generalized to multiple-model learning to find a solution without any knowledge of the system dynamics or the reference trajectory dynamics. The off-policy approach helps to increase data efficiency and speed of tuning, since a stream of experiences obtained from executing a behavior policy is reused to sequentially update several value functions corresponding to different learning policies. Two numerical examples demonstrate the performance of the off-policy algorithm.
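
The abstract describes three ideas that a small example can make concrete: responsibility signals over known behaviors, a value function formed as a responsibility-weighted sum of per-model value functions, and the addition of a new model when an unfamiliar behavior appears. The Python sketch below illustrates these ideas only and is not the paper's algorithm. It assumes quadratic per-model values of the form x^T P_j x, and it replaces the adaptive self-organizing map with a softmax over distances to stored prototype vectors. The names `responsibilities`, `blended_value`, `maybe_add_model`, `temperature`, and `novelty_threshold` are hypothetical.

```python
import numpy as np


def responsibilities(x, prototypes, temperature=1.0):
    """Soft assignment of measurement x to each stored behavior prototype.

    Assumption: a softmax over negative squared distances stands in for
    the adaptive self-organizing map mentioned in the abstract.
    """
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    logits = -d2 / temperature
    logits = logits - logits.max()  # shift for numerical stability
    w = np.exp(logits)
    return w / w.sum()


def blended_value(x, P_list, resp):
    """Responsibility-weighted value: V(x) = sum_j resp_j * x^T P_j x.

    Assumption: each model has a quadratic value function with kernel P_j.
    """
    return float(sum(r * (x @ P @ x) for r, P in zip(resp, P_list)))


def maybe_add_model(x, prototypes, novelty_threshold):
    """Append x as a new prototype if it is far from every known one.

    Assumption: a fixed Euclidean distance threshold stands in for the
    detection of a sudden, previously unseen change in behavior.
    """
    d = np.sqrt(np.sum((prototypes - x) ** 2, axis=1))
    if d.min() > novelty_threshold:
        return np.vstack([prototypes, x]), True
    return prototypes, False


# Two known behaviors, each with its own (hypothetical) value kernel.
prototypes = np.array([[0.0, 0.0], [4.0, 4.0]])
P_list = [np.eye(2), 2.0 * np.eye(2)]

x = np.array([0.1, -0.1])
resp = responsibilities(x, prototypes)
value = blended_value(x, P_list, resp)

# A measurement far from both prototypes triggers a new model.
new_prototypes, added = maybe_add_model(np.array([10.0, 10.0]), prototypes, 3.0)
```

In this sketch the measurement `x` lies close to the first prototype, so nearly all of the responsibility falls on the first model and the blended value is dominated by that model's kernel. In a learning loop, the same responsibilities could also weight how strongly each model's value function is updated from a shared batch of behavior-policy data, which is the data-reuse benefit the abstract attributes to the off-policy approach.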
