Journal
NEURAL COMPUTING & APPLICATIONS
Volume 23, Issue 7-8, Pages 1843-1850
Publisher
SPRINGER LONDON LTD
DOI: 10.1007/s00521-012-1249-y
Keywords
Adaptive dynamic programming; Reinforcement learning; Policy iteration; Adaptive optimal control; Neural network; Online control; Nonlinear system
Funding
- National Natural Science Foundation of China [61034002, 61233001, 61273140]
This paper develops an online algorithm based on policy iteration for optimal control of continuous-time nonlinear systems with an infinite-horizon cost. The method employs a discounted value function, which is considered a more general case for optimal control problems. Without requiring knowledge of the internal system dynamics, the algorithm converges uniformly online to the optimal control, which solves the modified Hamilton-Jacobi-Bellman equation. Using two neural networks, the algorithm finds suitable approximations of both the optimal control and the optimal cost. Uniform convergence to the optimal control is shown, guaranteeing the stability of the nonlinear system. A simulation example is provided to illustrate the effectiveness and applicability of the approach.
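For orientation, the discounted problem described in the abstract is often written in the form sketched below. This is a sketch under stated assumptions, not the paper's own notation: the input-affine dynamics, the quadratic control penalty, and the symbols f, g, Q, R, \gamma and T are all assumed here for illustration. The last line shows why an interval form of the cost can be evaluated from measured trajectories without the drift term f, which is one common way to avoid needing the internal dynamics.

```latex
% Assumed setting (not quoted from the paper): input-affine dynamics
%   \dot{x} = f(x) + g(x)\,u, where f is the internal (drift) dynamics.
% Assumed discounted infinite-horizon cost, discount rate \gamma > 0,
% Q(x) \ge 0, R symmetric positive definite:
V(x(t)) = \int_{t}^{\infty} e^{-\gamma(\tau - t)}
          \left[ Q(x(\tau)) + u(\tau)^{\top} R\, u(\tau) \right] d\tau

% Differentiating along trajectories gives \dot{V} = \gamma V - (Q + u^{\top} R u),
% so the discounted ("modified") HJB equation under these assumptions is:
0 = \min_{u} \left[ Q(x) + u^{\top} R\, u
    + \nabla V^{*}(x)^{\top} \bigl( f(x) + g(x)\,u \bigr)
    - \gamma V^{*}(x) \right]

% Stationarity in u yields the optimal control:
u^{*}(x) = -\tfrac{1}{2} R^{-1} g(x)^{\top} \nabla V^{*}(x)

% Interval form used for policy evaluation; f does not appear explicitly:
V(x(t)) = \int_{t}^{t+T} e^{-\gamma(\tau - t)}
          \left[ Q(x(\tau)) + u(\tau)^{\top} R\, u(\tau) \right] d\tau
          + e^{-\gamma T}\, V(x(t+T))
```

In a policy-iteration scheme of this kind, the interval form would be used to evaluate the current policy, and the stationarity condition would be used to improve it; setting \gamma = 0 recovers the undiscounted case.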