Article
Automation & Control Systems
Yongliang Yang, Hamidreza Modares, Kyriakos G. Vamvoudakis, Wei He, Cheng-Zhong Xu, Donald C. Wunsch
Summary: This article explores an iterative adaptive dynamic programming (ADP) algorithm within the Hamiltonian-driven framework for solving the continuous-time Hamilton-Jacobi-Bellman equation for nonlinear systems. It introduces a novel function, the min-Hamiltonian, and develops an iterative ADP algorithm that accounts for approximation errors during policy evaluation. The article also provides a sufficient condition on the iterative value gradient that guarantees closed-loop stability and convergence to the optimal value, as well as a model-free extension based on off-policy reinforcement learning.
IEEE TRANSACTIONS ON CYBERNETICS
(2022)
Article
Automation & Control Systems
Antonio Sala, Leopoldo Armesto
Summary: This study introduces a new criterion for adaptive meshing in polyhedral partitions to interpolate value functions, employing an initial-condition probability density function, uncertainty propagation, and the temporal-difference error to decide where new points are added. A collection of lemmas justifies the algorithmic proposal, and a comparative analysis highlights its advantages over other options in the literature. The developed methods are applied in simulation examples and an experimental robotic setup.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2022)
Article
Automation & Control Systems
Mingming Ha, Ding Wang, Derong Liu
Summary: This article investigates the stability of the closed-loop system using various control policies generated by value iteration. It develops an offline integrated value iteration scheme with a stability guarantee and an online adaptive dynamic programming algorithm based on the concept of attraction domain. The theoretical and numerical results confirm the effectiveness of the algorithms in maintaining system stability.
IEEE TRANSACTIONS ON CYBERNETICS
(2022)
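The value-iteration recursion whose closed-loop stability these results concern can be sketched for the special case of discrete-time LQR. This is a minimal illustration with made-up system matrices, not the article's algorithm: iterating the Riccati recursion from an arbitrary positive semidefinite initial value.

```python
import numpy as np

# Hypothetical system x_{k+1} = A x_k + B u_k with cost sum x'Qx + u'Ru
# (illustrative matrices only; a discretized double integrator).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

# Value iteration on the quadratic value V_j(x) = x' P_j x:
#   P_{j+1} = Q + A'P_j A - A'P_j B (R + B'P_j B)^{-1} B'P_j A,
# initialized here from P_0 = 0 (any positive semidefinite P_0 works
# under stabilizability/detectability).
P = np.zeros((2, 2))
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # greedy gain for P_j
    P = Q + A.T @ P @ (A - B @ K)                      # Riccati recursion

# Final greedy gain and the residual of the discrete algebraic Riccati
# equation, which should be near zero at convergence.
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
residual = Q + A.T @ P @ A - A.T @ P @ B @ K - P
print(np.max(np.abs(residual)))
```

The intermediate gains `K` need not be stabilizing early on, which is exactly the issue the article's stability analysis of value iteration addresses.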
Article
Management
Cristiano Cervellera
Summary: Approximate dynamic programming (ADP) is a standard tool for solving multistage dynamic optimization problems. This paper investigates the use of ensemble learning in the ADP context to approximate the value function. The ensemble of value function approximations improves accuracy and robustness compared to single models, and can also be used to select good state samples for training.
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
(2023)
Article
Computer Science, Information Systems
Zifeng Gong, Bing He, Chen Hu, Xiaobo Zhang, Weijie Kang
Summary: This paper presents a new scheme for the online solution of a networked multi-agent pursuit-evasion game based on an online adaptive dynamic programming method. Both relative distance and control energy are considered, and the policies that drive the agents to a Nash equilibrium are obtained via the minimax principle.
Article
Automation & Control Systems
Bo Pang, Zhong-Ping Jiang
Summary: This article studies the infinite-horizon adaptive optimal control of continuous-time linear periodic systems and proposes a novel value iteration-based off-policy adaptive dynamic programming algorithm for a general class of systems. The algorithm is proven to uniformly converge to optimal solutions in both model-based and model-free cases, without assuming knowledge of an initial stabilizing controller. Application to a triple inverted pendulum demonstrates the feasibility and effectiveness of the proposed method.
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
(2021)
Article
Automation & Control Systems
Chun Li, Jinliang Ding, Frank L. Lewis, Tianyou Chai
Summary: This paper introduces a novel formulation of the value function to address the optimal tracking problem for nonlinear discrete-time systems and applies it successfully in adaptive dynamic programming algorithms. The optimal control policy can be deduced from this value function, and the optimality of the resulting policy is demonstrated.
Article
Automation & Control Systems
Huaiyuan Jiang, Bin Zhou, Guang-Ren Duan
Summary: This article studies the general policy iteration (GPI) method for the optimal control of discrete-time linear systems. Existing results on the GPI method are recalled and several new properties are established. Based on these properties, a model-based modified GPI algorithm is proposed and its convergence is proven. In addition, a data-driven implementation of the proposed method is introduced that does not require knowledge of the system matrices. Compared with existing results, the proposed algorithm further relaxes the condition needed to initialize the GPI-based algorithm, and its effectiveness is verified through a simulation example.
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL
(2022)
Article
Automation & Control Systems
Geyang Xiao, Huaguang Zhang
Summary: This article focuses on the convergence property and error bounds analysis of value iteration adaptive dynamic programming for continuous-time nonlinear systems. It introduces a contraction assumption to describe the relationship between the total value function and the single integral step cost. The convergence property of value iteration is proved under an arbitrary positive semidefinite function as the initial condition. The article also considers the accumulated effects of approximation errors generated in each iteration and proposes an error bounds condition to ensure the convergence of the approximated iterative results.
IEEE TRANSACTIONS ON CYBERNETICS
(2023)
Article
Automation & Control Systems
Mingming Liang, Yonghua Wang, Derong Liu
Summary: In this study, a novel general impulsive transition matrix is defined to reveal the transition dynamics and the evolution of the probability distribution between impulsive events. Based on this matrix, policy-iteration-based impulsive adaptive dynamic programming algorithms are developed to solve optimal impulsive control problems. The algorithms are shown to converge to the optimal impulsive performance index function and allow for optimization on computing devices with limited memory.
IEEE TRANSACTIONS ON CYBERNETICS
(2023)
Article
Automation & Control Systems
Min Lin, Yuanqing Xia, Zhongqi Sun, Li Dai
Summary: This paper proposes a novel learning-based model predictive control scheme that overcomes the challenge of manually designing terminal conditions and improves control performance. The scheme employs value iteration in reinforcement learning to autonomously design terminal cost while considering approximation errors. The paper provides theoretical analysis, including convergence, stability, and performance, and investigates the influence of prediction horizon and initial terminal cost on performance.
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL
(2023)
Article
Automation & Control Systems
Huaiyuan Jiang, Bin Zhou
Summary: In this paper, a bias-policy iteration method is proposed for solving the data-driven optimal control problem of unknown continuous-time linear systems. By adding a bias parameter, the proposed method relaxes the requirement of an initial admissible controller, and its effectiveness is verified through simulation examples.
Article
Computer Science, Artificial Intelligence
Huaiyuan Jiang, Bin Zhou, Guang-Ren Duan
Summary: In this article, the 1-policy iteration (1-PI) method for the optimal control problem of discrete-time linear systems is revisited and restated from a new perspective. A modified 1-PI algorithm is introduced based on new properties of the traditional 1-PI, and its convergence is proven. A data-driven implementation is constructed under a new matrix rank condition, and a simulation example verifies the effectiveness of the proposed method.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Automation & Control Systems
Mingming Ha, Ding Wang, Derong Liu
Summary: In this paper, a new approach is proposed to address the tracking control problem. By introducing a new cost function and a novel stability analysis method, the incomplete elimination of the tracking error that occurs in traditional approaches is avoided. A specific implementation scheme for the special case of linear systems is also provided.
IEEE-CAA JOURNAL OF AUTOMATICA SINICA
(2022)
Article
Automation & Control Systems
Qinglai Wei, Tianmin Zhou, Jingwei Lu, Yu Liu, Shuai Su, Jun Xiao
Summary: In this article, a new stochastic adaptive dynamic programming (ADP) method is developed to solve the optimal control problem of continuous-time (CT) time-invariant nonlinear systems with stochastic nonlinear disturbances. The method simultaneously approximates the value function and the control law under the conditional expectation. The asymptotic stability of the closed-loop stochastic system in probability is analyzed using the stochastic Lyapunov direct method, and the convergence of the developed ADP method is proven. Four simulations are conducted to demonstrate the effectiveness of the proposed method.
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS
(2023)