Article
Automation & Control Systems
Yongliang Yang, Hamidreza Modares, Kyriakos G. Vamvoudakis, Wei He, Cheng-Zhong Xu, Donald C. Wunsch
Summary: This article explores an iterative adaptive dynamic programming (ADP) algorithm within the Hamiltonian-driven framework for solving the continuous-time Hamilton-Jacobi-Bellman equation for nonlinear systems. It introduces a novel function, the min-Hamiltonian, and develops an iterative ADP algorithm that accounts for approximation errors during policy evaluation. The article also provides a sufficient condition on the iterative value gradient that guarantees closed-loop stability and convergence to the optimal value, as well as a model-free extension based on off-policy reinforcement learning.
IEEE TRANSACTIONS ON CYBERNETICS
(2022)
Article
Automation & Control Systems
Antonio Sala, Leopoldo Armesto
Summary: This study introduces a new criterion for adaptive meshing in polyhedral partitions to interpolate value functions, employing an initial-condition probability density function, uncertainty propagation, and the temporal-difference error to decide where new points are added. A collection of lemmas justifies the algorithmic proposal, and a comparative analysis highlights its advantages over other options in the literature. The developed methods are applied in simulation examples and an experimental robotic setup.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2022)
Article
Automation & Control Systems
Mingming Ha, Ding Wang, Derong Liu
Summary: This article investigates the stability of the closed-loop system using various control policies generated by value iteration. It develops an offline integrated value iteration scheme with a stability guarantee and an online adaptive dynamic programming algorithm based on the concept of attraction domain. The theoretical and numerical results confirm the effectiveness of the algorithms in maintaining system stability.
IEEE TRANSACTIONS ON CYBERNETICS
(2022)
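The value-iteration recursion whose closed-loop stability these results concern can be sketched for the special case of discrete-time LQR. This is a minimal illustration with made-up system matrices, not the article's algorithm: iterating the Riccati recursion from an arbitrary positive semidefinite initial value.

```python
import numpy as np

# Hypothetical system x_{k+1} = A x_k + B u_k with cost sum x'Qx + u'Ru
# (illustrative matrices only; a discretized double integrator).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

# Value iteration on the quadratic value V_j(x) = x' P_j x:
#   P_{j+1} = Q + A'P_j A - A'P_j B (R + B'P_j B)^{-1} B'P_j A,
# initialized here from P_0 = 0 (any positive semidefinite P_0 works
# under stabilizability/detectability).
P = np.zeros((2, 2))
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # greedy gain for P_j
    P = Q + A.T @ P @ (A - B @ K)                      # Riccati recursion

# Final greedy gain and the residual of the discrete algebraic Riccati
# equation, which should be near zero at convergence.
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
residual = Q + A.T @ P @ A - A.T @ P @ B @ K - P
print(np.max(np.abs(residual)))
```

The intermediate gains `K` need not be stabilizing early on, which is exactly the issue the article's stability analysis of value iteration addresses.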
Article
Management
Cristiano Cervellera
Summary: Approximate dynamic programming (ADP) is a standard tool for solving multistage dynamic optimization problems. This paper investigates the use of ensemble learning in the ADP context to approximate the value function. The ensemble of value function approximations improves accuracy and robustness compared to single models, and can also be used to select good state samples for training.
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
(2023)
Article
Computer Science, Information Systems
Zifeng Gong, Bing He, Chen Hu, Xiaobo Zhang, Weijie Kang
Summary: This paper presents a new scheme for the online solution of a networked multi-agent pursuit-evasion game based on an online adaptive dynamic programming method. Both relative distance and control energy are considered, and the policies that drive the agents to a Nash equilibrium are obtained via the minimax principle.
Article
Automation & Control Systems
Bo Pang, Zhong-Ping Jiang
Summary: This article studies the infinite-horizon adaptive optimal control of continuous-time linear periodic systems and proposes a novel value iteration-based off-policy adaptive dynamic programming algorithm for a general class of systems. The algorithm is proven to uniformly converge to optimal solutions in both model-based and model-free cases, without assuming knowledge of an initial stabilizing controller. Application to a triple inverted pendulum demonstrates the feasibility and effectiveness of the proposed method.
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
(2021)
Article
Automation & Control Systems
Chun Li, Jinliang Ding, Frank L. Lewis, Tianyou Chai
Summary: This paper introduces a novel formulation of the value function to address the optimal tracking problem for nonlinear discrete-time systems and applies it successfully in adaptive dynamic programming algorithms. The optimal control policy can be deduced from this value function, and the optimality of the resulting policy is demonstrated.
Article
Automation & Control Systems
Huaiyuan Jiang, Bin Zhou, Guang-Ren Duan
Summary: This article studies the general policy iteration (GPI) method for the optimal control of discrete-time linear systems. Existing results on the GPI method are recalled and several new properties are established. Based on these properties, a model-based modified GPI algorithm is proposed and its convergence is proven. In addition, a data-driven implementation of the proposed method is introduced that does not require knowledge of the system matrices. Compared with existing results, the proposed algorithm further relaxes the condition needed to initialize the GPI-based algorithm, and its effectiveness is verified through a simulation example.
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL
(2022)
Article
Automation & Control Systems
Geyang Xiao, Huaguang Zhang
Summary: This article focuses on the convergence property and error bounds analysis of value iteration adaptive dynamic programming for continuous-time nonlinear systems. It introduces a contraction assumption to describe the relationship between the total value function and the single integral step cost. The convergence property of value iteration is proved under an arbitrary positive semidefinite function as the initial condition. The article also considers the accumulated effects of approximation errors generated in each iteration and proposes an error bounds condition to ensure the convergence of the approximated iterative results.
IEEE TRANSACTIONS ON CYBERNETICS
(2023)
Article
Automation & Control Systems
Mingming Liang, Yonghua Wang, Derong Liu
Summary: In this study, a novel general impulsive transition matrix is defined to reveal the transition dynamics and the evolution of the probability distribution between impulsive events. Based on this matrix, policy-iteration-based impulsive adaptive dynamic programming algorithms are developed to solve optimal impulsive control problems. The algorithms are shown to converge to the optimal impulsive performance index function and allow for optimization on computing devices with limited memory.
IEEE TRANSACTIONS ON CYBERNETICS
(2023)
Article
Automation & Control Systems
Min Lin, Yuanqing Xia, Zhongqi Sun, Li Dai
Summary: This paper proposes a novel learning-based model predictive control scheme that overcomes the challenge of manually designing terminal conditions and improves control performance. The scheme employs value iteration in reinforcement learning to autonomously design terminal cost while considering approximation errors. The paper provides theoretical analysis, including convergence, stability, and performance, and investigates the influence of prediction horizon and initial terminal cost on performance.
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL
(2023)
Article
Automation & Control Systems
Huaiyuan Jiang, Bin Zhou
Summary: In this paper, a bias-policy iteration method is proposed for solving the data-driven optimal control problem of unknown continuous-time linear systems. By adding a bias parameter, the proposed method relaxes the requirement of an initial admissible controller, and its effectiveness is verified through simulation examples.
Article
Computer Science, Artificial Intelligence
Huaiyuan Jiang, Bin Zhou, Guang-Ren Duan
Summary: In this article, the 1-policy iteration (1-PI) method for the optimal control problem of discrete-time linear systems is revisited and restated from a new perspective. A modified 1-PI algorithm is introduced based on new properties of the traditional 1-PI, and its convergence is proven. A data-driven implementation is constructed under a new matrix rank condition, and a simulation example verifies the effectiveness of the proposed method.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Automation & Control Systems
Mingming Ha, Ding Wang, Derong Liu
Summary: In this paper, a new approach is proposed to address the tracking control problem. By introducing a new cost function and a novel stability analysis method, the incomplete elimination of the tracking error that occurs in traditional approaches is avoided. A specific implementation scheme for the special case of linear systems is also provided.
IEEE-CAA JOURNAL OF AUTOMATICA SINICA
(2022)
Article
Automation & Control Systems
Qinglai Wei, Tianmin Zhou, Jingwei Lu, Yu Liu, Shuai Su, Jun Xiao
Summary: In this article, a new stochastic adaptive dynamic programming (ADP) method is developed to solve the optimal control problem of continuous-time (CT) time-invariant nonlinear systems with stochastic nonlinear disturbances. The method simultaneously approximates the value function and the control law under the conditional expectation. The asymptotic stability of the closed-loop stochastic system in probability is analyzed using the stochastic Lyapunov direct method, and the convergence of the developed ADP method is proven. Four simulations are conducted to demonstrate the effectiveness of the proposed method.
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS
(2023)