Article

Online adaptive algorithm for optimal control with integral reinforcement learning

INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL (2014)

Journal

INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL

Volume 24, Issue 17, Pages 2686-2710

Publisher

WILEY

DOI: 10.1002/rnc.3018

Keywords

synchronous integral reinforcement learning; Hamilton-Jacobi-Bellman equation; persistence of excitation; approximated dynamic programming

Categories

Automation & Control Systems; Engineering, Electrical & Electronic; Mathematics, Applied

Funding

NSF [ECCS-1128050]
ARO [W91NF-05-1-0314]
AFOSR [FA9550-09-1-0278]

Abstract

In this paper, we introduce an online algorithm that uses integral reinforcement knowledge for learning the continuous-time optimal control solution for nonlinear systems with infinite horizon costs and partial knowledge of the system dynamics. This algorithm is a data-based approach to the solution of the Hamilton-Jacobi-Bellman equation, and it does not require explicit knowledge on the system's drift dynamics. A novel adaptive control algorithm is given that is based on policy iteration and implemented using an actor/critic structure having two adaptive approximator structures. Both actor and critic approximation networks are adapted simultaneously. A persistence of excitation condition is required to guarantee convergence of the critic to the actual optimal value function. Novel adaptive control tuning algorithms are given for both critic and actor networks, with extra terms in the actor tuning law being required to guarantee closed loop dynamical stability. The approximate convergence to the optimal controller is proven, and stability of the system is also guaranteed. Simulation examples support the theoretical result. Copyright (c) 2013 John Wiley & Sons, Ltd.
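The integral reinforcement idea in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the paper's algorithm: it uses sequential policy iteration rather than the synchronous actor/critic tuning laws, and it assumes a hypothetical scalar linear plant dx/dt = a*x + b*u with running cost q*x^2 + r*u^2, value function V(x) = p*x^2, and policy u = -k*x. All names and parameter values are illustrative. The learner uses only measured states, the measured integral of the running cost, and the input gain b; the drift coefficient a appears only inside the simulator that generates the data. Resetting the state at each iteration is a crude stand-in for the excitation condition.

```python
import math


def rollout(x0, k, a, b, q, r, T, steps=200):
    """Simulate the closed loop under u = -k*x for T seconds with RK4.

    Returns the final state and the integral of the running cost.
    The drift coefficient `a` is used only here, to generate data.
    """
    h = T / steps
    x, cost = x0, 0.0

    def dx(x):
        return (a - b * k) * x

    def dc(x):
        return (q + r * k * k) * x * x

    for _ in range(steps):
        s1, c1 = dx(x), dc(x)
        x2 = x + 0.5 * h * s1
        s2, c2 = dx(x2), dc(x2)
        x3 = x + 0.5 * h * s2
        s3, c3 = dx(x3), dc(x3)
        x4 = x + h * s3
        s4, c4 = dx(x4), dc(x4)
        x += h * (s1 + 2 * s2 + 2 * s3 + s4) / 6
        cost += h * (c1 + 2 * c2 + 2 * c3 + c4) / 6
    return x, cost


def irl_policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0, T=0.5, iters=8):
    """Integral RL policy iteration for the scalar example.

    k0 must stabilize the plant (a - b*k0 < 0).
    """
    k, p = k0, 0.0
    for _ in range(iters):
        x0 = 1.0  # reset the state so each data window is informative
        xT, cost = rollout(x0, k, a, b, q, r, T)
        # Critic: solve p*x0^2 - p*xT^2 = integral of running cost for p.
        p = cost / (x0 ** 2 - xT ** 2)
        # Actor: policy improvement needs only the input gain b.
        k = b * p / r
    return p, k
```

For the default values, the learned p can be compared against the closed-form solution of the scalar Riccati equation 2*a*p - (b^2/r)*p^2 + q = 0, which is p = 1 + sqrt(2).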

Recommended

Article Automation & Control Systems

OPTIMAL BOUNDS FOR NUMERICAL APPROXIMATIONS OF INFINITE HORIZON PROBLEMS BASED ON DYNAMIC PROGRAMMING APPROACH

Javier de Frutos, Julia Novo

Summary: This paper provides error bounds for fully discrete approximations of infinite horizon problems using the dynamic programming approach. The paper revises the error bound of the fully discrete method and proves that, under assumptions similar to the time discrete case, the error of the fully discrete case is O(h+k), giving first order accuracy in time and space for the method. This error bound matches numerical experiments in the literature where the behavior predicted by the O(k/h) bound has not been observed.

SIAM JOURNAL ON CONTROL AND OPTIMIZATION (2023)

Article Automation & Control Systems

Maximum Entropy Optimal Control of Continuous-Time Dynamical Systems

Jeongho Kim, Insoon Yang

Summary: Maximum entropy reinforcement learning methods have been successfully applied to a range of challenging sequential decision-making and control tasks. However, there is a need to extend these methods to continuous-time systems. This article studies the theory of maximum entropy optimal control in continuous time and derives a novel class of equations. The results demonstrate the performance of the maximum entropy method in continuous-time optimal control and reinforcement learning problems.

IEEE TRANSACTIONS ON AUTOMATIC CONTROL (2023)

Article Mathematics, Applied

HJB-RBF Based Approach for the Control of PDEs

Alessandro Alla, Hugo Oliveira, Gabriele Santin

Summary: In this study, a new method is proposed to solve infinite horizon optimal control problems using Shepard moving least squares approximation method and radial basis functions on scattered grids. A scattered mesh is generated through an optimization process and the shape parameter is selected to achieve problem localization and high-dimensional approximation. Error estimates for the value function are provided and numerical tests demonstrate the effectiveness of the proposed method.

JOURNAL OF SCIENTIFIC COMPUTING (2023)

Article Computer Science, Artificial Intelligence

Control of an AUV with completely unknown dynamics and multi-asymmetric input constraints via off-policy reinforcement learning

Mehdi Mohammadi, Mohammad Mehdi Arefi, Navid Vafamand, Okyay Kaynak

Summary: This paper proposes a novel model-free optimal controller for nonlinear AUVs, utilizing an integral reinforcement learning strategy to address the optimal control problem for completely unknown dynamics, and employs a neural network structure for modeling and control implementation.

NEURAL COMPUTING & APPLICATIONS (2022)

Article Automation & Control Systems

Optimal control of a growth/consumption model

Keyan Miao, Richard Vinter

Summary: This article discusses an optimal control problem in neo-classical macroeconomics, aiming to maximize expenditure on social programs by balancing investment for growth and consumption. A nonstandard verification technique is introduced and applied to handle the difficulties caused by fractional singularities, providing a detailed solution and analysis of the problem.

OPTIMAL CONTROL APPLICATIONS & METHODS (2021)

Article Mathematics, Applied

TENSOR DECOMPOSITION METHODS FOR HIGH-DIMENSIONAL HAMILTON-JACOBI-BELLMAN EQUATIONS

Sergey Dolgov, Dante Kalise, Karl K. Kunisch

Summary: This study presents a tensor decomposition approach for high-dimensional fully nonlinear Hamilton-Jacobi-Bellman equations, which partially circumvents the curse of dimensionality and allows for polynomial scaling with respect to the dimension. Convergence analysis is provided for the linear-quadratic case, and the effectiveness of the method is evaluated in the optimal feedback stabilization of nonlinear dynamics with a hundred variables in Allen-Cahn and Fokker-Planck equations.

SIAM JOURNAL ON SCIENTIFIC COMPUTING (2021)

Article Computer Science, Artificial Intelligence

A PDE-Based Method for Shape Registration

Esten Nicolai Woien, Markus Grasmair

Summary: This paper presents a new numerical method for solving nonconvex variational problems, including computing shape space distances and registering curves. The method has global convergence and improved efficiency.

SIAM JOURNAL ON IMAGING SCIENCES (2022)

Article Engineering, Electrical & Electronic

Dynamic Event-Triggered Reinforcement Learning-Based Consensus Tracking of Nonlinear Multi-Agent Systems

Bo Xu, Yuan-Xin Li, Zhongsheng Hou, Choon Ki Ahn

Summary: In this paper, a novel approach is proposed to address the event-triggered optimized consensus tracking control problem in a class of uncertain nonlinear multi-agent systems (MASs). An adaptive reinforcement learning algorithm based on the actor-critic architecture and the backstepping method is utilized to optimize control performance. The proposed optimized controller employs a novel event-triggered strategy to dynamically adjust sampling errors online and reduce communication resource usage and computational complexity.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS (2023)

Article Mathematics, Applied

Reinsurance Policy under Interest Force and Bankruptcy Prohibition

Yangmin Zhong, Huaping Huang

Summary: In this paper, we address the optimal reinsurance problem in mathematical finance by using stochastic control theory. By deriving the Hamilton-Jacobi-Bellman equation, we obtain an explicit solution for the value function and optimal policy. Through numerical examples, we identify key factors that influence the optimal reinsurance strategy.

AXIOMS (2023)

Article Mathematics, Applied

Optimal Consumption, Investment, and Housing Choice: A Dynamic Programming Approach

Qi Li, Seryoong Ahn

Summary: This study investigates a portfolio selection problem that involves an agent's realistic housing service choice, including the size of the house to live in and the decision between renting and purchasing a house. Using a dynamic programming approach, a closed-form solution is derived to obtain optimal strategies for consumption, investment, housing service, and purchasing time. Various numerical demonstrations are presented to visually illustrate the economic implications of the model by showcasing the impacts of parameters in the financial and housing markets as well as the agent's preference. This model is significant as it is a pioneering model for optimal time to purchase a house, which has not been extensively studied in existing mathematical portfolio optimization models.

AXIOMS (2022)

Article Automation & Control Systems

Continuous-Time Stochastic Policy Iteration of Adaptive Dynamic Programming

Qinglai Wei, Tianmin Zhou, Jingwei Lu, Yu Liu, Shuai Su, Jun Xiao

Summary: In this article, a new stochastic adaptive dynamic programming (ADP) method is developed to solve the optimal control problem of continuous-time (CT) time-invariant nonlinear systems with stochastic nonlinear disturbances. The method simultaneously approximates the value function and the control law under the conditional expectation. The asymptotic stability of the closed-loop stochastic system in probability is analyzed using the stochastic Lyapunov direct method, and the convergence of the developed ADP method is proven. Four simulations are conducted to demonstrate the effectiveness of the proposed method.

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS (2023)

Article Environmental Sciences

Transboundary pollution control in asymmetric countries: do assistant investments help?

Lu Xiao, Ya Chen, Chaojie Wang, Jun Wang

Summary: This paper discusses the cooperation between asymmetric countries in transboundary pollution control, with a focus on the impact of assistant investments provided by developed countries. The study finds that the provision of assistant investments can reduce common pollution stock and increase economic benefits for both countries by raising equilibrium emission strategies.

ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH (2022)

Article Mathematics, Applied

APPROXIMATING THE STATIONARY BELLMAN EQUATION BY HIERARCHICAL TENSOR PRODUCTS

Mathias Oster, Leon Sallandt, Reinhold Schneider

Summary: This article tackles infinite horizon optimal control problems by solving the associated stationary Bellman equation numerically. The nonlinearity and high dimensionality of the Bellman equation are addressed using the Policy Iteration algorithm and low rank approximation methods such as the Koopman operator and tensor product approximation. The proposed approach has been successfully applied to control nonlinear parabolic partial differential equations.

JOURNAL OF COMPUTATIONAL MATHEMATICS (2023)

Article Automation & Control Systems

Approximated multi-agent fitted Q iteration

Antoine Lesage-Landry, Duncan S. Callaway

Summary: We propose an efficient approximation method called Approximated Multi-Agent Fitted Q Iteration (AMAFQI) for multi-agent batch reinforcement learning. We provide a detailed derivation of our approach and demonstrate that it yields a greedy policy with respect to multiple approximations of the centralized Q-function. Compared to the commonly used Fitted Q Iteration (FQI) approach, AMAFQI requires a linear number of computations with respect to the number of agents, while FQI requires exponential computations. Numerical simulations show that AMAFQI achieves significant computation time reduction without sacrificing performance compared to FQI in multi-agent problems.

SYSTEMS & CONTROL LETTERS (2023)

Article Mathematics, Applied

Optimal polynomial feedback laws for finite horizon control problems

Karl Kunisch, Donato Vasquez-Varas

Summary: This article analyzes a learning technique for finite horizon optimal control problems and its approximation based on polynomials, and illustrates the practicality and efficiency of the method.

COMPUTERS & MATHEMATICS WITH APPLICATIONS (2023)

Article Mathematics, Applied

Structural inference of networked dynamical systems with universal differential equations

J. Koch, Z. Chen, A. Tuor, J. Drgona, D. Vrabie

Summary: This work aims to infer the intrinsic physics of a base unit, the underlying graphical structure between units, and the coupling physics of a networked dynamical system based on observed nodal states. These tasks are formulated using the Universal Differential Equation, approximating unknown systems with neural networks, known mathematical terms, or combinations of both. The value of these inference tasks is demonstrated through future state predictions and inference of system behavior on varied network topologies.

CHAOS (2023)

Article Automation & Control Systems

Concurrent Receding Horizon Control and Estimation Against Stealthy Attacks

Filippos Fotiadis, Kyriakos G. Vamvoudakis

Summary: This article examines a game-theoretic framework for cyber-physical systems, focusing on the interaction between a defender and an intelligent attacker. The defender aims to optimize a performance cost to enhance resilience against stealthy attacks, while the attacker seeks to disrupt the system's performance using its information advantage. Both players adopt receding horizon control principles to implement their policies, with the defender employing receding horizon estimation to overcome limited access to system state information. Theoretical analysis demonstrates that this concurrent policy ensures closed-loop boundedness, even in the presence of stealthy attacks and information disadvantage. Simulations provide further verification and clarification of these findings.

IEEE TRANSACTIONS ON AUTOMATIC CONTROL (2023)

Article Computer Science, Artificial Intelligence

Intermittent Learning Through Operant Conditioning for Cyber-Physical Systems

Prachi Pratyusha Sahoo, Aris Kanellopoulos, Kyriakos G. Vamvoudakis

Summary: This article presents a novel intermittent learning scheme based on Skinner's operant conditioning techniques, which approximates the optimal policy while decreasing information transfer. Traditional reinforcement learning schemes can lead to overutilization of limited resources and face the risk of malicious interference. Simulation results demonstrate the efficacy of the proposed approach.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Construction & Building Technology

System modeling for grid-interactive efficient building applications

Yunyang Ye, Cary A. Faulkner, Rong Xu, Sen Huang, Yuan Liu, Draguna L. Vrabie, Jian Zhang, Wangda Zuo

Summary: Despite the existing research applications of system modeling in GEBs, the actual implementation is still at an early stage due to the lack of a comprehensive summary on how to establish system modeling for GEBs. This paper conducts an extensive survey of over 300 relevant journal articles to bridge this gap. The survey identifies key requirements of system modeling for GEBs based on different applications and analyzes the assumptions, modeling approaches, and simulation of these applications. The study also provides insights on the use of system modeling in GEBs and recommends directions for future research.

JOURNAL OF BUILDING ENGINEERING (2023)

Article Construction & Building Technology

Simulation-based assessment of ASHRAE Guideline 36, considering energy performance, indoor air quality, and control stability

Cary A. Faulkner, Robert Lutes, Sen Huang, Wangda Zuo, Draguna L. Vrabie

Summary: This study evaluates American Society of Heating, Refrigerating and Air-Conditioning Engineers Guideline 36 (G36) using a typical medium office building. It considers the interaction between control sequences for water-side and air-side equipment and assesses the performance of G36 from the perspective of indoor air quality. It also examines the short-term behaviors of the building under G36 to understand its control stability.

BUILDING AND ENVIRONMENT (2023)

Article Computer Science, Artificial Intelligence

Recursive Reasoning With Reduced Complexity and Intermittency for Nonequilibrium Learning in Stochastic Games

Filippos Fotiadis, Kyriakos G. Vamvoudakis

Summary: This article proposes an efficient approach for decision-making in nonequilibrium stochastic games by constructing models of bounded rationality and developing corresponding algorithms. The approach is demonstrated to be effective through simulations.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Online and Robust Intermittent Motion Planning in Dynamic and Changing Environments

Zirui Xu, George P. Kontoudis, Kyriakos G. Vamvoudakis

Summary: We propose RRT-Q^X_∞, an online and intermittent kinodynamic motion planning framework for dynamic environments with unknown robot dynamics and unknown disturbances. The framework leverages RRT^X for global path planning and rapid replanning to generate waypoints as a sequence of boundary-value problems (BVPs). We introduce a robust intermittent Q-learning controller for waypoint navigation with completely unknown system dynamics, external disturbances, and intermittent control updates. We demonstrate the effectiveness of RRT-Q^X_∞ through Monte Carlo numerical experiments in various dynamic and changing environments.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Automation & Control Systems

Cooperative Finitely Excited Learning for Dynamical Games

Yongliang Yang, Hamidreza Modares, Kyriakos G. Vamvoudakis, Frank L. Lewis

Summary: In this article, a novel approach for enhancing the learning framework of zero-sum games with continuous time dynamics is proposed. This approach combines online recorded data with instantaneous data to improve efficiency, and replaces the classical persistent excitation condition with an easy-to-check cooperative excitation condition through experience replay and distributed interaction among agents. It guarantees the consensus of distributed actor-critic learning and ensures the stability of the equilibrium point and convergence to the Nash equilibrium. Simulation results demonstrate its effectiveness compared to previous methods.

IEEE TRANSACTIONS ON CYBERNETICS (2023)

Article Computer Science, Software Engineering

Codesign for Extreme Heterogeneity: Integrating Custom Hardware With Commodity Computing Technology to Support Next-Generation HPC Converged Workloads

James A. Ang, Kevin J. Barker, Draguna L. Vrabie, Gokcen Kestor

Summary: The future of high-performance computing (HPC) will be driven by the convergence of physical simulation, artificial intelligence, machine learning, and data science computing capabilities. Emerging technologies will integrate commodity components with custom accelerators, resulting in a diverse ecosystem of industry technology developers and researchers.

IEEE INTERNET COMPUTING (2023)

Article Automation & Control Systems

Online accelerated data-driven learning for optimal feedback control of discrete-time partially uncertain systems

Luke Somers, Wassim M. Haddad, Nick-Marios T. Kokolakis, Kyriakos G. Vamvoudakis

Summary: This paper presents an online learning algorithm for solving the Bellman equation in discrete-time nonlinear uncertain dynamical systems. Using an actor-critic structure based on higher-order tuner laws, our algorithm ensures accelerated learning in generating optimal control policies. The proposed online learning-based optimal control framework guarantees uniform ultimate boundedness of the closed-loop system under the assumption that the system is persistently excited. Two numerical examples are provided to demonstrate the efficacy of the proposed approach.

INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING (2023)

Proceedings Paper Automation & Control Systems

PI-like Estimator-based Adaptive Extremum Seeking Control using Initial Excitation

Tushar Garg, Sayan Basu Roy, Kyriakos G. Vamvoudakis

Summary: In this paper, a novel proportional integral (PI)-like estimator-based adaptive extremum seeking control (AdESC) algorithm is proposed for online optimization, achieving parameter convergence under a relaxed mathematical condition called initial excitation (IE). The proposed AdESC algorithm utilizes a new set of low-pass filter dynamics, removing the need for the switching mechanism used in past literature while still ensuring parameter convergence. A detailed Lyapunov analysis is carried out to establish closed-loop stability of the AdESC algorithm using a singular-perturbation-like principle.

2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC) (2022)

Proceedings Paper Engineering, Industrial

Zero-Sum Game (ZSG) based Integral Reinforcement Learning for Trajectory Tracking Control of Autonomous Smart Car

Seta Bogosyan, Metin Gokasan, Kyriakos G. Vamvoudakis

Summary: The aim of this research is to develop and implement continuous-time, online reinforcement learning (RL) schemes for trajectory tracking control of fully autonomous vehicles (AVs) in real-world scenarios. RL offers adaptive optimality and is model-free, which makes it more promising than model-based methods such as MPC in the face of vehicle-related uncertainties. Existing studies on RL-based AV control mainly focus on high-level trajectory tracking and lack practical implementations. The ultimate goal is to fill these theoretical and practical gaps by designing and practically evaluating novel RL strategies that improve trajectory tracking performance against uncertainties at all levels.

2022 IEEE 31ST INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE) (2022)

Article Computer Science, Artificial Intelligence

Safety-Aware Pursuit-Evasion Games in Unknown Environments Using Gaussian Processes and Finite-Time Convergent Reinforcement Learning

Nikolaos-Marios T. Kokolakis, Kyriakos G. Vamvoudakis

Summary: This article presents a safe pursuit-evasion game that allows for finite-time capture, optimal performance, and adaptation to an unknown cluttered environment. By formulating the game as a zero-sum differential game, the pursuer aims to minimize its relative distance to the target while the evader tries to maximize it. A critic-only reinforcement learning algorithm is proposed for learning the pursuit-evasion policies online and in finite time, ensuring the finite-time capture of the evader. Safety is guaranteed through barrier functions integrated into the running cost, and Gaussian processes are used for safe learning of the unknown environment. Simulation results demonstrate the effectiveness of the approach.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2022)

Article Computer Science, Artificial Intelligence

Adaptive Neural Network Stochastic-Filter-Based Controller for Attitude Tracking With Disturbance Rejection

Hashim A. Hashim, Kyriakos G. Vamvoudakis

Summary: This article proposes a real-time neural network stochastic filter-based controller on SO(3) Lie group as a novel approach to the attitude tracking problem. The introduced solution consists of a filter and a controller. An adaptive NN-based stochastic filter is proposed to estimate attitude components and dynamics, accounting for measurement uncertainties. A novel control law on SO(3) is presented to address unknown disturbances. The proposed approach offers robust tracking performance by supplying the required control signal given data extracted from low-cost inertial measurement units.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2022)

Article Automation & Control Systems

Neural-Adaptive Stochastic Attitude Filter on SO(3)

Hashim A. Hashim, Mohammed Abouheaf, Kyriakos G. Vamvoudakis

Summary: This letter proposes a novel stochastic non-linear neural-adaptive-based filter on SO(3) for attitude estimation. The filter is shown to produce good results when using measurements from low-cost sensing units and is guaranteed to be almost semi-globally uniformly ultimately bounded in the mean square. The effectiveness of the proposed filter is tested and evaluated in its discrete form under the conditions of large initialization error and high measurement uncertainties.

IEEE CONTROL SYSTEMS LETTERS (2022)
