Article

Online adaptive algorithm for optimal control with integral reinforcement learning

Journal

Publisher

WILEY
DOI: 10.1002/rnc.3018

Keywords

synchronous integral reinforcement learning; Hamilton-Jacobi-Bellman equation; persistence of excitation; approximated dynamic programming

Funding

  1. NSF [ECCS-1128050]
  2. ARO [W91NF-05-1-0314]
  3. AFOSR [FA9550-09-1-0278]

In this paper, we introduce an online algorithm that uses integral reinforcement knowledge for learning the continuous-time optimal control solution for nonlinear systems with infinite horizon costs and partial knowledge of the system dynamics. This algorithm is a data-based approach to the solution of the Hamilton-Jacobi-Bellman equation, and it does not require explicit knowledge of the system's drift dynamics. A novel adaptive control algorithm is given that is based on policy iteration and implemented using an actor/critic structure having two adaptive approximator structures. Both actor and critic approximation networks are adapted simultaneously. A persistence of excitation condition is required to guarantee convergence of the critic to the actual optimal value function. Novel adaptive control tuning algorithms are given for both critic and actor networks, with extra terms in the actor tuning law being required to guarantee closed-loop dynamical stability. The approximate convergence to the optimal controller is proven, and stability of the system is also guaranteed. Simulation examples support the theoretical result. Copyright (c) 2013 John Wiley & Sons, Ltd.
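
For orientation, the equations the abstract refers to can be sketched in the standard control-affine setting. The notation here is an assumption and is not taken from the paper: state $x$, control $u$, drift $f(x)$, input map $g(x)$, state penalty $Q(x)$, control weight $R$, and reinforcement interval $T$.

```latex
% Assumed setting (not from the source): control-affine dynamics and
% an infinite-horizon cost with a quadratic control penalty.
\begin{align}
  \dot{x} &= f(x) + g(x)\,u, \\
  V(x(t)) &= \int_{t}^{\infty} \bigl( Q(x(\tau)) + u(\tau)^{\top} R\, u(\tau) \bigr)\, d\tau .
\end{align}

% Hamilton-Jacobi-Bellman equation and the resulting optimal policy.
\begin{align}
  0 &= \min_{u} \Bigl[ Q(x) + u^{\top} R\, u
       + \nabla V^{*}(x)^{\top} \bigl( f(x) + g(x)\,u \bigr) \Bigr], \\
  u^{*}(x) &= -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla V^{*}(x).
\end{align}

% Integral reinforcement form of the Bellman equation over [t, t+T]:
% the drift f(x) does not appear explicitly.
\begin{equation}
  V(x(t)) = \int_{t}^{t+T} \bigl( Q(x(\tau)) + u(\tau)^{\top} R\, u(\tau) \bigr)\, d\tau
            + V(x(t+T)).
\end{equation}
```

In this sketch the last equation is evaluated from measured trajectory data, which is consistent with the abstract's statement that the drift dynamics need not be known, while $g(x)$ still enters the policy expression (the "partial knowledge" of the dynamics).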

Recommended

Article Mathematics, Applied

Structural inference of networked dynamical systems with universal differential equations

J. Koch, Z. Chen, A. Tuor, J. Drgona, D. Vrabie

Summary: This work aims to infer the intrinsic physics of a base unit, the underlying graphical structure between units, and the coupling physics of a networked dynamical system based on observed nodal states. These tasks are formulated using the Universal Differential Equation, approximating unknown systems with neural networks, known mathematical terms, or combinations of both. The value of these inference tasks is demonstrated through future state predictions and inference of system behavior on varied network topologies.
Article Automation & Control Systems

Concurrent Receding Horizon Control and Estimation Against Stealthy Attacks

Filippos Fotiadis, Kyriakos G. Vamvoudakis

Summary: This article examines a game-theoretic framework for cyber-physical systems, focusing on the interaction between a defender and an intelligent attacker. The defender aims to optimize a performance cost to enhance resilience against stealthy attacks, while the attacker seeks to disrupt the system's performance using its information advantage. Both players adopt receding horizon control principles to implement their policies, with the defender employing receding horizon estimation to overcome limited access to system state information. Theoretical analysis demonstrates that this concurrent policy ensures closed-loop boundedness, even in the presence of stealthy attacks and information disadvantage. Simulations provide further verification and clarification of these findings.

IEEE TRANSACTIONS ON AUTOMATIC CONTROL (2023)

Article Computer Science, Artificial Intelligence

Intermittent Learning Through Operant Conditioning for Cyber-Physical Systems

Prachi Pratyusha Sahoo, Aris Kanellopoulos, Kyriakos G. Vamvoudakis

Summary: This article presents a novel intermittent learning scheme based on Skinner's operant conditioning techniques, which approximates the optimal policy while decreasing information transfer. Traditional reinforcement learning schemes can lead to overutilization of limited resources and face the risk of malicious interference. Simulation results demonstrate the efficacy of the proposed approach.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Construction & Building Technology

System modeling for grid-interactive efficient building applications

Yunyang Ye, Cary A. Faulkner, Rong Xu, Sen Huang, Yuan Liu, Draguna L. Vrabie, Jian Zhang, Wangda Zuo

Summary: Despite existing research applications of system modeling in grid-interactive efficient buildings (GEBs), actual implementation is still at an early stage because there is no comprehensive summary of how to establish system modeling for GEBs. This paper conducts an extensive survey of over 300 relevant journal articles to bridge this gap. The survey identifies key requirements of system modeling for GEBs based on different applications and analyzes the assumptions, modeling approaches, and simulation of these applications. The study also provides insights on the use of system modeling in GEBs and recommends directions for future research.

JOURNAL OF BUILDING ENGINEERING (2023)

Article Construction & Building Technology

Simulation-based assessment of ASHRAE Guideline 36, considering energy performance, indoor air quality, and control stability

Cary A. Faulkner, Robert Lutes, Sen Huang, Wangda Zuo, Draguna L. Vrabie

Summary: This study evaluates American Society of Heating, Refrigerating and Air-Conditioning Engineers Guideline 36 (G36) using a typical medium office building. It considers the interaction between control sequences for water-side and air-side equipment and assesses the performance of G36 from the perspective of indoor air quality. It also examines the short-term behaviors of the building under G36 to understand its control stability.

BUILDING AND ENVIRONMENT (2023)

Article Computer Science, Artificial Intelligence

Recursive Reasoning With Reduced Complexity and Intermittency for Nonequilibrium Learning in Stochastic Games

Filippos Fotiadis, Kyriakos G. Vamvoudakis

Summary: This article proposes an efficient approach for decision-making in nonequilibrium stochastic games by constructing models of bounded rationality and developing corresponding algorithms. The approach is demonstrated to be effective through simulations.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Online and Robust Intermittent Motion Planning in Dynamic and Changing Environments

Zirui Xu, George P. Kontoudis, Kyriakos G. Vamvoudakis

Summary: We propose RRT-Q^X_∞, an online and intermittent kinodynamic motion planning framework for dynamic environments with unknown robot dynamics and unknown disturbances. The framework leverages RRT^X for global path planning and rapid replanning to generate waypoints as a sequence of boundary-value problems (BVPs). We introduce a robust intermittent Q-learning controller for waypoint navigation with completely unknown system dynamics, external disturbances, and intermittent control updates. We demonstrate the effectiveness of RRT-Q^X_∞ through Monte Carlo numerical experiments in various dynamic and changing environments.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Automation & Control Systems

Cooperative Finitely Excited Learning for Dynamical Games

Yongliang Yang, Hamidreza Modares, Kyriakos G. Vamvoudakis, Frank L. Lewis

Summary: In this article, a novel approach for enhancing the learning framework of zero-sum games with continuous time dynamics is proposed. This approach combines online recorded data with instantaneous data to improve efficiency, and replaces the classical persistent excitation condition with an easy-to-check cooperative excitation condition through experience replay and distributed interaction among agents. It guarantees the consensus of distributed actor-critic learning and ensures the stability of the equilibrium point and convergence to the Nash equilibrium. Simulation results demonstrate its effectiveness compared to previous methods.

IEEE TRANSACTIONS ON CYBERNETICS (2023)

Article Computer Science, Software Engineering

Codesign for Extreme Heterogeneity: Integrating Custom Hardware With Commodity Computing Technology to Support Next-Generation HPC Converged Workloads

James A. Ang, Kevin J. Barker, Draguna L. Vrabie, Gokcen Kestor

Summary: The future of high-performance computing (HPC) will be driven by the convergence of physical simulation, artificial intelligence, machine learning, and data science computing capabilities. Emerging technologies will integrate commodity components with custom accelerators, resulting in a diverse ecosystem of industry technology developers and researchers.

IEEE INTERNET COMPUTING (2023)

Article Automation & Control Systems

Online accelerated data-driven learning for optimal feedback control of discrete-time partially uncertain systems

Luke Somers, Wassim M. Haddad, Nick-Marios T. Kokolakis, Kyriakos G. Vamvoudakis

Summary: This paper presents an online learning algorithm for solving the Bellman equation in discrete-time nonlinear uncertain dynamical systems. Using an actor-critic structure based on higher-order tuner laws, our algorithm ensures accelerated learning in generating optimal control policies. The proposed online learning-based optimal control framework guarantees uniform ultimate boundedness of the closed-loop system under the assumption that the system is persistently excited. Two numerical examples are provided to demonstrate the efficacy of the proposed approach.

INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING (2023)

Proceedings Paper Automation & Control Systems

PI-like Estimator-based Adaptive Extremum Seeking Control using Initial Excitation

Tushar Garg, Sayan Basu Roy, Kyriakos G. Vamvoudakis

Summary: In this paper, a novel proportional integral (PI)-like estimator-based adaptive extremum seeking control (AdESC) algorithm is proposed for online optimization, achieving parameter convergence under a relaxed mathematical condition called initial excitation (IE). The proposed AdESC algorithm utilizes a new set of low-pass filter dynamics, eliminating the need for the switching mechanism used in past literature while still ensuring parameter convergence. A detailed Lyapunov analysis is carried out to establish closed-loop stability of the AdESC algorithm using a singular-perturbation-like principle.

2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC) (2022)

Proceedings Paper Engineering, Industrial

Zero-Sum Game (ZSG) based Integral Reinforcement Learning for Trajectory Tracking Control of Autonomous Smart Car

Seta Bogosyan, Metin Gokasan, Kyriakos G. Vamvoudakis

Summary: The aim of this research is to develop and implement continuous-time, online reinforcement learning (RL) schemes for trajectory tracking control of fully autonomous vehicles (AVs) in real-world scenarios. RL offers adaptive optimality and is model-free, which makes it more promising than model-based methods such as MPC in the face of vehicle-related uncertainties. Existing studies on RL-based AV control mainly focus on high-level trajectory tracking and lack practical implementations. The ultimate goal is to fill the theoretical and practical gaps by designing and practically evaluating novel RL strategies to improve the performance of trajectory tracking control against uncertainties at all levels.

2022 IEEE 31ST INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE) (2022)

Article Computer Science, Artificial Intelligence

Safety-Aware Pursuit-Evasion Games in Unknown Environments Using Gaussian Processes and Finite-Time Convergent Reinforcement Learning

Nikolaos-Marios T. Kokolakis, Kyriakos G. Vamvoudakis

Summary: This article presents a safe pursuit-evasion game that allows for finite-time capture, optimal performance, and adaptation to an unknown cluttered environment. By formulating the game as a zero-sum differential game, the pursuer aims to minimize its relative distance to the target while the evader tries to maximize it. A critic-only reinforcement learning algorithm is proposed for learning the pursuit-evasion policies online and in finite time, ensuring the finite-time capture of the evader. Safety is guaranteed through barrier functions integrated into the running cost, and Gaussian processes are used for safe learning of the unknown environment. Simulation results demonstrate the effectiveness of the approach.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2022)

Article Computer Science, Artificial Intelligence

Adaptive Neural Network Stochastic-Filter-Based Controller for Attitude Tracking With Disturbance Rejection

Hashim A. Hashim, Kyriakos G. Vamvoudakis

Summary: This article proposes a real-time neural network (NN) stochastic filter-based controller on the SO(3) Lie group as a novel approach to the attitude tracking problem. The introduced solution consists of a filter and a controller. An adaptive NN-based stochastic filter is proposed to estimate attitude components and dynamics, accounting for measurement uncertainties. A novel control law on SO(3) is presented to address unknown disturbances. The proposed approach offers robust tracking performance by supplying the required control signal given data extracted from low-cost inertial measurement units.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2022)

Article Automation & Control Systems

Neural-Adaptive Stochastic Attitude Filter on SO(3)

Hashim A. Hashim, Mohammed Abouheaf, Kyriakos G. Vamvoudakis

Summary: This letter proposes a novel stochastic non-linear neural-adaptive-based filter on SO(3) for attitude estimation. The filter is shown to produce good results when using measurements from low-cost sensing units and is guaranteed to be almost semi-globally uniformly ultimately bounded in the mean square. The effectiveness of the proposed filter is tested and evaluated in its discrete form under the conditions of large initialization error and high measurement uncertainties.

IEEE CONTROL SYSTEMS LETTERS (2022)
