Article
Automation & Control Systems
Xuewen Zhang, Hao Shen, Feng Li, Jing Wang
Summary: This article focuses on the non-zero-sum game problem in discrete-time Markov jump systems. It proposes a model-based algorithm and an off-policy reinforcement learning algorithm to obtain optimal control policies, the latter without relying on system dynamics information.
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL
(2023)
Article
Mathematics, Applied
Xilin Xin, Yidong Tu, Vladimir Stojanovic, Hai Wang, Kaibo Shi, Shuping He, Tianhong Pan
Summary: This paper proposes a novel online model-free integral reinforcement learning algorithm to solve multiplayer non-zero-sum games. By collecting and learning state and input information of the subsystems, and using online learning to compute the corresponding N-coupled algebraic Riccati equations, the policy iteration algorithm presented in this paper solves the coupled algebraic Riccati equations of multiplayer non-zero-sum games. The effectiveness and feasibility of the design method are verified through a simulation example involving three players.
APPLIED MATHEMATICS AND COMPUTATION
(2022)
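The policy iteration idea behind such algorithms can be illustrated on a simplified single-player scalar case; the N-player setting couples N copies of this recursion. The sketch below is a toy under assumed dynamics, not the paper's algorithm: it solves the scalar continuous-time algebraic Riccati equation 2aP - b²P²/r + q = 0 by Kleinman-style policy iteration, where each step solves a Lyapunov equation for the current stabilizing gain.

```python
# Kleinman-style policy iteration for a scalar continuous-time LQR:
# minimize the integral of (q*x^2 + r*u^2) dt subject to x' = a*x + b*u.
# A simplified single-player sketch of the recursion that, in the
# N-player case, yields N-coupled algebraic Riccati equations.
a, b, q, r = 1.0, 1.0, 1.0, 1.0

K = 2.0  # initial stabilizing gain: a - b*K < 0 must hold
for _ in range(20):
    # Policy evaluation: solve the scalar Lyapunov equation
    #   2*(a - b*K)*P + q + r*K**2 = 0   for P.
    P = (q + r * K**2) / (2.0 * (b * K - a))
    # Policy improvement: K = (1/r) * b * P.
    K = b * P / r

print(P)  # converges to the ARE solution 1 + sqrt(2) ≈ 2.4142
```

The iteration is a Newton method on the Riccati equation, so it converges quadratically from any stabilizing initial gain.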
Article
Automation & Control Systems
Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar
Summary: This article investigates the problem of two-player zero-sum games and proposes a technique of successive relaxation to compute the min-max value faster. A generalized minimax Q-learning algorithm is also derived for finding the optimal policy when the model information is unknown.
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
(2022)
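As a rough illustration of the minimax backup underlying such algorithms (not the authors' successive-relaxation scheme), the sketch below runs the Bellman minimax recursion on a single-state repeated matrix game whose payoff matrix has a pure-strategy saddle point, so max-min over pure actions equals the game value; the general case requires a matrix-game LP over mixed strategies at each state.

```python
# Minimax value backup for a single-state, discounted, repeated zero-sum
# matrix game (a toy stand-in for a full Markov game).  The payoff matrix
# below has a pure-strategy saddle point at (row 0, col 1) with value 1,
# so max-min over pure actions suffices here.
R = [[2.0, 1.0],
     [0.0, -1.0]]  # reward to the maximizer
gamma = 0.9

Q = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(200):
    # minimax value of the current Q matrix (pure strategies)
    v = max(min(row) for row in Q)
    # synchronous Bellman minimax backup: Q(a,b) = R(a,b) + gamma * v
    Q = [[R[a][b] + gamma * v for b in range(2)] for a in range(2)]

value = max(min(row) for row in Q)
print(value)  # ≈ 1 / (1 - gamma) = 10, the discounted saddle value
```

The model-free Q-learning version replaces the synchronous sweep with stochastic updates from sampled transitions, driven by a learning rate.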
Article
Automation & Control Systems
Lijing Zhai, Kyriakos G. Vamvoudakis
Summary: This article presents a data-based and private learning framework for detecting and mitigating replay attacks in cyber-physical systems. Optimal watermarking signals and a level of differential privacy are added to improve resilience against replay attacks. Using data-based techniques, the best defense strategy is learned, and a Neyman-Pearson detector is proposed to identify replay attacks. Simulation results demonstrate the effectiveness of the approach and compare the data-based technique with a model-based one.
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL
(2021)
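A toy version of watermark-based replay detection (an illustrative assumption, not the paper's detector design): the controller injects a known random watermark, so under normal operation the measurement residual correlates with the freshly injected watermark, while replayed data, recorded before the current watermark existed, does not. A threshold test on the empirical correlation then separates the two hypotheses, in the spirit of a Neyman-Pearson test on Gaussian residuals. The threshold `tau` below is assumed; a Neyman-Pearson design would derive it from a target false-alarm probability.

```python
import random

random.seed(0)
N = 2000      # detection window length
sigma = 1.0   # residual noise level
tau = 0.3     # detection threshold (assumed, not optimized)

watermark = [random.gauss(0.0, 1.0) for _ in range(N)]

# H0 (no replay): residual = noise + injected watermark
residual_ok = [random.gauss(0.0, sigma) + w for w in watermark]
# H1 (replay): attacker replays old data; the current watermark is absent
residual_replay = [random.gauss(0.0, sigma) for _ in range(N)]

def correlation_stat(residual, watermark):
    """Empirical correlation of the residual with the injected watermark."""
    return sum(r * w for r, w in zip(residual, watermark)) / len(residual)

t_ok = correlation_stat(residual_ok, watermark)
t_replay = correlation_stat(residual_replay, watermark)
print(t_ok > tau, t_replay > tau)  # normal data passes, replay is flagged
```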
Article
Engineering, Mechanical
Yu Huo, Ding Wang, Junfei Qiao, Menghua Li
Summary: This paper proposes a novel optimal control scheme based on adaptive critic technology to solve the multi-player zero-sum game problem for continuous-time nonlinear systems with control constraints and unknown dynamics. A neural-network-based identifier is used to reconstruct the unknown system dynamics, and a new nonquadratic function is developed to derive the Hamilton-Jacobi-Isaacs equation of the constrained game. An adaptive critic framework is then constructed to approximate the optimal cost function and estimate the set of optimal control strategies and the worst-case disturbance. Theoretical analysis using the Lyapunov stability theorem proves uniform ultimate boundedness of the system state and the critic network weight approximation error. A representative example is simulated to validate the efficacy of the proposed framework.
NONLINEAR DYNAMICS
(2023)
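The "nonquadratic function" used to encode control constraints in this literature is commonly the tanh-based cost U(u) = 2∫₀ᵘ λ·atanh(v/λ) dv, which grows without bound as |u| → λ and thus keeps the optimal policy inside the bound. The sketch below illustrates that standard construct (not necessarily the exact function in this paper) by checking its closed form against numerical integration.

```python
import math

def U_closed(u, lam):
    """Closed form of U(u) = 2 * integral_0^u lam * atanh(v/lam) dv,
    the tanh-based nonquadratic cost that blows up as |u| -> lam."""
    return 2.0 * (lam * u * math.atanh(u / lam)
                  + 0.5 * lam**2 * math.log(1.0 - (u / lam)**2))

def U_numeric(u, lam, n=100_000):
    """Midpoint-rule approximation of the same integral."""
    h = u / n
    return 2.0 * h * sum(lam * math.atanh((i + 0.5) * h / lam)
                         for i in range(n))

lam, u = 2.0, 1.5
print(U_closed(u, lam), U_numeric(u, lam))  # the two agree closely
```

For small u the cost behaves like u², recovering the usual quadratic penalty when the constraint is inactive.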
Article
Computer Science, Artificial Intelligence
Dawen Wu, Abdel Lisser
Summary: This paper tackles a stochastic two-player zero-sum Nash game problem by modeling it as a dynamical neural network (DNN), showing that the DNN method has advantages in converging to better optimal points and solving large-scale problems.
Article
Management
Steve Alpern, Thuy Bui, Thomas Lidbetter, Katerina Papadaki
Summary: This study focuses on a patrolling game played on a network, aiming to model the problem of protecting roads or pipelines from adversarial attacks. The results provide solutions to the game for different network structures and attack durations.
OPERATIONS RESEARCH
(2022)
Article
Computer Science, Artificial Intelligence
Yuanheng Zhu, Dongbin Zhao
Summary: This paper combines game theory, dynamic programming, and recent deep reinforcement learning techniques to learn the Nash equilibrium policy of two-player zero-sum Markov games online. By formulating the problem as a Bellman minimax equation and applying generalized policy iteration, the authors propose a learning algorithm that uses neural networks to approximate Q functions. The algorithm is proven to converge and is validated through experiments on several examples.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2022)
Article
Automation & Control Systems
Samir Aberkane, Vasile Dragan
Summary: In this paper, we address a linear quadratic mean-field game problem with a leader-follower structure. We show how to obtain a state-feedback representation of the strategies achieving an open-loop Stackelberg equilibrium using a Riccati-type approach. We also establish the necessary and sufficient conditions for the solvability of the coupled generalized Riccati equations involved.
Article
Computer Science, Information Systems
Jingwei Lu, Qinglai Wei, Ziyang Wang, Tianmin Zhou, Fei-Yue Wang
Summary: This paper introduces a novel event-triggered optimal control method for discrete-time multi-player non-zero-sum games. By combining an event-triggered algorithm with parallel control, asymptotic stability of the system is achieved and an upper bound on the sum of all players' actual performance indices can be determined in advance.
INFORMATION SCIENCES
(2022)
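The event-triggering mechanism common to this line of work can be sketched generically (this is an illustration under assumed scalar dynamics, not the paper's parallel-control scheme): the control input is held constant between events and refreshed only when the gap between the current state and the last transmitted state exceeds a threshold, trading a modest performance loss for far fewer control updates.

```python
# Event-triggered state feedback for a scalar system x[k+1] = a*x[k] + b*u[k].
# The control uses the last *transmitted* state x_hat and is refreshed only
# when the gap |x - x_hat| exceeds an absolute threshold eps.
a, b, K, eps = 1.1, 1.0, 0.8, 0.05

x, x_hat = 5.0, 5.0
events = 0
for k in range(60):
    if abs(x - x_hat) > eps:   # triggering condition
        x_hat = x              # transmit / update the controller
        events += 1
    u = -K * x_hat             # zero-order-hold control between events
    x = a * x + b * u

print(abs(x), events)  # state held near zero with fewer than 60 updates
```

With an absolute threshold the state settles into a small neighborhood of the origin; more sophisticated schemes use state-dependent thresholds and prove asymptotic rather than practical stability.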
Review
Computer Science, Information Systems
Joaquim Gabarro, Alan Stewart
Summary: The paper presents a survey of joint research work on uncertain systems with a focus on the behavior of large web applications under external attacks. Uncertain, multi-component systems can be modeled by orchestrations which call multiple web services and coordinate their responses. Uncertainty profiles are used to evaluate system behavior by providing a blurred snapshot of operating conditions.
COMPUTER SCIENCE REVIEW
(2021)
Article
Automation & Control Systems
Kaiqing Zhang, Sham M. Kakade, Tamer Basar, Lin F. Yang
Summary: This paper investigates the sample complexity of model-based reinforcement learning in multi-agent settings. Studying discounted two-player zero-sum Markov games, it establishes the sample complexity of model-based MARL for finding the Nash equilibrium value and ε-approximate Nash equilibrium policies, and compares the results with those of reward-aware algorithms.
JOURNAL OF MACHINE LEARNING RESEARCH
(2023)
Article
Computer Science, Artificial Intelligence
Shunchao Zhang, Bo Zhao, Derong Liu, Yongwei Zhang
Summary: This paper investigates an event-triggered control method based on adaptive dynamic programming for solving zero-sum game (ZSG) problems in unknown multi-player continuous-time nonlinear systems. A neural network (NN) observer is constructed to identify the system dynamics, the ZSG problem is solved using a critic NN, and a triggering scheme is developed to update the control and disturbance laws; the effectiveness of the proposed method is then proven.
Article
Automation & Control Systems
Dawen Wu, Abdel Lisser
Summary: In this paper, we propose a novel deep learning approach that combines neurodynamic optimization and deep neural networks to predict saddle points in stochastic two-player zero-sum games. We model the game as an ODE system using neurodynamic optimization and develop a neural network to approximate the solution, including the prediction of the saddle point. A specialized algorithm is introduced to enhance the accuracy of the saddle point prediction. Experimental results demonstrate that our model outperforms existing approaches in terms of convergence speed and accuracy of saddle point predictions.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2023)
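The neurodynamic idea of recasting a saddle-point problem as an ODE can be illustrated with a minimal gradient descent-ascent flow (a toy under an assumed objective, not the authors' DNN model): integrate x' = -∂f/∂x, y' = +∂f/∂y and let the trajectory settle at the saddle point.

```python
# Gradient descent-ascent flow for f(x, y) = (x - 1)^2 - (y - 2)^2,
# whose unique saddle point is (1, 2): descend in x, ascend in y.
# Forward-Euler integration of the ODE  x' = -df/dx,  y' = +df/dy.
def fx(x, y):  # df/dx
    return 2.0 * (x - 1.0)

def fy(x, y):  # df/dy
    return -2.0 * (y - 2.0)

x, y, h = 5.0, -3.0, 0.05
for _ in range(500):
    x, y = x - h * fx(x, y), y + h * fy(x, y)

print(x, y)  # → approaches the saddle point (1.0, 2.0)
```

For this strongly convex-concave objective plain Euler integration converges; bilinear couplings generally need smaller steps or modified dynamics to avoid cycling.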
Article
Engineering, Marine
Gaofeng Che
Summary: This work proposes a new tracking control scheme for underactuated autonomous underwater vehicles (UAUVs) subject to unknown disturbances. By constructing a tracking-error system and designing an online policy iteration algorithm, the proposed method achieves near-optimal control performance, improves the convergence speed of the tracking error, and guarantees the stability of the system.