Multi-Agent Deep Reinforcement Learning for Urban Traffic Light Control in Vehicular Networks

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY (2020)

Journal

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY

Volume 69, Issue 8, Pages 8243-8256

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TVT.2020.2997896

Keywords

Traffic light control; deep reinforcement learning; deep deterministic policy gradient algorithm; Markov decision process; vehicular network

Funding

National Natural Science Foundation of China [61972448, 61872150, 61872049, 61902445]
Guangdong Basic and Applied Basic Research Foundation [2020A1515011209]
Fundamental Research Funds for the Central Universities of China [19lgpy222]
Natural Science Foundation of Guangdong Province of China [2019A1515011798]

Abstract

Because urban traffic conditions are diverse and complicated, applying reinforcement learning to reduce traffic congestion has become a hot and promising research topic. In particular, coordinating the traffic light controllers of multiple intersections is a key challenge for multi-agent reinforcement learning (MARL). Most existing MARL studies are based on traditional Q-learning, but an unstable environment leads to poor learning in complicated and dynamic traffic scenarios. In this paper, we propose a novel multi-agent recurrent deep deterministic policy gradient (MARDDPG) algorithm, based on the deep deterministic policy gradient (DDPG) algorithm, for traffic light control (TLC) in vehicular networks. Specifically, centralized learning in each critic network enables each agent to estimate the policies of the other agents during decision-making, so that the agents can coordinate with one another, which alleviates the poor learning performance caused by environmental instability. Decentralized execution enables each agent to make decisions independently. We share parameters among the actor networks to speed up training and reduce the memory footprint. Adding an LSTM helps alleviate the environmental instability caused by partially observable states. We use surveillance cameras and vehicular networks to collect status information for each intersection. Unlike previous work, we consider not only vehicles but also pedestrians waiting to cross the intersection. Moreover, we set different priorities for buses and ordinary vehicles. Experimental results in a vehicular network show that our method runs stably in various scenarios and coordinates multiple intersections, significantly reducing both vehicle and pedestrian congestion.
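
The structure described in the abstract (one shared recurrent actor executed independently at each intersection, plus one centralized critic per agent that sees every agent's observation and action) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the class names, dimensions, and random weights are hypothetical, a plain tanh recurrent cell stands in for the LSTM, the scalar action in (0, 1) is an assumed encoding of a signal-timing decision, and no training step is shown.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SharedActor:
    """One set of actor weights reused by every intersection (parameter sharing)."""

    def __init__(self, obs_dim, hidden_dim, rng):
        self.w_in = make_matrix(hidden_dim, obs_dim, rng)
        self.w_rec = make_matrix(hidden_dim, hidden_dim, rng)
        self.w_out = make_matrix(1, hidden_dim, rng)

    def act(self, obs, hidden):
        # Assumption: a tanh recurrent cell stands in for the LSTM.
        pre = [a + b for a, b in zip(matvec(self.w_in, obs),
                                     matvec(self.w_rec, hidden))]
        new_hidden = [math.tanh(p) for p in pre]
        logit = matvec(self.w_out, new_hidden)[0]
        action = 1.0 / (1.0 + math.exp(-logit))  # deterministic, continuous
        return action, new_hidden


class CentralCritic:
    """Per-agent critic that scores the joint observations and actions."""

    def __init__(self, n_agents, obs_dim, rng):
        self.w = make_matrix(1, n_agents * (obs_dim + 1), rng)

    def q_value(self, all_obs, all_actions):
        joint = [x for obs in all_obs for x in obs] + list(all_actions)
        return matvec(self.w, joint)[0]


rng = random.Random(0)
N_AGENTS, OBS_DIM, HIDDEN_DIM = 3, 4, 8  # hypothetical sizes

actor = SharedActor(OBS_DIM, HIDDEN_DIM, rng)
critics = [CentralCritic(N_AGENTS, OBS_DIM, rng) for _ in range(N_AGENTS)]
hiddens = [[0.0] * HIDDEN_DIM for _ in range(N_AGENTS)]

# Each intersection only sees its own (made-up) local observation.
observations = [[rng.random() for _ in range(OBS_DIM)] for _ in range(N_AGENTS)]

# Decentralized execution: every agent acts from its local observation alone.
actions = []
for i in range(N_AGENTS):
    action, hiddens[i] = actor.act(observations[i], hiddens[i])
    actions.append(action)

# Centralized learning: each critic evaluates the joint state-action input.
q_values = [critic.q_value(observations, actions) for critic in critics]
```

In this sketch, only the critics need access to the other intersections' data, and critics are used during training only, so each controller could run at deployment time from its local observation and recurrent state.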
