4.7 Article

Multi-Agent Deep Reinforcement Learning for Urban Traffic Light Control in Vehicular Networks

Journal

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY
Volume 69, Issue 8, Pages 8243-8256

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TVT.2020.2997896

Keywords

Traffic light control; deep reinforcement learning; deep deterministic policy gradient algorithm; Markov decision process; vehicular network

Funding

  1. National Natural Science Foundation of China [61972448, 61872150, 61872049, 61902445]
  2. Guangdong Basic and Applied Basic Research Foundation [2020A1515011209]
  3. Fundamental Research Funds for the Central Universities of China [19lgpy222]
  4. Natural Science Foundation of Guangdong Province of China [2019A1515011798]

Because urban traffic conditions are diverse and complicated, applying reinforcement learning to reduce traffic congestion has become a hot and promising research topic. In particular, coordinating the traffic light controllers of multiple intersections is a key challenge for multi-agent reinforcement learning (MARL). Most existing MARL studies are based on traditional Q-learning, but the unstable environment leads to poor learning in complicated and dynamic traffic scenarios. In this paper, we propose a novel multi-agent recurrent deep deterministic policy gradient (MARDDPG) algorithm, based on the deep deterministic policy gradient (DDPG) algorithm, for traffic light control (TLC) in vehicular networks. Specifically, centralized learning in each critic network enables each agent to estimate the policies of the other agents during decision-making, so that the agents can coordinate with one another, alleviating the poor learning performance caused by environmental instability. Decentralized execution enables each agent to make decisions independently. We share parameters among the actor networks to speed up training and reduce the memory footprint. Adding an LSTM helps alleviate the environmental instability caused by partially observable states. We use surveillance cameras and vehicular networks to collect status information for each intersection. Unlike previous work, we consider not only vehicles but also pedestrians waiting to cross the intersection. Moreover, we set different priorities for buses and ordinary vehicles. Experimental results in a vehicular network show that our method runs stably in various scenarios and coordinates multiple intersections, significantly reducing both vehicle and pedestrian congestion.
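
The structure described in the abstract (one actor shared by all agents, decentralized execution on local observations, and a critic that sees joint information during training) can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the linear layers, the softmax output over signal phases, and the priority weights in the reward are all assumptions made for illustration, and the LSTM and the DDPG training loop are omitted.

```python
import math
import random


class SharedActor:
    """One set of actor weights reused by every intersection agent (assumed linear)."""

    def __init__(self, obs_dim, n_phases, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(obs_dim)]
                  for _ in range(n_phases)]

    def act(self, local_obs):
        # Decentralized execution: only this agent's own observation is used.
        scores = [sum(wi * oi for wi, oi in zip(row, local_obs)) for row in self.w]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        return [e / total for e in exps]


class CentralCritic:
    """Critic of one agent; during training it sees all observations and actions."""

    def __init__(self, joint_dim, seed=1):
        rng = random.Random(seed)
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(joint_dim)]

    def q_value(self, all_obs, all_actions):
        # Concatenate every agent's observation and action into one joint input.
        joint = [x for obs in all_obs for x in obs]
        joint += [a for act in all_actions for a in act]
        if len(joint) != len(self.w):
            raise ValueError("joint input size does not match critic weights")
        return sum(w * x for w, x in zip(self.w, joint))


def priority_weighted_reward(n_cars, n_buses, n_pedestrians,
                             w_car=1.0, w_bus=3.0, w_ped=0.5):
    # Hypothetical weights: waiting buses are penalized more than ordinary cars,
    # and waiting pedestrians also contribute to the penalty.
    return -(w_car * n_cars + w_bus * n_buses + w_ped * n_pedestrians)


# Two intersections, three observation features each, two signal phases.
actor = SharedActor(obs_dim=3, n_phases=2)
observations = [[1.0, 0.0, 2.0], [0.0, 4.0, 1.0]]
actions = [actor.act(obs) for obs in observations]  # same weights, local inputs

critic = CentralCritic(joint_dim=2 * 3 + 2 * 2)
q = critic.q_value(observations, actions)  # training-time joint evaluation
r = priority_weighted_reward(n_cars=2, n_buses=1, n_pedestrians=4)
```

Because both agents call the same `SharedActor` instance, the actor's parameter count does not grow with the number of intersections, which is the memory saving the abstract refers to. The critic's input grows with the number of agents, but it is needed only during training.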

Recommended

Article Automation & Control Systems

A Light-Weight Statistical Latency Measurement Platform at Scale

Xu Zhang, Geyong Min, Qilin Fan, Hao Yin, Dapeng Wu, Zhan Ma

Summary: This article introduces a lightweight statistical latency measurement platform called DMS, which predicts end-to-end latency between hosts by introducing a metric space and measuring latency between DNS servers. DMS achieves low measurement cost and good scalability by clustering hosts with DNS infrastructure.

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS (2022)

Article Biochemical Research Methods

Deep Learning in Drug Design: Protein-Ligand Binding Affinity Prediction

Mohammad A. Rezaei, Yanjun Li, Dapeng Wu, Xiaolin Li, Chenglong Li

Summary: This paper introduces a data-driven framework called DeepAtom for accurately predicting protein-ligand binding affinity. By utilizing a 3D Convolutional Neural Network (3D-CNN) architecture, DeepAtom can automatically extract atomic interaction patterns related to binding. Experiment results demonstrate that the DeepAtom approach outperforms other methods in baseline scoring and can potentially be adopted in computational drug development protocols.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2022)

Article Computer Science, Hardware & Architecture

An Adaptive Robustness Evolution Algorithm With Self-Competition and Its 3D Deployment for Internet of Things

Ning Chen, Tie Qiu, Zilong Lu, Dapeng Oliver Wu

Summary: The Internet of Things consists of numerous sensing nodes forming a large scale-free network. Optimizing network topology to increase resistance against malicious attacks is complex. Traditional genetic algorithms lack global search ability and may lead to premature convergence, slowing down population evolution. Therefore, an Adaptive Robustness Evolution Algorithm (AREA) with self-competition mechanism is proposed to address this issue.

IEEE-ACM TRANSACTIONS ON NETWORKING (2022)

Article Computer Science, Artificial Intelligence

Deep anomaly detection in packet payload

Jiaxin Liu, Xucheng Song, Yingjie Zhou, Xi Peng, Yanru Zhang, Pei Liu, Dapeng Wu, Ce Zhu

Summary: With the wide deployment of edge devices, it is important and challenging to detect packet payload anomalies for the safe and efficient operations of edge applications. However, existing approaches have limitations in detecting anomalies with long-term dependency relationships and rely on in-depth expert knowledge. To overcome these limitations, a deep learning-based framework is proposed, which consists of a block sequence construction method and a detection model based on LSTM, CNN, and Multi-head Self Attention. Experimental results show that the proposed model achieves a higher detection rate and a lower false positive rate compared to traditional and state-of-the-art methods.

NEUROCOMPUTING (2022)

Article Computer Science, Information Systems

Driving Maneuver Anomaly Detection Based on Deep Auto-Encoder and Geographical Partitioning

Miaomiao Liu, Kang Yang, Yanjie Fu, Dapeng Wu, Wan Du

Summary: This paper proposes GeoDMA, which utilizes GPS data from multiple vehicles to detect anomalous driving maneuvers. The approach includes designing an unsupervised deep auto-encoder to learn unique features from normal GPS data, developing a geographical partitioning algorithm to incorporate peer dependency of drivers, and training specific driving anomaly models for each sub-region. The experimental results show that GeoDMA outperforms baseline methods with up to 8.5% higher detection accuracy.

ACM TRANSACTIONS ON SENSOR NETWORKS (2023)

Article Computer Science, Artificial Intelligence

Dynamic convolutional capsule network for In-loop filtering in HEVC video codec

LiChao Su, Mengqing Cao, Yue Yu, Jian Chen, XiuZhi Yang, Dapeng Wu

Summary: This paper proposes an in-loop filtering algorithm based on a dynamic convolutional capsule network, which adapts well to local features, improves the efficiency of video coding, and shows outstanding performance in terms of time efficiency.

IET IMAGE PROCESSING (2023)

Article Engineering, Electrical & Electronic

Deep Reinforcement Learning Based Resource Allocation in Multi-UAV-Aided MEC Networks

Jingxuan Chen, Xianbin Cao, Peng Yang, Meng Xiao, Siqiao Ren, Zhongliang Zhao, Dapeng Oliver Wu

Summary: This paper addresses the resource allocation problem in a multi-UAV-aided uplink communication scenario, aiming to minimize the total system latency and energy consumption while satisfying constraints on transmit power and system latency caused by transmission and computation. The proposed UMAP algorithm optimizes UAV movement, MU association, and MU power control iteratively. Simulation results demonstrate that the UMAP algorithm effectively reduces system latency and energy consumption, and improves coverage rate compared to benchmark algorithms.

IEEE TRANSACTIONS ON COMMUNICATIONS (2023)

Article Computer Science, Information Systems

BLS-Location: A Wireless Fingerprint Localization Algorithm Based on Broad Learning

Xiaoqiang Zhu, Tie Qiu, Wenyu Qu, Xiaobo Zhou, Mohammed Atiquzzaman, Dapeng Oliver Wu

Summary: This paper presents a novel indoor wireless fingerprint localization algorithm based on a broad learning system, which utilizes channel state information to overcome the problems of data loss, noise interference, and time-consuming offline training. Experimental results show that the algorithm outperforms several machine learning algorithms and existing methods in terms of training time reduction and accuracy.

IEEE TRANSACTIONS ON MOBILE COMPUTING (2023)

Article Computer Science, Artificial Intelligence

Interactive reinforced feature selection with traverse strategy

Kunpeng Liu, Dongjie Wang, Wan Du, Dapeng Oliver Wu, Yanjie Fu

Summary: In this paper, a single-agent Monte Carlo-based reinforced feature selection method is proposed, along with two efficiency improvement strategies: early stopping strategy and reward-level interactive strategy. The proposed method aims to find the optimal feature subset for a given machine learning task by traversing the feature set and selecting features one by one. Additionally, the early stopping strategy and reward-level interactive strategy are introduced to enhance the training efficiency.

KNOWLEDGE AND INFORMATION SYSTEMS (2023)

Article Computer Science, Hardware & Architecture

Accurate Prediction of Required Virtual Resources via Deep Reinforcement Learning

Haojun Huang, Zhaoxi Li, Jialin Tian, Geyong Min, Wang Miao, Dapeng Oliver Wu

Summary: This paper proposes an approach for accurately predicting the required virtual resources using deep reinforcement learning. By leveraging the inherent features hidden in network traffic and consolidating high-dimensional resources into a standardized value, the approach minimizes prediction errors through DRL-based matrix factorization. Experimental results demonstrate the accurate prediction of required virtual resources.

IEEE-ACM TRANSACTIONS ON NETWORKING (2023)

Article Computer Science, Artificial Intelligence

Collaborative and Multilevel Feature Selection Network for Action Recognition

Zhenxing Zheng, Gaoyun An, Shan Cao, Dapeng Wu, Qiuqi Ruan

Summary: In this article, a novel collaborative and multilevel feature selection network (FSNet) is proposed for action recognition. FSNet can adaptively aggregate multilevel features into a new informative feature from both position and channel dimensions, enhancing the representational ability of existing networks.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Telecommunications

Energy-Delay Tradeoff for Dynamic Trajectory Planning in Priority-Oriented UAV-Aided IoT Networks

Hailin Cao, Wang Zhu, Zhengchuan Chen, Zhiwei Sun, Dapeng Oliver Wu

Summary: In this paper, we investigate priority-oriented UAV-aided time-sensitive data collection problems in an IoT network with movable SNs. We propose a novel autofocusing heuristic trajectory planning algorithm based on reinforcement learning (AHTP-RL) to minimize the energy consumed by a UAV and the average delay of different SNs by optimizing the trajectory of the UAV. Extensive simulation results demonstrate that the proposed AHTP-RL algorithm can achieve a superior balance between communication delay and energy consumption.

IEEE TRANSACTIONS ON GREEN COMMUNICATIONS AND NETWORKING (2023)

Article Telecommunications

Hybrid Deployment of Multi-Hierarchical RISs for Probabilistic LoS Communication in Dense Urban

Zhong Tian, Jing Wang, Zhengchuan Chen, Min Wang, Yunjian Jia, Dapeng O. Wu

Summary: This letter proposes a hybrid deployment scheme of terrestrial and aerial reconfigurable intelligent surfaces (RISs) to overcome line-of-sight (LoS) communication obstruction in dense urban areas. The objective is to maximize the minimum average rate of users in different blind zones by jointly optimizing transmit beamforming at the base station, the reflecting coefficients of RISs, and the height deployment of the aerial RIS. An effective iterative method is proposed to solve the non-convex problem. Numerical simulations demonstrate the effectiveness of joint optimization on the passive beamforming of multi-hierarchical RISs and the height deployment of the aerial RIS, showing clear performance improvement over benchmark schemes.

IEEE COMMUNICATIONS LETTERS (2023)

Article Computer Science, Information Systems

Communication-Efficient and Attack-Resistant Federated Edge Learning With Dataset Distillation

Yanlin Zhou, Xiyao Ma, Dapeng Wu, Xiaolin Li

Summary: Federated Edge Learning is important for the development of cloud computing, but current algorithms incur high communication costs. The proposed Distilled One-Shot Federated Learning method reduces this cost significantly while maintaining high performance.

IEEE TRANSACTIONS ON CLOUD COMPUTING (2023)

Article Computer Science, Artificial Intelligence

Unbalanced Incomplete Multi-View Clustering Via the Scheme of View Evolution: Weak Views are Meat; Strong Views Do Eat

Xiang Fang, Yuchong Hu, Pan Zhou, Dapeng Oliver Wu

Summary: The paper proposes a novel Unbalanced Incomplete Multi-view Clustering method (UIMC) based on view evolution, which effectively addresses the issue of unbalanced incompleteness among different views through weighted multi-view subspace clustering and low-rank representation design.

IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE (2022)
