Article, Proceedings Paper

Distributed Wildfire Surveillance with Autonomous Aircraft Using Deep Reinforcement Learning

JOURNAL OF GUIDANCE CONTROL AND DYNAMICS (2019)

Volume 42, Issue 8, Pages 1768-1778

Publisher

AMER INST AERONAUTICS ASTRONAUTICS

DOI: 10.2514/1.G004106

Categories

Engineering, Aerospace; Instruments & Instrumentation

Funding

National Science Foundation Graduate Research Fellowship [DGE1656518]

Abstract

Teams of autonomous unmanned aircraft can be used to monitor wildfires, enabling firefighters to make informed decisions. However, controlling multiple autonomous fixed-wing aircraft to maximize forest fire coverage is a complex problem. The state space is high dimensional, the fire propagates stochastically, the sensor information is imperfect, and the aircraft must coordinate with each other to accomplish their mission. This work presents two deep reinforcement learning approaches for training decentralized controllers that accommodate the high dimensionality and uncertainty inherent in the problem. The first approach controls the aircraft using immediate observations of the individual aircraft. The second approach allows aircraft to collaborate on a map of the wildfire's state and maintain a time history of locations visited, which are used as inputs to the controller. Simulation results show that both approaches allow the aircraft to accurately track wildfire expansions and outperform an online receding-horizon controller. Additional simulations demonstrate that the approach scales with different numbers of aircraft and generalizes to different wildfire shapes.
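The second approach in the abstract, a shared wildfire map plus a time history of visited locations, can be illustrated with a minimal sketch. This is not the authors' implementation: the grid representation, the square sensor footprint, the "most recent observation wins" merge rule, and all names (`observe`, `merge`, `time_since_visit`, `NEVER`) are assumptions made purely for illustration.

```python
from typing import List, Tuple

Grid = List[List[int]]
NEVER = -1  # hypothetical marker: cell has never been observed


def make_grid(n: int, fill: int) -> Grid:
    """Create an n-by-n grid filled with a constant value."""
    return [[fill] * n for _ in range(n)]


def observe(truth: Grid, belief: Grid, last_seen: Grid,
            pos: Tuple[int, int], radius: int, t: int) -> None:
    """Copy the true burn state inside a square sensor footprint
    into the belief map and stamp those cells with time t."""
    n = len(truth)
    r0, c0 = pos
    for r in range(max(0, r0 - radius), min(n, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(n, c0 + radius + 1)):
            belief[r][c] = truth[r][c]
            last_seen[r][c] = t


def merge(belief_a: Grid, seen_a: Grid, belief_b: Grid, seen_b: Grid) -> None:
    """Update aircraft A's maps in place with every cell that
    aircraft B observed more recently (assumed merge rule)."""
    for r in range(len(belief_a)):
        for c in range(len(belief_a[r])):
            if seen_b[r][c] > seen_a[r][c]:
                belief_a[r][c] = belief_b[r][c]
                seen_a[r][c] = seen_b[r][c]


def time_since_visit(last_seen: Grid, t: int, cap: int) -> Grid:
    """Age of each observation at time t, capped at `cap`;
    never-observed cells are assigned the cap."""
    return [[cap if s == NEVER else min(cap, t - s) for s in row]
            for row in last_seen]


# Two aircraft observe different parts of a 5x5 fire map, then A merges B's data.
truth = make_grid(5, 0)
truth[1][1] = 1
truth[3][3] = 1
belief_a, seen_a = make_grid(5, 0), make_grid(5, NEVER)
belief_b, seen_b = make_grid(5, 0), make_grid(5, NEVER)
observe(truth, belief_a, seen_a, pos=(1, 1), radius=1, t=0)
observe(truth, belief_b, seen_b, pos=(3, 3), radius=1, t=1)
merge(belief_a, seen_a, belief_b, seen_b)
ages = time_since_visit(seen_a, t=2, cap=10)
```

In a setup like this, the belief map and the capped age map would be the two image-like inputs handed to a learned controller; how the paper actually encodes and shares them is not stated in the abstract.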

Recommended

Article Engineering, Electrical & Electronic

Vision-Based Autonomous Driving: A Hierarchical Reinforcement Learning Approach

Jiao Wang, Haoyi Sun, Can Zhu

Summary: This paper proposes an elaborate modular pipeline for autonomous driving that effectively integrates semantic perception information, multi-level decision tasks, and control modules. The proposed framework exhibits smooth and effective driving strategies in different environments and improves learning efficiency.

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY (2023)

Article Robotics

High-Speed Autonomous Racing Using Trajectory-Aided Deep Reinforcement Learning

Benjamin David Evans, Herman Arnold Engelbrecht, Hendrik Willem Jordaan

Summary: The classical method of autonomous racing relies on real-time localization and a precalculated optimal trajectory, while end-to-end deep reinforcement learning (DRL) can train agents to race using only raw LiDAR scans. This study introduces trajectory-aided learning (TAL), which incorporates the optimal trajectory into the DRL training process to enable high-performance racing. The evaluation results show that TAL achieves significantly higher lap completion rates at high speeds compared to the baseline, by training the agent to select feasible speed profiles and track the optimal trajectory.

IEEE ROBOTICS AND AUTOMATION LETTERS (2023)

Article Engineering, Aerospace

A Policy-Reuse Algorithm Based on Destination Position Prediction for Aircraft Guidance Using Deep Reinforcement Learning

Zhuang Wang, Yi Ai, Qinghai Zuo, Shaowu Zhou, Hui Li

Summary: This article proposes a policy-reuse algorithm based on destination position prediction to improve the training efficiency of aircraft guidance agents. By optimizing the reward function and transforming the problem into a fixed-position destination scenario, the method significantly improves training effectiveness and demonstrates stable performance in different tasks.

AEROSPACE (2022)

Article Computer Science, Interdisciplinary Applications

Optimization of autonomous vehicle speed control mechanisms using hybrid DDPG-SHAP-DRL-stochastic algorithm

C. V. S. R. Syavasya, A. Lakshmi Muddana

Summary: Autonomous Vehicles (AV) are the future milestones of the automobile industry, functioning without human intervention. This research introduces a novel hybrid algorithm to address the challenges associated with autonomous driving in complex scenarios, utilizing a simulative environment for training and validation.

ADVANCES IN ENGINEERING SOFTWARE (2022)

Article Computer Science, Artificial Intelligence

Enhanced decision making in multi-scenarios for autonomous vehicles using alternative bidirectional Q network

Mohamed Saber Rais, Khouloud Zouaidia, Rachid Boudour

Summary: To enhance decision making in autonomous vehicles, learning approaches, particularly reinforcement learning, have been adopted. However, there are limitations in the current algorithms, such as convergence rate, stability, handling multiple environments, and algorithm complexity. To address these issues, a novel extension of deep Q network called alternative bidirectional Q network is proposed. It aims to improve stability, performance, exploration, and Q values update policies in decision making. The proposed extension outperforms benchmark models in various scenarios, as evaluated through metrics such as loss, accuracy, speed, and reward values. The experiment results confirm the superiority of this extension in terms of complexity and robustness.

NEURAL COMPUTING & APPLICATIONS (2022)

Article Engineering, Civil

Modeling the Effects of Autonomous Vehicles on Human Driver Car-Following Behaviors Using Inverse Reinforcement Learning

Xiao Wen, Sisi Jian, Dengbo He

Summary: The development of autonomous driving technology has led to the coexistence of human-driven vehicles (HVs) and autonomous vehicles (AVs) on the road. Understanding the interactions between AVs and HVs is crucial for traffic safety and efficiency. This study realistically models the dynamics and interactions between HVs following AVs, using data from the Waymo Open Dataset. The results show significant differences in HV behavior and preferences when following AVs compared to when following HVs, and the proposed model outperforms conventional and data-driven car-following models in trajectory predictions.

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS (2023)

Article Computer Science, Software Engineering

LiPTool: A tool for learning-based autonomous index placement in databases

Xiaoyue Feng, Dashan Wei, Tianzhe Jiao, Chaopeng Guo, Dongqi Wang, Jie Song

Summary: Distributed tree-based indexes are widely used for processing large-scale data queries. However, current research on index placement does not consider the distance between the index position and the queried data (locality), which significantly affects the efficiency of index placement. This paper proposes a novel method that uses deep reinforcement learning (DRL) to solve this problem and adopts query performance as the evaluation feedback in DRL. The proposed method, called LiPTool, maximizes the average query performance by using DRL to find the optimal response server for each query and builds the index accordingly. Experimental results demonstrate that LiPTool doubles query performance compared to other methods. We believe that LiPTool is a promising solution for dynamically adjusting index placement in autonomous database management.

SOFTWAREX (2023)

Article Automation & Control Systems

An autonomous decision-making framework for gait recognition systems against adversarial attack using reinforcement learning

Muazzam Maqsood, Sadaf Yasmin, Saira Gillani, Farhan Aadil, Irfan Mehmood, Seungmin Rho, Sang-Soo Yeo

Summary: Gait identification based on DL techniques has become a biometric technology for surveillance. We conducted a patch-based black-box adversarial attack with RL to expose the vulnerabilities and decision-making abilities of DL models in gait-based autonomous surveillance systems. The attack achieved encouraging results with a maximum success rate of 77.59%. It is important for researchers to explore system resilience scenarios before deploying these models in surveillance applications.

ISA TRANSACTIONS (2023)

Article Engineering, Electrical & Electronic

Exploiting Multi-Modal Fusion for Urban Autonomous Driving Using Latent Deep Reinforcement Learning

Yasser H. Khalil, Hussein T. Mouftah

Summary: Researchers propose enhancing urban autonomous driving using multi-modal fusion with latent deep reinforcement learning. The method extracts and fuses images from multiple sensors to predict vehicle perception and motion, and then trains a driving policy using latent deep reinforcement learning to ensure safety, efficiency, and comfort. Experimental results show that the proposed method outperforms other existing models.

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY (2023)

Article Computer Science, Artificial Intelligence

Censored deep reinforcement patrolling with information criterion for monitoring large water resources using Autonomous Surface Vehicles

Samuel Yanes Luis, Daniel Gutierrez-Reina, Sergio Toral Marin

Summary: Monitoring and patrolling large water resources is a challenge for nature conservation. Autonomous Surface Vehicles equipped with water quality sensor modules can serve as an early-warning system for contamination peak detection, algae bloom monitoring, or oil-spill scenarios. This study proposes a framework that uses a censoring operator and noisy networks for collision-free policies and informative path planning.

APPLIED SOFT COMPUTING (2023)

Article Computer Science, Artificial Intelligence

A multi-robot path-planning algorithm for autonomous navigation using meta-reinforcement learning based on transfer learning

Shuhuan Wen, Zeteng Wen, Di Zhang, Hong Zhang, Tao Wang

Summary: The paper introduces the dynamic-PMPO-CMA algorithm to improve the adaptability of multi-robot systems in complex environments. The algorithm integrates meta-learning with dynamic-PPO-CMA to train robots to learn a multi-task policy, successfully achieving obstacle avoidance and fast arrival at the destination.

APPLIED SOFT COMPUTING (2021)

Article Computer Science, Information Systems

A Hierarchical Learning Approach to Autonomous Driving Using Rule Specifications

Kyunghoon Cho

Summary: This study tackles the challenging problems of understanding the movement of surrounding objects and controlling robot platforms in a safe way, such as in the case of autonomous vehicles. By combining sequence prediction and deep reinforcement learning in a hierarchical manner, the proposed method shows better efficiency and performance compared to existing learning-based control algorithms.

IEEE ACCESS (2022)

Article Engineering, Civil

Physics Informed Deep Reinforcement Learning for Aircraft Conflict Resolution

Peng Zhao, Yongming Liu

Summary: A novel method based on physics-informed deep reinforcement learning is proposed for aircraft conflict resolution in air traffic management, integrating prior physics understanding for optimal policy searching and human-explainable results. By using a solution space diagram and convolutional neural networks, this approach demonstrates faster convergence and better conflict resolution policy learning.

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS (2022)

Article Engineering, Marine

Generative adversarial interactive imitation learning for path following of autonomous underwater vehicle

Dong Jiang, Jie Huang, Zheng Fang, Chunxi Cheng, Qixin Sha, Bo He, Guangliang Li

Summary: This paper explores the application of the generative adversarial imitation learning (GAIL) algorithm to AUV path following and proposes the GA2IL method, which combines demonstrated trajectories with additional human rewards. Experimental results in simulated underwater environments demonstrate that GA2IL and GAIL can achieve performance comparable to traditional methods in path following tasks.

OCEAN ENGINEERING (2022)

Article Transportation Science & Technology

Autonomous anomaly detection on traffic flow time series with reinforcement learning

Dan He, Jiwon Kim, Hua Shi, Boyu Ruan

Summary: This study develops an autonomous AI agent to detect anomalies in traffic flow time series data. The agent learns anomaly patterns without supervision and does not require ground-truth labels or a threshold for anomaly definition. The model incorporates sequential information using reinforcement learning and achieves better performance compared to three state-of-the-art models, with around 90% precision, 80% recall, and 85% F1 score.

TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES (2023)

Article Computer Science, Theory & Methods

Reluplex: a calculus for reasoning about deep neural networks

Guy Katz, Clark Barrett, David L. Dill, Kyle Julian, Mykel J. Kochenderfer

Summary: Deep neural networks are widely used for solving complex real-world problems, but providing formal guarantees for their behavior in safety-critical systems is challenging. Researchers have developed a novel technique based on the simplex method to verify properties of deep neural networks efficiently and at scale.

FORMAL METHODS IN SYSTEM DESIGN (2022)

Article Computer Science, Artificial Intelligence

Dynamic multi-robot task allocation under uncertainty and temporal constraints

Shushman Choudhury, Jayesh K. Gupta, Mykel J. Kochenderfer, Dorsa Sadigh, Jeannette Bohg

Summary: The study focuses on dynamically allocating tasks to multiple agents under time window constraints and task completion uncertainty, presenting a multi-agent allocation algorithm that excels in minimizing the number of unsuccessful tasks. The SCoBA algorithm effectively addresses key computational challenges through a hierarchical approach, demonstrating superior performance in practice and outperforming baseline methods in various metrics.

AUTONOMOUS ROBOTS (2022)

Article Computer Science, Artificial Intelligence

Generating probabilistic safety guarantees for neural network controllers

Sydney M. Katz, Kyle D. Julian, Christopher A. Strong, Mykel J. Kochenderfer

Summary: Neural networks are effective controllers in complex settings, but their difficult-to-verify outputs restrict their use in safety-critical applications. Recent research focuses on using formal methods to verify neural network outputs. This study proposes a method to provide probabilistic safety guarantees for neural network controllers using results from neural network verification tools.

MACHINE LEARNING (2023)

Article Computer Science, Artificial Intelligence

Global optimization of objective functions represented by ReLU networks

Christopher A. Strong, Haoze Wu, Aleksandar Zeljic, Kyle D. Julian, Guy Katz, Clark Barrett, Mykel J. Kochenderfer

Summary: It is difficult to guarantee that neural networks will not fail in safety-critical contexts. Verification algorithms can provide formal guarantees but cannot answer quantitative questions. This study proposes strategies to extend existing verifiers to optimization, so that they can find extreme failures and minimum input perturbations. The proposed approach achieves better runtime performance than traditional methods and shows complementary performance with optimization-based verifiers.

MACHINE LEARNING (2023)

Review Ergonomics

A Review of Incident Prediction, Resource Allocation, and Dispatch Models for Emergency Management

Ayan Mukhopadhyay, Geoffrey Pettet, Sayyed Mohsen Vazirizade, Di Lu, Alejandro Jaimes, Said El Said, Hiba Baroud, Yevgeniy Vorobeychik, Mykel Kochenderfer, Abhishek Dubey

Summary: In the past fifty years, researchers have developed various statistical, data-driven, analytical, and algorithmic approaches for designing and improving emergency response management systems. This survey provides a detailed review of these approaches, focusing on the key challenges and issues in incident prediction, incident detection, resource allocation, and computer-aided dispatch. It highlights the strengths and weaknesses of prior work and explores the similarities and differences between different modeling paradigms. The survey concludes by discussing open challenges and opportunities for future research in this complex domain.

ACCIDENT ANALYSIS AND PREVENTION (2022)

Article Neurosciences

Towards assessing subcortical deep brain biomarkers of PTSD with functional near-infrared spectroscopy

Stephanie Balters, Marc R. Schlichting, Lara Foland-Ross, Sabrina Brigadoi, Jonas G. Miller, Mykel J. Kochenderfer, Amy S. Garrett, Allan L. Reiss

Summary: This study validates the feasibility of inferring activity in subcortical regions associated with posttraumatic stress disorder (PTSD) using cortical functional magnetic resonance imaging (fMRI) and simulated functional near-infrared spectroscopy (fNIRS) activity. Linear regression and neural network models provided the best prediction performance.

CEREBRAL CORTEX (2023)

Article Engineering, Multidisciplinary

Portfolio construction as linearly constrained separable optimization

Nicholas Moehle, Jack Gindi, Stephen Boyd, Mykel J. Kochenderfer

Summary: This paper presents a heuristic algorithm based on the ADMM method to solve separable nonconvex terms in mean-variance portfolio optimization problems and empirically demonstrates its effectiveness in tax-aware portfolio construction.

OPTIMIZATION AND ENGINEERING (2023)

Article Engineering, Civil

Modeling Human Driving Behavior Through Generative Adversarial Imitation Learning

Raunak Bhattacharyya, Blake Wulfe, Derek J. Phillips, Alex Kuefler, Jeremy Morton, Ransalu Senanayake, Mykel J. Kochenderfer

Summary: An open problem in autonomous vehicle safety validation is to build reliable models of human driving behavior in simulation. This work presents an approach to learn neural driving policies from real-world driving demonstration data. The approach uses imitation learning and Generative Adversarial Imitation Learning (GAIL) to effectively model human driving behavior.

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS (2023)

Proceedings Paper Automation & Control Systems

How Do We Fail? Stress Testing Perception in Autonomous Vehicles

Harrison Delecki, Masha Itkina, Bernard Lange, Ransalu Senanayake, Mykel J. Kochenderfer

Summary: This paper presents a method for characterizing failures of LiDAR-based perception systems for autonomous vehicles (AVs) in adverse weather conditions. By using reinforcement learning to introduce disturbances and simulate LiDAR point clouds in adverse weather conditions, the likely failures in object tracking and trajectory prediction are identified. Experimental results show that the proposed approach can find high likelihood failures with smaller input disturbances compared to baselines, while remaining computationally tractable. The identified failures can inform the future development of robust perception systems for AVs.

2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) (2022)

Proceedings Paper Automation & Control Systems

Dynamics-Aware Spatiotemporal Occupancy Prediction in Urban Environments

Maneekwan Toyungyernsub, Esen Yel, Jiachen Li, Mykel J. Kochenderfer

Summary: In this paper, a framework that integrates the detection and segmentation of moving obstacles with the prediction of future occupancy states of the local environment is proposed using deep neural network architectures. By utilizing occupancy-based environment representations, the problem of integrating static-dynamic object segmentation and environment prediction models directly is addressed. The method is validated on the real-world Waymo Open Dataset and exhibits higher prediction accuracy compared to baseline methods.

2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) (2022)

Proceedings Paper Automation & Control Systems

Multi-Objective Policy Gradients with Topological Constraints

Kyle Hollins Wray, Stas Tiomkin, Mykel J. Kochenderfer, Pieter Abbeel

Summary: This paper introduces a multi-objective optimization model that encodes ordered sequential constraints to solve various challenging problems. By extending topological Markov decision processes (TMDPs) to continuous spaces and unknown transition dynamics, the policy gradient theorem for TMDPs is formulated, proven, and implemented, enabling the creation of TMDP learning algorithms that can generalize existing deep reinforcement learning (DRL) approaches.

2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) (2022)

Proceedings Paper Automation & Control Systems

FIG-OP: Exploring Large-Scale Unknown Environments on a Fixed Time Budget

Oriana Peltzer, Amanda Bouman, Sung-Kyun Kim, Ransalu Senanayake, Joshua Ott, Harrison Delecki, Mamoru Sobue, Mykel J. Kochenderfer, Mac Schwager, Joel Burdick, Ali-akbar Agha-mohammadi

Summary: We present a method for autonomous exploration of large-scale unknown environments under time constraints. Our approach addresses model uncertainty by frontloading expected information gain, leading to improved coverage efficiency over traditional methods.

2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) (2022)

Proceedings Paper Automation & Control Systems

Adaptive Coverage Path Planning for Efficient Exploration of Unknown Environments

Amanda Bouman, Joshua Ott, Sung-Kyun Kim, Kenny Chen, Mykel J. Kochenderfer, Brett Lopez, Ali-akbar Agha-Mohammadi, Joel Burdick

Summary: This paper presents a solution to the coverage problem, in which a robot must autonomously explore an unknown environment under mission time constraints. It formulates the problem as a tree-based sequential decision-making process to evaluate the effects of the robot's actions on future coverage states, considering traversability risk and dynamic constraints. An effective approximation to the coverage sensor model is proposed to quickly find near-optimal solutions.

2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) (2022)

Article Engineering, Aerospace

Verification of Image-Based Neural Network Controllers Using Generative Models

Sydney M. Katz, Anthony L. Corso, Christopher A. Strong, Mykel J. Kochenderfer

Summary: This research proposes a method to enhance the safety of image-based neural network controllers by combining generative adversarial networks and control networks to transform the complex input space into a low-dimensional space, enabling the use of existing verification tools to provide formal guarantees on their performance.

JOURNAL OF AEROSPACE INFORMATION SYSTEMS (2022)

Article Computer Science, Artificial Intelligence

Scalable Online Planning for Multi-Agent MDPs

Shushman Choudhury, Jayesh K. Gupta, Peter Morales, Mykel J. Kochenderfer

Summary: We present a scalable tree search planning algorithm for large multi-agent sequential decision problems that require dynamic collaboration. Our algorithm allows for trading computation for approximation quality and dynamically coordinating actions.

JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH (2022)
