Article

A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers

Journal

AEROSPACE
Volume 10, Issue 5

Publisher

MDPI
DOI: 10.3390/aerospace10050490

Keywords

air traffic controller training; simulation-pilot agent; BERT; automatic speech recognition and understanding; speech synthesis

Abstract

In this paper, we propose a novel virtual simulation-pilot engine that speeds up air traffic controller (ATCo) training by integrating several state-of-the-art artificial intelligence (AI)-based tools. The engine receives spoken communications from ATCo trainees and performs automatic speech recognition and understanding; it therefore goes beyond transcribing the communication and also interprets its meaning. The output is then sent to a response generator, which produces a spoken read-back resembling the one a pilot would give to the ATCo trainee. The overall pipeline is composed of the following submodules: (i) an automatic speech recognition (ASR) system that transforms audio into a sequence of words; (ii) a high-level air traffic control (ATC)-related entity parser that extracts the meaning of the transcribed voice communication; and (iii) a text-to-speech submodule that generates a pilot-like spoken utterance based on the state of the dialogue. Our system employs state-of-the-art AI-based tools such as Wav2Vec 2.0, Conformer, BERT and Tacotron models. To the best of our knowledge, this is the first work fully based on open-source ATC resources and AI tools. In addition, we develop a robust and modular system with optional submodules that can enhance the system's performance by incorporating real-time surveillance data or exercise metadata (such as sectors or runways), or that can insert a deliberate read-back error so that ATCo trainees learn to identify such errors. Our ASR system reaches absolute word error rates (WER) as low as 5.5% and 15.9% on high- and low-quality ATC audio, respectively. We also demonstrate that adding surveillance data to the ASR can yield a callsign detection accuracy of more than 96%.
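
The WER figures quoted in the abstract are word-level edit-distance ratios. As a rough illustration only, the sketch below computes WER in the standard way: substitutions, deletions and insertions divided by the number of reference words. It is not the authors' evaluation code; the function name `word_error_rate`, the lowercasing and whitespace tokenization, and the example utterance are all assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: text is lowercased and split on whitespace; real ATC
    evaluations may normalize numbers and callsigns differently.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical ATC-style utterance (not taken from the paper's data).
reference = "lufthansa three two one descend flight level eight zero"
hypothesis = "lufthansa three two one descend level eight zero"
wer = word_error_rate(reference, hypothesis)
print(f"WER = {wer:.3f}")
```

In this hypothetical example the hypothesis drops one of the nine reference words, so the WER is one ninth. Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.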
