Article
Engineering, Aerospace
Shuo Chen, Hartmut Helmke, Robert M. M. Tarakan, Oliver Ohneiser, Hunter Kopald, Matthias Kleinert
Summary: As the use of Automatic Speech Recognition and Understanding (ASRU) in Air Traffic Management (ATM) is developed worldwide, the importance of Air Traffic Control (ATC) language ontologies in facilitating research collaboration becomes evident. This paper extends the topic by discussing the specific ways in which ontologies enable the sharing and collaboration of data, models, algorithms, metrics, and applications in the ATM domain. Additionally, a comparative analysis of word frequencies in ATC speech between the United States and Europe highlights the need for region-specific models due to differences in underlying corpus data.
Article
Engineering, Aerospace
Nils Ahrenhold, Hartmut Helmke, Thorsten Muehlhausen, Oliver Ohneiser, Matthias Kleinert, Heiko Ehr, Lucas Klamert, Juan Zuluaga-Gomez
Summary: This study investigates the use of automatic speech recognition and understanding (ASRU) in air traffic control. Findings show that ASRU support can reduce workload and improve safety and human performance.
Article
Engineering, Aerospace
Matthias Kleinert, Oliver Ohneiser, Hartmut Helmke, Shruthi Shetty, Heiko Ehr, Mathias Maier, Susanne Schacht, Hanno Wiese
Summary: This article explains how assistant-based speech recognition (ABSR) technology can be integrated into an advanced surface movement guidance and control system (A-SMGCS) to reduce controllers' workload and improve safety and overall performance. The integration of A-SMGCS and ABSR improves the command recognition rate by more than 15%, with a recognition rate of 91.8% for commands and 97.4% for callsigns.
Review
Engineering, Aerospace
Yi Lin
Summary: This paper provides a comprehensive review of spoken instruction understanding (SIU) in the ATC domain, covering challenges, techniques, and applications. It discusses the full pipeline for the SIU task, analyzes the technical challenges specific to ATC, categorizes common SIU techniques, and reviews a wide range of existing work in the ATC domain. Future research topics are also outlined.
Article
Engineering, Civil
Sandeep Badrinath, Hamsa Balakrishnan
Summary: Automatic transcription of air traffic control (ATC) communications has the potential to improve system safety, operational performance, and conformance monitoring. A tailored automatic speech recognition model has been developed to transcribe ATC voice to text and extract operational information. The model is based on recent advancements in machine learning techniques and has been evaluated on diverse datasets.
TRANSPORTATION RESEARCH RECORD
(2022)
Article
Engineering, Aerospace
Raquel Garcia, Juan Albarran, Adrian Fabio, Fernando Celorrio, Carlos Pinto de Oliveira, Cristina Barcena
Summary: In the air traffic management environment, air traffic controllers and flight crews communicate via voice, and speech recognition can be applied to these communications to improve situational awareness and safety. This paper presents ongoing work to develop ASR models for callsign recognition and highlights the need for partial recognition and improved phonetization to enhance recognition rates.
Article
Engineering, Electrical & Electronic
Reinhold Haeb-Umbach, Jahn Heymann, Lukas Drude, Shinji Watanabe, Marc Delcroix, Tomohiro Nakatani
Summary: Far-field automatic speech recognition (ASR) has gained significant attention in science and industry, with consumer market adoption in digital home assistants. Signal enhancement and a robust ASR engine are key to improving recognition accuracy, and a combination of deep learning and traditional signal processing has proven to be an effective solution.
PROCEEDINGS OF THE IEEE
(2021)
Article
Computer Science, Artificial Intelligence
Jin Ren, Shunzhi Yang, Yihua Shi, Jinfeng Yang
Summary: This article introduces knowledge distillation into ASR for Mandarin ATC communications to enhance the generalization performance of the lightweight model. By using the Target-Swap Knowledge Distillation (TSKD) strategy, the potential overconfidence of the teacher model regarding the target class can be mitigated. Experimental results demonstrate that the generated lightweight ASR model achieves a balance between recognition accuracy and transcription latency.
PEERJ COMPUTER SCIENCE
(2023)
Article
Computer Science, Information Systems
Peng Luo, Buhong Wang, Tengyao Li, Jiwei Tian
Summary: A new anomaly detection model, VAE-SVDD, is proposed to detect ADS-B data attacks effectively. The model can identify various anomalies generated by attacks with lower false positive and false negative rates than other machine learning methods.
COMPUTERS & SECURITY
(2021)
Article
Mathematics
Yuri Matveev, Anton Matveev, Olga Frolova, Elena Lyakso, Nersisson Ruban
Summary: This paper presents an extended description of a database of Russian-language emotional speech from younger school-age children. The validation results show the superiority of classical machine learning algorithms such as SVM and MLP in recognizing emotions. The database can be a valuable resource for researchers studying affective reactions in speech communication during child-computer interactions.
Article
Engineering, Aerospace
Dongyue Guo, Zichen Zhang, Peng Fan, Jianwei Zhang, Bo Yang
Summary: In this work, a context-aware language model (CALM) is proposed to improve the performance of callsign identification in air traffic control by utilizing prior ATC knowledge. Experimental results demonstrate the superiority of CALM over other baselines, showcasing its potential for migration to a real-time environment.
Article
Computer Science, Information Systems
Amandeep Singh Dhanjal, Williamjeet Singh
Summary: Continuous development in Automatic Speech Recognition has demonstrated its potential in Human Interaction Communication systems, but achieving high accuracy remains challenging because of the many parameters involved. Researchers have made innovative contributions toward a robust speech recognition system. This study analyzes state-of-the-art research in the field from 2015 to 2021, focusing on neural network-based techniques, datasets, toolkits, and evaluation metrics. It provides empirical solutions for accuracy improvement and offers a brief introduction for new researchers.
MULTIMEDIA TOOLS AND APPLICATIONS
(2023)
Article
Computer Science, Artificial Intelligence
Nadia Nedjah, Alejandra D. Bonilla, Luiza de Macedo Mourelle
Summary: Automatic speech recognition based on phoneme detection provides advantages for online speech recognition. The development of such a system is multidisciplinary, involving linguistics, signal processing, and computational intelligence. This study proposes a novel approach that divides the decision space of speech recognition using an ensemble of neural network experts, leading to improved precision, sensitivity, and accuracy. A dynamic post-processing step is also employed to mitigate the oscillatory effect during recognition.
EXPERT SYSTEMS WITH APPLICATIONS
(2023)
Article
Engineering, Aerospace
Chen Deng, Chengqi Cheng, Tengteng Qu, Shuang Li, Bo Chen
Summary: This paper proposes a multi-level disaggregated framework for airspace information management, which achieves effective management and extraction of air traffic data through the adoption of a multi-scale grid modeling and coding mapping method. Experimental results demonstrate that the framework complies with the existing Chinese specification of civil aeronautical charting, exhibits low deformation and high practicality, and improves performance by over 80% in ADS-B data extraction. Therefore, this method can provide effective reference and practical information for existing ADS-B airspace data management methods and can be extended to other forms of airspace management scenarios.
Article
Acoustics
Shun-Po Chuang, Alexander H. Liu, Tzu-Wei Sung, Hung-yi Lee
Summary: This article proposes lightweight approaches that use word embeddings to improve ASR and end-to-end speech translation (ST) models. Word embeddings provide additional contextual information and alleviate data scarcity issues, leading to performance improvements through knowledge distillation.
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
(2021)