4.7 Article

Hierarchical Recovery of Missing Air Pollution Data via Improved Long-Short Term Context Encoder Network

Journal

IEEE TRANSACTIONS ON BIG DATA
Volume 9, Issue 1, Pages 93-105

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TBDATA.2021.3123819

Keywords

Air pollution; Pollution measurement; Atmospheric measurements; Monitoring; Atmospheric modeling; Correlation; Big Data; Missing data recovery; air quality measurement; ILSCE; adaptive updating convolutional neural networks

Equipment and transmission failures make data loss a major challenge in air quality monitoring. This paper proposes an Improved Long-Short Term Context Encoder (ILSCE) model, built on adaptive updating convolutional neural networks (CNNs), to recover missing values in an air quality database. The model captures temporal-spatial correlations and periodic variations in the air quality dataset, and updates both the data and their masks in every CNN layer. The experimental study shows that the ILSCE model outperforms existing imputation methods in air pollution data recovery, especially when data loss is severe.
Due to equipment and transmission failures, data loss presents a key challenge to air quality monitoring. This paper attempts to recover missing air quality data from an air quality database. Leveraging adaptive updating convolutional neural networks (CNNs), we propose a novel Improved Long-Short Term Context Encoder (ILSCE) model, which simultaneously captures the temporal-spatial correlations and periodic variations identified in an air quality dataset. In addition, our model applies a new mechanism that automatically updates both the air quality data and their corresponding masks in every layer of the CNN. Our proposed method presents three novelties. First, it recovers missing air quality values hierarchically. Second, domain-specific weekday/weekend and seasonal information is incorporated into the training model. Third, model performance is enhanced by an additional regularization term that captures the correlation between different air pollutants, thereby accounting for both background ambient pollution and local emissions. Our experimental study shows that these three features allow the ILSCE model to significantly outperform existing state-of-the-art imputation methods in air pollution data recovery. Furthermore, as data loss becomes more severe, with more missing values and longer runs of consecutively missing values, the superior recovery performance and greater robustness of our model become more prominent.
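
The abstract does not spell out how data and masks are updated from layer to layer. As a rough illustration only, the sketch below assumes a rule in the style of partial convolutions: each output is computed from the observed entries in its window and rescaled by the fraction observed, and the output mask is set to 1 wherever the window contained at least one observed value. The function name `masked_conv1d`, the 1-D setting, and the rescaling rule are assumptions made for this example, not the ILSCE implementation.

```python
import numpy as np


def masked_conv1d(x, mask, kernel):
    """One 'valid' 1-D convolution over a partially observed series.

    x      : (T,) values; entries where mask == 0 are ignored (may be NaN)
    mask   : (T,) 1 where the value is observed, 0 where it is missing
    kernel : (K,) convolution weights
    Returns (y, new_mask), each of length T - K + 1.
    Hypothetical helper: assumes a partial-convolution-style update.
    """
    x = np.asarray(x, dtype=float)
    mask = np.asarray(mask, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    k = kernel.size
    n_out = x.size - k + 1
    y = np.zeros(n_out)
    new_mask = np.zeros(n_out)
    for i in range(n_out):
        m = mask[i:i + k]
        observed = m.sum()
        if observed > 0:
            # Zero out missing entries so NaNs never enter the sum.
            window = np.where(m > 0, x[i:i + k], 0.0)
            # Rescale so windows with few observed values are not shrunk.
            y[i] = (kernel * window).sum() * (k / observed)
            # The window saw real data, so the output counts as observed.
            new_mask[i] = 1.0
    return y, new_mask


# Hourly readings with a three-step gap (illustrative numbers only).
series = np.array([1.0, np.nan, np.nan, np.nan, 5.0])
observed = np.array([1.0, 0.0, 0.0, 0.0, 1.0])
smooth = np.full(3, 1.0 / 3.0)

layer1, mask1 = masked_conv1d(series, observed, smooth)
layer2, mask2 = masked_conv1d(layer1, mask1, smooth)
```

Under this assumed rule, stacking layers shrinks the missing region step by step: after the first layer one position is still marked missing, and after the second the remaining position is filled from its neighbours. This is one way to read "hierarchical" recovery, though the paper itself should be consulted for the actual mechanism.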
