Article

Examining the role of class imbalance handling strategies in predicting earthquake-induced landslide-prone regions

Journal

APPLIED SOFT COMPUTING
Volume 143

Publisher

ELSEVIER
DOI: 10.1016/j.asoc.2023.110429

Keywords

Earthquake; Landslide; Machine learning; Class imbalance; SHAP; Explainable artificial intelligence


This study proposed a comprehensive prediction scheme to assess earthquake-induced landslide susceptibility in the North Sikkim region using class imbalance handling strategies and machine learning methods. The evaluation covered nine scenarios: four oversampling techniques, four undersampling techniques, and the raw data. Predictions were made with the stochastic gradient boosting (SGB) algorithm, and the SVM-SMOTE-SGB combination outperformed the other scenarios. Additionally, a game-theoretical SHapley Additive exPlanations (SHAP) analysis was used to overcome the lack of interpretability of black-box models; it identified distance to road, distance to stream, and elevation as important factors in identifying earthquake-induced landslide-prone regions. © 2023 Elsevier B.V. All rights reserved.
This study was undertaken to propose a comprehensive prediction scheme that combines class imbalance handling strategies with machine learning methods to assess earthquake-induced landslide susceptibility in the North Sikkim region. It is worth mentioning that taking class imbalance handling techniques into account is essential to mimic real-world conditions. To tackle this issue, this research focused, for the first time, on the comprehensive evaluation of nine scenarios comprising four oversampling techniques, four undersampling techniques, and the RAW data. The predictions were made with the stochastic gradient boosting (SGB) algorithm. The results showed that SVM-SMOTE-SGB outperformed its counterparts (AUROC: 0.9878), followed by the models pre-processed with BL-SMOTE (AUROC: 0.9876) and RUS (AUROC: 0.9859). In addition, the major drawback of black-box models, i.e., their lack of interpretability, was overcome with a game-theoretical SHapley Additive exPlanations (SHAP) analysis. Applying SHAP to the best-performing model highlighted the importance of distance to road, distance to stream, and elevation in identifying earthquake-induced landslide-prone regions. © 2023 Elsevier B.V. All rights reserved.
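
As a rough illustration of two ideas named in the abstract, the sketch below balances a toy dataset by undersampling and scores predictions with AUROC. It is not the authors' implementation: it reads RUS as random undersampling (an assumption), does not reproduce SVM-SMOTE or SGB, and the function names `random_undersample` and `auroc` and the toy data are hypothetical.

```python
import random


def random_undersample(X, y, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random subset of the majority.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [X[i] for i in kept], [y[i] for i in kept]


def auroc(y_true, scores):
    """AUROC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 landslide cells (label 1) among 10 cells.
X = [[float(i)] for i in range(10)]
y = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
X_bal, y_bal = random_undersample(X, y, seed=42)

# Hypothetical model scores for four held-out cells.
example_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In a setup like the one described, resampling would typically be applied to the training split only, so that the test split keeps its original class ratio and the reported AUROC reflects the imbalanced conditions.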
