Article

Modelling post-fire tree mortality: Can random forest improve discrimination of imbalanced data?

Journal

ECOLOGICAL MODELLING
Volume 414

Publisher

ELSEVIER
DOI: 10.1016/j.ecolmodel.2019.108855

Keywords

Fire effects; Logistic regression; Machine learning; Model evaluation; Model validation; Pinus palustris; Prescribed fire

Funding

  1. US Joint Fire Science Program [01-1-03-11, 16-1-04-8]
  2. National Fire Plan

Predicting post-fire tree mortality is a major area of research in fire-prone forests, woodlands, and savannas worldwide. Past research has relied overwhelmingly on logistic regression (LR), which predicts post-fire tree status as a binary outcome (i.e., living or dead). One of the most problematic issues for LR (or any classification method) arises when the training data are class-imbalanced: predictions are then biased toward the majority class. Using a historical prescribed fire data set of longleaf pines (Pinus palustris) from northern Florida, USA, we compare results from standard LR and the machine-learning algorithm random forest (RF). First, we demonstrate the class imbalance problem using simulated data. We then show how a balanced RF model can be used to alleviate the bias in the model and improve mortality prediction results. In the simulated example, LR model sensitivity and specificity were clearly biased, and the bias varied with the degree of imbalance between the classes. The balanced RF models had consistent sensitivity and specificity throughout the simulated data sets. Re-analyzing the original longleaf pine data set with a balanced RF model showed that, although the LR and RF models had similar areas under the receiver operating characteristic curve (AUC), the RF model had better discrimination for predicting new observations of dead trees. Both LR and RF models identified duff consumption and percent crown scorch as important predictors of tree mortality; however, the RF model also suggested prefire duff depth as an important predictor. Our analysis highlights LR limitations when data are imbalanced and supports using RF to develop post-fire tree mortality models. We suggest how RF can be incorporated into future tree mortality studies, as well as possible implementation in future decision-support tools.
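
The class imbalance problem described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code or data: the 95/5 split of live and dead trees is hypothetical, treating "dead" as the positive class is an assumption, and the helper names sensitivity_specificity and balanced_bootstrap are invented for illustration. The balanced bootstrap shown here (equal draws per class, sized to the rarest class) is one common way a balanced RF builds the sample for each tree; the abstract does not state which balancing scheme the paper used.

```python
import random
from collections import Counter


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity), treating `positive` as the event class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def balanced_bootstrap(labels, rng):
    """Draw bootstrap indices with equal counts per class.

    Each class is sampled with replacement n_min times, where n_min is
    the size of the rarest class (an assumed balancing scheme).
    """
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    sample = []
    for idx in by_class.values():
        sample.extend(rng.choices(idx, k=n_min))
    return sample


# Hypothetical data: 95 surviving trees (0) and 5 dead trees (1).
labels = [0] * 95 + [1] * 5

# A classifier that always predicts the majority class ("live").
always_live = [0] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, always_live)) / len(labels)
sens, spec = sensitivity_specificity(labels, always_live)

# One balanced bootstrap sample, as might be drawn for a single tree.
rng = random.Random(0)
sample = balanced_bootstrap(labels, rng)
class_counts = Counter(labels[i] for i in sample)

print(f"accuracy={accuracy:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
print("balanced bootstrap class counts:", dict(class_counts))
```

The majority-class predictor looks accurate overall yet never identifies a dead tree, which is why sensitivity and specificity are more informative than overall accuracy for imbalanced mortality data. The balanced bootstrap gives both classes equal weight in each training sample, which is the general idea behind balancing an RF.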
