Article

Forecasting Excessive Rainfall with Random Forests and a Deterministic Convection-Allowing Model

Journal

WEATHER AND FORECASTING
Volume 36, Issue 5, Pages 1693-1711

Publisher

AMER METEOROLOGICAL SOC
DOI: 10.1175/WAF-D-21-0026.1

Keywords

Rainfall; Numerical weather prediction/forecasting; Operational forecasting; Machine learning; Decision trees

Funding

  1. National Science Foundation
  2. NOAA Joint Technology Transfer Initiative Grant [NA18OAR4590378]

This study explores using machine learning models to predict excessive rainfall events and examines how sensitive the forecasts are to different event definitions and predictors. Adjusting model configurations can improve forecast skill, which demonstrates the potential of ML-based models for excessive rainfall prediction but also shows that configuration details require careful attention.
Approximately seven years of daily initializations from the convection-allowing National Severe Storms Laboratory Weather Research and Forecasting Model are used as inputs to train random forest (RF) machine learning models to probabilistically predict instances of excessive rainfall. Unlike other hazards, excessive rainfall does not have an accepted definition, so multiple definitions of excessive rainfall and flash flooding, including flash flood reports and 24-h average recurrence intervals (ARIs), are used to explore RF configuration forecast sensitivities. RF forecasts are analogous to operational Weather Prediction Center (WPC) day-1 Excessive Rainfall Outlooks (EROs), and their resolution, reliability, and skill are strongly influenced by rainfall definitions and how inputs are assembled for training. Models trained with 1-yr ARI exceedances defined by the Stage-IV (ST4) precipitation analysis perform poorly in the northern Great Plains and Southwest United States, in part due to a high bias in the number of training events in these regions. Increasing the ARI threshold to 2 years or removing ST4 data from training, optimizing forecast skill geographically, and spatially averaging meteorological inputs for training generally result in improved CONUS-wide RF forecast skill. Both EROs and RF forecasts have seasonal skill: poor forecasts in the late fall and winter and skillful forecasts in the summer and early fall. However, the EROs are consistently and significantly better than their RF counterparts, regardless of RF configuration, particularly in the summer months. The results suggest careful consideration should be made when developing ML-based probabilistic precipitation forecasts with convection-allowing model inputs, and further development is necessary to consider these forecast products for operational implementation.
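
The abstract compares the skill of probabilistic forecasts but does not name the metric used. As a minimal sketch, assuming the Brier skill score (a common choice for probabilistic forecasts of binary events, not necessarily the one used in the paper), skill relative to a climatological base rate could be computed as follows. The function names and the sample values are hypothetical and purely illustrative.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """Skill relative to a constant forecast of the sample event frequency.

    1.0 is a perfect forecast, 0.0 matches the reference, negative is worse.
    """
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    if reference == 0.0:
        raise ValueError("skill is undefined when all outcomes are identical")
    return 1.0 - brier_score(probs, outcomes) / reference


# Hypothetical data: 1 marks a grid point where excessive rainfall was observed.
outcomes = [0, 0, 1, 0, 1, 0, 0, 0]
# Hypothetical RF probabilities for the same grid points.
rf_probs = [0.1, 0.2, 0.6, 0.1, 0.4, 0.3, 0.1, 0.2]

print(round(brier_skill_score(rf_probs, outcomes), 3))
```

A positive value means the forecast beats the constant base-rate forecast on this sample. Comparing two forecast systems, such as the RF forecasts and the EROs, would require scoring both against the same observations.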
SIGNIFICANCE STATEMENT: Machine learning (ML) models can deduce statistical relationships between a set of predictors and meteorological events. In this work, ML models are developed to predict excessive rainfall events. Since excessive rainfall is difficult to define uniformly across the United States, multiple ML models are built from a variety of rainfall datasets, with predictors gathered from the output of a high-resolution numerical weather prediction model, and forecasts are made from each model. Forecasts made from these models are highly sensitive to both the definitions of excessive rainfall (e.g., 100 mm of rain in a day may cause flooding in a usually dry area, but not in a wet area) and the predictors used. Forecast skill can increase when excessive rainfall events are rarer and when predictors synthesize the surrounding environment rather than characterize specific geographical points. ML-based models have great potential for excessive rainfall prediction, but careful attention to the configuration of these models is required.
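
Two ideas in the statement above can be illustrated with a small sketch: labeling events against a location-dependent threshold (the same rainfall total can be excessive in one place and not in another), and replacing point values of a predictor with a neighborhood average. This is an illustration under stated assumptions, not the authors' implementation. The function names, the grid values, the thresholds, and the square averaging window are all hypothetical.

```python
def label_exceedances(rain_mm, threshold_mm):
    """Return a 0/1 grid: 1 where rainfall meets or exceeds the local threshold."""
    return [
        [1 if rain >= limit else 0 for rain, limit in zip(rain_row, limit_row)]
        for rain_row, limit_row in zip(rain_mm, threshold_mm)
    ]


def neighborhood_mean(field, radius=1):
    """Average each grid point with its neighbors in a square window.

    The window is clipped at the grid edges, so edge points average fewer values.
    """
    rows, cols = len(field), len(field[0])
    smoothed = []
    for i in range(rows):
        smoothed_row = []
        for j in range(cols):
            window = [
                field[a][b]
                for a in range(max(0, i - radius), min(rows, i + radius + 1))
                for b in range(max(0, j - radius), min(cols, j + radius + 1))
            ]
            smoothed_row.append(sum(window) / len(window))
        smoothed.append(smoothed_row)
    return smoothed


# Hypothetical 24-h rainfall totals (mm) on a 2 x 2 grid.
rain = [[100, 100], [40, 120]]
# Hypothetical local thresholds (mm): left column is a drier climate, right is wetter.
thresholds = [[60, 150], [60, 150]]

labels = label_exceedances(rain, thresholds)
smoothed_rain = neighborhood_mean(rain, radius=1)
```

With these made-up values, the 100 mm total counts as an event only in the drier column, which mirrors the dry-area versus wet-area example in the statement.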
