Article

Spatial Random Forest (S-RF): A random forest approach for spatially interpolating missing land-cover data with multiple classes

Journal

INTERNATIONAL JOURNAL OF REMOTE SENSING
Volume 42, Issue 10, Pages 3756-3776

Publisher

TAYLOR & FRANCIS LTD
DOI: 10.1080/01431161.2021.1881183

Funding

  1. Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers [CE140100049]
  2. Australian Government
  3. ARC Fellowship [DE200101791]
  4. Australian Research Council [DE200101791]

This study presents a Spatial Random Forest model that accurately interpolates missing data in satellite images. The model estimates multiple land-cover classes using only latitude and longitude as spatial covariates. Its performance is sensitive to the amount of data missing due to cloud cover, whereas increasing the training set beyond 100,000 observations yields little further gain in accuracy.
Land-cover maps are important tools for monitoring large-scale environmental change and can be regularly updated using free satellite imagery data. A key challenge in constructing these maps is missing data in the satellite images on which they are based. To address this challenge, we created a Spatial Random Forest (S-RF) model that can accurately interpolate missing data in satellite images based on a modest training set of observed data in the image of interest. We demonstrate that this approach can be effective with only a minimal number of spatial covariates, namely latitude and longitude. The motivation for using only latitude and longitude in our model is that these covariates are available for all images, whether the data are observed or missing due to cloud cover. The S-RF model can flexibly partition these covariates to provide accurate estimates, and additional covariates can easily be incorporated to improve estimation if they are available. The effectiveness of our approach has previously been demonstrated for the prediction of two land-cover classes in an Australian case study. In this paper, we extend the method to more than two classes. We demonstrate the performance of the S-RF method at interpolating multiple land-cover classes, using a case study drawn from South America. The results show that the method performs best when predicting three land-cover classes, compared with five or ten classes, and that other information is needed to improve performance as the number of classes grows, particularly if the classes are unbalanced. We explore two issues through a sensitivity analysis: the influence of the amount of missing data in the image, and the influence of the amount of training data on model development and performance. The results show that the amount of missing data due to cloud cover influences model performance for multiple classes. We also found that increasing the amount of training data beyond 100,000 observations had minimal impact on model accuracy. Hence, a relatively small amount of observed data is required for training the model, which is beneficial for computation time.
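
The idea described in the abstract can be illustrated with a minimal sketch: fit a random forest classifier on the latitude and longitude of observed pixels, then predict the land-cover class of the cloud-covered ones. This is not the authors' implementation. The use of scikit-learn's RandomForestClassifier, the synthetic three-class map, the circular cloud mask, the training sample of 3,000 pixels, and all variable names are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic 100 x 100 "image": every pixel has coordinates, observed or not.
n = 100
lat, lon = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n), indexing="ij")
coords = np.column_stack([lat.ravel(), lon.ravel()])

# Hypothetical three-class land-cover map made of wavy bands (illustration only).
score = coords[:, 0] + 0.3 * np.sin(6 * coords[:, 1])
labels = np.digitize(score, [0.4, 0.8])  # classes 0, 1, 2

# Simulated cloud cover: three circular clouds hide the labels beneath them.
centres = np.array([[0.2, 0.3], [0.5, 0.7], [0.8, 0.4]])
dist = np.linalg.norm(coords[:, None, :] - centres[None, :, :], axis=2)
missing = (dist < 0.12).any(axis=1)

# Train on a modest random sample of the observed pixels only,
# with latitude and longitude as the sole covariates.
observed_idx = np.flatnonzero(~missing)
train_idx = rng.choice(observed_idx, size=3000, replace=False)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(coords[train_idx], labels[train_idx])

# Interpolate the cloud-covered pixels. The true labels are known here
# only because the clouds are simulated, which lets us score the result.
predicted = model.predict(coords[missing])
accuracy = float((predicted == labels[missing]).mean())
print(f"missing pixels: {missing.sum()}, accuracy on them: {accuracy:.3f}")
```

In this sketch, a sensitivity analysis of the kind the abstract mentions could be approximated by re-running the script with a larger cloud radius or a different training sample size and comparing the resulting accuracy. Any additional covariates would be appended as extra columns of `coords`.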
