Multi-output Gaussian processes for species distribution modelling

Journal

METHODS IN ECOLOGY AND EVOLUTION
Volume 11, Issue 12, Pages 1587-1598

Publisher

WILEY
DOI: 10.1111/2041-210X.13496

Keywords

Gaussian process; multi-species modelling; species distribution models

Funding

  1. ARC DECRA fellowship [DE180100635]
  2. LIEF Grant [LE170100200]
  3. Melbourne Research, University of Melbourne

Species distribution modelling is an active area of research in ecology. In recent years, interest has grown in modelling multiple species simultaneously, partly because such models can 'borrow strength' from similar species to improve predictions. Mixed and hierarchical models allow this but typically assume a (generalised) linear relationship between covariates and species presence or absence. On the other hand, popular machine learning techniques such as random forests and boosted regression trees can model complex nonlinear relationships but consider only one species at a time. We apply multi-output Gaussian processes (MOGPs) to the problem of species distribution modelling. MOGPs model each species' response to the environment as a weighted sum of a small number of nonlinear functions, each modelled by a Gaussian process. While Gaussian process models are notoriously computationally intensive, recent techniques from the machine learning literature, together with graphics processing units (GPUs), allow us to scale the model to datasets with hundreds of species at thousands of sites. We evaluate the MOGP against four baseline models on six different datasets. Overall, the MOGP is competitive with the best single-species and joint-species models, while being much faster to fit. On single-species metrics (AUC and log likelihood), the MOGP and single-output GPs (SOGPs) outperformed tree-based models (random forest and boosted regression trees) and a joint species distribution model (JSDM). Compared to SOGPs, the MOGP generally has a higher AUC for rare species with fewer than 50 observations in the dataset. When evaluated using joint-species log likelihood, the MOGP outperforms all models apart from the JSDM, which has a better joint likelihood on three datasets and similar performance on the other three.
A key advantage of the MOGP is speed: on the largest dataset, it is around 18 times faster to fit than SOGPs, and over 80 times faster to fit than the JSDM. Our results suggest that both MOGPs and SOGPs are accurate predictive models of species distributions and that the MOGP is particularly compelling when predictions for rare species are of interest.
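
The "weighted sum of a small number of nonlinear functions" structure described in the abstract can be illustrated with a minimal NumPy sketch. It only draws from a prior of that form and does not fit anything. The squared-exponential kernel, the logistic link, the Gaussian weights and every name in it (`rbf_kernel`, `sample_mogp_prior`, `n_latent`) are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of X1 and X2 (assumed kernel)."""
    sq_dists = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)


def sample_mogp_prior(X, n_species, n_latent, rng, jitter=1e-6):
    """Draw one prior sample of species responses at the sites in X.

    Each of the n_latent columns of G is a draw from a zero-mean GP over the
    environmental covariates. Each species' response is a weighted sum of
    those shared latent functions, with species-specific weights W.
    """
    n_sites = X.shape[0]
    K = rbf_kernel(X, X) + jitter * np.eye(n_sites)  # jitter keeps K positive definite
    L = np.linalg.cholesky(K)
    G = L @ rng.standard_normal((n_sites, n_latent))  # latent functions, (sites, latent)
    W = rng.standard_normal((n_latent, n_species))    # mixing weights, (latent, species)
    F = G @ W                                         # species responses, (sites, species)
    P = 1.0 / (1.0 + np.exp(-F))                      # assumed logistic link to presence probability
    return G, W, F, P


rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(40, 3))  # 40 hypothetical sites, 3 covariates
G, W, F, P = sample_mogp_prior(X, n_species=10, n_latent=3, rng=rng)
Y = rng.binomial(1, P)  # simulated presence/absence matrix, (sites, species)
```

Because all ten species share the same three latent functions, the response matrix `F` has rank at most three. That shared low-dimensional structure is one way to read the abstract's remark about borrowing strength: a species with few observations only needs its own weights estimated, not an entire function.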
