Article

Is more data always better? A simulation study of benefits and limitations of integrated distribution models

Journal

ECOGRAPHY
Volume 43, Issue 10, Pages 1413-1422

Publisher

WILEY
DOI: 10.1111/ecog.05146

Keywords

citizen science; data integration; integrated distribution models; simulations; species distribution models

Funding

  1. Natural Environment Research Council [NE/R005133/1, NE/R016429/1]
  2. NERC [NE/R005133/1] Funding Source: UKRI
  3. Natural Environment Research Council [NE/R005133/1] Funding Source: researchfish


Species distribution models are popular and widely applied ecological tools. Recent increases in data availability have created both opportunities and challenges for species distribution modelling. Each data source has different qualities, determined by how it was collected. Although several data sources can inform on a single species, ecologists have often analysed just one of them, which loses information because the other sources are discarded. Integrated distribution models (IDMs) were developed to enable inclusion of multiple datasets in a single model, whilst accounting for different data collection protocols. This is advantageous because it allows efficient use of all available data, can improve estimation and can account for biases in data collection. What is not yet known is when integrating different data sources does not bring advantages. Here, for the first time, we explore the potential limits of IDMs using a simulation study integrating a spatially biased, opportunistic, presence-only dataset with a structured, presence-absence dataset. We explore four scenarios based on real ecological problems: small sample sizes, low levels of detection probability, correlations between covariates and a lack of knowledge of the drivers of bias in data collection. For each scenario we ask: do we see improvements in parameter estimation or the accuracy of spatial pattern prediction in the IDM versus modelling either data source alone? We found integration alone was unable to correct for spatial bias in presence-only data. Including a covariate to explain bias or adding a flexible spatial term improved IDM performance beyond single dataset models, with the models including a flexible spatial term producing the most accurate and robust estimates. Increasing the sample size of presence-absence data and having no correlated covariates also improved estimation. These results demonstrate under which conditions integrated models provide benefits over modelling single data sources.
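
To make the two data types concrete, the following is a minimal sketch of how a spatially biased presence-only dataset and a structured presence-absence dataset might be simulated from one underlying species distribution. It is not the authors' simulation code, and the abstract does not specify their design. The function name `simulate_datasets`, the log-linear intensity, the logistic reporting probability and all parameter values are assumptions chosen purely for illustration.

```python
import numpy as np


def simulate_datasets(n_cells=2500, n_pa_sites=150, beta0=-1.0, beta1=1.0,
                      bias_strength=2.0, p_detect=0.5, seed=0):
    """Simulate biased presence-only and structured presence-absence data.

    All names and default values are hypothetical, for illustration only.
    """
    rng = np.random.default_rng(seed)

    # One environmental covariate, plus a separate covariate that drives
    # where opportunistic observers tend to report (the source of bias).
    env = rng.normal(size=n_cells)
    bias_cov = rng.normal(size=n_cells)

    # Assumed log-linear intensity: expected number of individuals per cell.
    intensity = np.exp(beta0 + beta1 * env)
    abundance = rng.poisson(intensity)

    # Presence-only data: each individual is reported with a probability
    # that depends on the bias covariate, not on the environment.
    p_report = 1.0 / (1.0 + np.exp(-bias_strength * bias_cov))
    po_counts = rng.binomial(abundance, p_report)

    # Presence-absence data: survey cells chosen at random (structured
    # design), with imperfect detection at occupied cells.
    pa_cells = rng.choice(n_cells, size=n_pa_sites, replace=False)
    pa_occupied = abundance[pa_cells] > 0
    pa_detected = pa_occupied & (rng.random(n_pa_sites) < p_detect)

    return {
        "env": env,
        "bias_cov": bias_cov,
        "abundance": abundance,
        "po_counts": po_counts,
        "pa_cells": pa_cells,
        "pa_occupied": pa_occupied,
        "pa_detected": pa_detected,
    }


if __name__ == "__main__":
    data = simulate_datasets()
    print("presence-only records:", int(data["po_counts"].sum()))
    print("presence-absence detections:", int(data["pa_detected"].sum()))
```

Lowering `n_pa_sites` or `p_detect`, or generating `bias_cov` so that it correlates with `env`, gives rough analogues of the small-sample, low-detection and correlated-covariate scenarios named in the abstract.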
