Article

Cross-Validation for Correlated Data

Journal

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Volume 117, Issue 538, Pages 718-731

Publisher

TAYLOR & FRANCIS INC
DOI: 10.1080/01621459.2020.1801451

Keywords

Dependent data; Gaussian process regression; Linear mixed model; Model selection; Prediction error estimation

Funding

  1. Israeli Science Foundation [1804/16]

Abstract

This article analyzes the application of K-fold cross-validation to correlated data and introduces a suitability criterion and a bias correction, which substantially improve model evaluation and model selection.
K-fold cross-validation (CV) with squared error loss is widely used for evaluating predictive models, especially when strong distributional assumptions cannot be made. However, CV with squared error loss is not free from distributional assumptions, in particular in cases involving non-iid data. This article analyzes CV for correlated data. We present a criterion for suitability of standard CV in presence of correlations. When this criterion does not hold, we introduce a bias-corrected CV estimator that yields an unbiased estimate of prediction error in many settings where standard CV is invalid. We also demonstrate our results numerically, and find that introducing our correction substantially improves both model evaluation and model selection in simulations and real data studies. Supplementary materials for this article are available online.
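
The concern raised in the abstract, that standard K-fold CV can misestimate prediction error when observations are correlated, can be illustrated with a small simulation. The sketch below is not the article's criterion or its corrected estimator. It assumes a random-intercept (clustered) data model, a simple cluster-mean predictor, and a prediction goal of entirely new clusters; all function names and parameter values are hypothetical choices made for illustration. It compares K-fold CV with randomly assigned folds against K-fold CV where whole clusters are held out together.

```python
import numpy as np


def simulate_clusters(n_clusters=40, per_cluster=10, sd_cluster=2.0,
                      sd_noise=1.0, seed=0):
    """Random-intercept data: y = cluster effect + noise (assumed model)."""
    rng = np.random.default_rng(seed)
    groups = np.repeat(np.arange(n_clusters), per_cluster)
    effects = rng.normal(0.0, sd_cluster, size=n_clusters)
    y = effects[groups] + rng.normal(0.0, sd_noise, size=groups.size)
    return y, groups


def predict_cluster_mean(y_train, g_train, g_test):
    """Predict the training mean of the test point's cluster if that
    cluster was seen in training, otherwise the global training mean."""
    preds = np.full(g_test.shape, y_train.mean(), dtype=float)
    for g in np.unique(g_test):
        seen = g_train == g
        if seen.any():
            preds[g_test == g] = y_train[seen].mean()
    return preds


def kfold_mse(y, groups, fold_of_obs):
    """Squared-error K-fold CV estimate for a given fold assignment."""
    sq_err = np.empty_like(y)
    for k in np.unique(fold_of_obs):
        test = fold_of_obs == k
        preds = predict_cluster_mean(y[~test], groups[~test], groups[test])
        sq_err[test] = (y[test] - preds) ** 2
    return sq_err.mean()


def random_folds(n, n_folds, seed=0):
    """Standard CV: observations are assigned to folds at random."""
    rng = np.random.default_rng(seed)
    return rng.permutation(n) % n_folds


def cluster_folds(groups, n_folds, seed=0):
    """Cluster-wise CV: all observations of a cluster share one fold."""
    rng = np.random.default_rng(seed)
    clusters = np.unique(groups)
    fold_of_cluster = rng.permutation(clusters.size) % n_folds
    return fold_of_cluster[np.searchsorted(clusters, groups)]


y, groups = simulate_clusters()
mse_random = kfold_mse(y, groups, random_folds(y.size, 5))
mse_cluster = kfold_mse(y, groups, cluster_folds(groups, 5))
print(f"random-fold CV estimate:  {mse_random:.2f}")
print(f"cluster-fold CV estimate: {mse_cluster:.2f}")
```

Under these assumptions, random folds leave cluster-mates of each test point in the training set, so the CV estimate reflects prediction within already-observed clusters and comes out lower than the cluster-wise estimate, which mimics prediction for unseen clusters. Which of the two is appropriate depends on how future prediction points relate to the training data, which is the kind of question the article's criterion addresses.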
