Article

The reusable holdout: Preserving validity in adaptive data analysis

Journal

SCIENCE
Volume 349, Issue 6248, Pages 636-638

Publisher

AMER ASSOC ADVANCEMENT SCIENCE
DOI: 10.1126/science.aaa9375

Funding

  1. NSF CAREER grant [CNS 1253345]
  2. NSF [CCF 1101389]
  3. Alfred P. Sloan Foundation
  4. Natural Sciences and Engineering Research Council of Canada
  5. Directorate for Computer & Information Science & Engineering; Division of Computer and Network Systems [1253345] (funding source: National Science Foundation)
  6. Directorate for Computer & Information Science & Engineering; Division of Computing and Communication Foundations [1101389] (funding source: National Science Foundation)
  7. Division of Computer and Network Systems; Directorate for Computer & Information Science & Engineering [1065060, 1523467] (funding source: National Science Foundation)

Misapplication of statistical data analysis is a common cause of spurious discoveries in scientific research. Existing approaches to ensuring the validity of inferences drawn from data assume a fixed procedure to be performed, selected before the data are examined. In common practice, however, data analysis is an intrinsically adaptive process, with new analyses generated on the basis of data exploration, as well as the results of previous analyses on the same data. We demonstrate a new approach for addressing the challenges of adaptivity based on insights from privacy-preserving data analysis. As an application, we show how to safely reuse a holdout data set many times to validate the results of adaptively chosen analyses.
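
To make the holdout-reuse idea concrete, here is a minimal Python sketch in the spirit of the thresholding mechanism the paper proposes. It is a simplification under stated assumptions, not the authors' exact algorithm. The function name `thresholdout`, the default `threshold` and `sigma` values, and the use of a single Laplace noise scale are illustrative choices, and the query budget that the full method tracks is omitted. The analyst receives the training-set estimate unless it disagrees with the holdout estimate by more than a noisy threshold, in which case a noised holdout estimate is returned instead.

```python
import numpy as np


def thresholdout(train_vals, holdout_vals, threshold=0.04, sigma=0.01, rng=None):
    """Answer one adaptively chosen query (simplified sketch).

    train_vals / holdout_vals: per-sample values of the query function
    evaluated on the training set and on the holdout set.
    threshold, sigma: illustrative defaults (assumptions, not from the source).
    """
    rng = np.random.default_rng() if rng is None else rng
    train_mean = float(np.mean(train_vals))
    holdout_mean = float(np.mean(holdout_vals))

    # Compare the two estimates against a noisy threshold, so the comparison
    # itself reveals little about the holdout set.
    if abs(holdout_mean - train_mean) > threshold + rng.laplace(0.0, sigma):
        # Estimates disagree (likely overfitting to the training set):
        # return the holdout estimate, perturbed with Laplace noise.
        return holdout_mean + rng.laplace(0.0, sigma)

    # Estimates agree: return the training estimate, which reveals
    # nothing new about the holdout set.
    return train_mean
```

Because the holdout data influence the answer only when the training estimate is off, and then only through a noised value, each query leaks limited information about the holdout set. That limited leakage is the intuition behind reusing the same holdout set across many adaptively chosen analyses.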
