Journal
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
Volume 106, Issue 22, Pages 8859-8864
Publisher
NATL ACAD SCIENCES
DOI: 10.1073/pnas.0903931106
Keywords
higher criticism; phase diagram; region of impossibility; region of possibility; threshold feature selection
Funding
- National Science Foundation [DMS-0908613]
- Direct For Mathematical & Physical Scien
- Division Of Mathematical Sciences [0908613] Funding Source: National Science Foundation
We study a two-class classification problem with a large number of features, of which many are useless and only a few are useful, but we do not know which ones are useful. The number of features is large compared with the number of training observations. Calibrating the model with four key parameters (the number of features, the size of the training sample, and the fraction and strength of the useful features), we identify a region in parameter space where no trained classifier can reliably separate the two classes on fresh data. The complement of this region, where successful classification is possible, is also briefly discussed.
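The four-parameter setting described in the abstract can be explored with a small simulation. The sketch below is illustrative only and is not the paper's method. It assumes a Gaussian model that the abstract does not spell out: each feature is useful with probability `eps`, a useful feature shifts the class mean by `tau`, and noise is standard normal. The function name `simulate_accuracy`, the parameter names, and the simple thresholded-sign classifier are all hypothetical choices made for this example.

```python
import math
import random


def simulate_accuracy(p, n, eps, tau, threshold, n_test=400, seed=0):
    """Estimate fresh-data accuracy of a thresholded-feature classifier.

    Illustrative assumptions (not taken from the paper):
    p features, n training observations, a fraction eps of useful
    features, and a mean shift tau on each useful feature.
    """
    rng = random.Random(seed)
    # Assumed signal model: a feature is useful with probability eps.
    mu = [tau if rng.random() < eps else 0.0 for _ in range(p)]

    def draw(label):
        # One observation: label * mu plus standard Gaussian noise.
        return [label * m + rng.gauss(0.0, 1.0) for m in mu]

    # Training: accumulate a per-feature z-score from n labeled draws.
    z = [0.0] * p
    for _ in range(n):
        y = rng.choice((-1, 1))
        x = draw(y)
        for j in range(p):
            z[j] += y * x[j]
    z = [v / math.sqrt(n) for v in z]

    # Threshold feature selection: keep features with |z| above the
    # cutoff and weight each kept feature by the sign of its z-score.
    w = [(1.0 if v > 0 else -1.0) if abs(v) > threshold else 0.0 for v in z]

    # Evaluate on fresh observations that were not used for training.
    correct = 0
    for _ in range(n_test):
        y = rng.choice((-1, 1))
        x = draw(y)
        score = sum(wj * xj for wj, xj in zip(w, x))
        pred = 1 if score >= 0 else -1
        correct += (pred == y)
    return correct / n_test


if __name__ == "__main__":
    # Strong, fairly common signals versus no signal at all.
    print(simulate_accuracy(p=200, n=20, eps=0.1, tau=2.0, threshold=3.0))
    print(simulate_accuracy(p=200, n=20, eps=0.1, tau=0.0, threshold=3.0))
```

Under these assumptions, shrinking `eps` and `tau` together should eventually push the estimated accuracy toward chance, which is the kind of behavior the abstract's region of impossibility describes. The exact boundary of that region is given in the paper and is not reproduced by this sketch.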