Journal
IEEE TRANSACTIONS ON INFORMATION THEORY
Volume 59, Issue 1, Pages 546-572
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIT.2012.2212415
Keywords
Dimension reduction; outlier; principal component analysis (PCA); robustness; statistical learning
Funding
- Ministry of Education of Singapore through the National University of Singapore [R-265-000-384-133]
- U.S. National Science Foundation [EFRI-0735905, EECS-1056028]
- Defense Threat Reduction Agency [HDTRA 1-08-0029]
- Israel Science Foundation [890015]
- Directorate For Engineering [1056028] Funding Source: National Science Foundation
Principal component analysis plays a central role in statistics, engineering, and science. Because of the prevalence of corrupted data in real-world applications, much research has focused on developing robust algorithms. Perhaps surprisingly, these algorithms are unequipped (indeed, unable) to deal with outliers in the high-dimensional setting where the number of observations is of the same magnitude as the number of variables of each observation, and the dataset contains some (arbitrarily) corrupted observations. We propose a high-dimensional robust principal component analysis algorithm that is efficient, robust to contaminated points, and easily kernelizable. In particular, our algorithm achieves maximal robustness: it has a breakdown point of 50% (the best possible), while all existing algorithms have a breakdown point of zero. Moreover, our algorithm recovers the optimal solution exactly in the case where the number of corrupted points grows sublinearly in the dimension.
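To make the notion of a zero breakdown point concrete, the following minimal sketch shows how a single arbitrarily corrupted observation can redirect the leading direction found by classical (non-robust) PCA when the number of observations is comparable to the dimension. It is only an illustration of the problem the abstract describes, not the paper's algorithm; the helper `top_principal_direction`, the synthetic data model, and all parameter values are assumptions chosen for the example.

```python
import numpy as np

def top_principal_direction(X):
    """Leading principal direction of the rows of X (classical PCA via SVD)."""
    Xc = X - X.mean(axis=0)                      # center the observations
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]                                 # unit vector, sign is arbitrary

rng = np.random.default_rng(0)
n, p = 100, 100                                  # observations ~ dimension (assumed sizes)

signal = np.zeros(p)
signal[0] = 1.0                                  # true high-variance direction (assumed)

# Authentic points: unit noise everywhere plus strong variance along `signal`.
X = rng.normal(size=(n, p)) + 5.0 * rng.normal(size=(n, 1)) * signal

# Replace one observation with an arbitrarily large point in another direction.
outlier_dir = np.zeros(p)
outlier_dir[1] = 1.0
X_corrupt = X.copy()
X_corrupt[0] = 1e4 * outlier_dir

# |cosine| between the estimated direction and the true one (1 = perfect recovery).
clean_alignment = abs(top_principal_direction(X) @ signal)
corrupt_alignment = abs(top_principal_direction(X_corrupt) @ signal)

print(f"alignment without outlier: {clean_alignment:.3f}")
print(f"alignment with one outlier: {corrupt_alignment:.3f}")
```

In this toy setup the clean estimate stays close to the true direction, while the single corrupted row pulls the estimate almost entirely toward the outlier. A method with a 50% breakdown point, as claimed in the abstract, would by definition tolerate this kind of corruption in up to half of the observations.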