4.6 Article

HIGH-DIMENSIONAL ANALYSIS OF SEMIDEFINITE RELAXATIONS FOR SPARSE PRINCIPAL COMPONENTS

期刊

ANNALS OF STATISTICS
卷 37, 期 5B, 页码 2877-2921

出版社

INST MATHEMATICAL STATISTICS
DOI: 10.1214/08-AOS664

关键词

Principal component analysis; spectral analysis; spiked covariance ensembles; sparsity; high-dimensional statistics; convex relaxation; semidefinite programming; Wishart ensembles; random matrices

资金

  1. NSF [CAREER-CCF-05-45862 and, DMS-06-05165]
  2. Sloan Foundation Fellowship

向作者/读者索取更多资源

Principal component analysis (PCA) is a classical method for dimensionality reduction based oil extracting the dominant eigenvectors of the Sample covariance matrix. However, PCA is well known to behave poorly inw the large p, small n setting, in which the problem dimension p is comparable to or larger than the sample size n. This paper studies PICA ill this high-dimensional regime, but under the additional assumption that the maximal eigenvector is sparse, say, with at most k nonzero components. We consider a spiked covariance model in which a base matrix is perturbed by adding a k-sparse maximal eigenvector, and we analyze two computationally tractable methods for recovering the Support set of this maximal eigenvector, as follows: (a) a simple diagonal thresholding method. which transitions from success to failure as a function of the resealed sample size theta(dia)(n, p, k) = n/[k(2) log(p - k)]; and (b) a more sophisticated semidefinite programming (SDP) relaxation, which succeeds once the resealed sample Size theta(sdp)(n, p, k) = n/[k log(p - k)] is larger than a critical threshold. In addition, we prove that no method, including the best method which has exponential-time complexity, can Succeed in recovering the Support if the order parameter theta(sdp)(n, p, k) is below a threshold. Our results thus highlight,in interesting trade-oft between computational and statistical efficiency in high-dimensional inference.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.6
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据