Article

Outlier Detection Using Nonconvex Penalized Regression

Journal

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Volume 106, Issue 494, Pages 626-639

Publisher

TAYLOR & FRANCIS INC
DOI: 10.1198/jasa.2011.tm10390

Keywords

M-estimation; Robust regression; Sparsity; Thresholding

Funding

  1. NSF [DMS-0604939, DMS-0906056]

Abstract

This article studies the outlier detection problem from the standpoint of penalized regression. In the regression model, we add one mean shift parameter for each of the n data points. We then apply a regularization favoring a sparse vector of mean shift parameters. The usual L1 penalty yields a convex criterion, but fails to deliver a robust estimator. The L1 penalty corresponds to soft thresholding. We introduce a thresholding (denoted by Θ) based iterative procedure for outlier detection (Θ-IPOD). A version based on hard thresholding correctly identifies outliers on some hard test problems. We describe the connection between Θ-IPOD and M-estimators. Our proposed method has one tuning parameter with which to both identify outliers and estimate regression coefficients. A data-dependent choice can be made based on the Bayes information criterion. The tuned Θ-IPOD shows outstanding performance in identifying outliers in various situations compared with other existing approaches. In addition, Θ-IPOD is much faster than iteratively reweighted least squares for large data, because each iteration costs at most O(np) (and sometimes much less), avoiding an O(np^2) least squares estimate. This methodology can be extended to high-dimensional modeling with p >> n if both the coefficient vector and the outlier pattern are sparse.
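
To make the mean-shift formulation and the Θ-IPOD iteration concrete, below is a minimal sketch in Python/NumPy of the procedure the abstract describes: add one mean shift per observation, then alternate a thresholding step on the shifted residuals with an implicit least squares fit through the hat matrix. The function names, the default hard-thresholding rule, and the convergence tolerance are illustrative assumptions rather than the authors' reference implementation; the tuning parameter lam would in practice be chosen by the BIC-based rule mentioned above.

    import numpy as np

    def hard_threshold(z, lam):
        # Hard thresholding: keep entries whose magnitude exceeds lam, zero the rest.
        return z * (np.abs(z) > lam)

    def theta_ipod(X, y, lam, theta=hard_threshold, max_iter=500, tol=1e-8):
        # Mean-shift outlier model: y = X beta + gamma + noise, with gamma sparse.
        # Iterate gamma <- Theta(H gamma + (I - H) y; lam), where H is the hat
        # matrix of X. With the thin QR factorization X = QR, applying H is a
        # pair of matrix-vector products, so each iteration costs O(np) rather
        # than the O(np^2) of refitting least squares from scratch.
        Q, R = np.linalg.qr(X)
        y_perp = y - Q @ (Q.T @ y)          # (I - H) y, computed once
        gamma = np.zeros_like(y)
        for _ in range(max_iter):
            gamma_new = theta(Q @ (Q.T @ gamma) + y_perp, lam)
            if np.max(np.abs(gamma_new - gamma)) < tol:
                gamma = gamma_new
                break
            gamma = gamma_new
        # Refit the coefficients on the outlier-adjusted response.
        beta = np.linalg.solve(R, Q.T @ (y - gamma))
        outliers = np.flatnonzero(gamma)    # nonzero mean shifts flag outliers
        return beta, gamma, outliers

In practice one would run this sketch over a grid of lam values and keep the fit minimizing the Bayes information criterion; soft or other nonconvex thresholding rules can be swapped in for the theta argument.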
