Efficient Outlier Detection for High-Dimensional Data

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS (2018)

Journal

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS

Volume 48, Issue 12, Pages 2451-2461

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TSMC.2017.2718220

Keywords

Dimension reduction; high-dimensional data; k nearest neighbors (kNN); low-rank approximation; outlier detection

Funding

China 973 Program [2013CB329404]
National Natural Science Foundation of China [61572443, 61450001, 61672177, 61761130079]
National Key Research and Development Program of China [2016YFB1000905]
Key Research Program of the Chinese Academy of Sciences [KGZD-EW-T03]
ARC [DP130104090]
Shanghai Key Laboratory of Intelligent Information Processing [IIPL-2016-001]


Abstract

How to tackle high-dimensional data effectively and efficiently remains a challenging issue in machine learning. Identifying anomalous objects in given data has a broad range of real-world applications. Although many classical outlier detection or ranking algorithms have been proposed in past years, two issues in outlier detection, the high dimensionality of the data and the size of the neighborhood, have not yet attracted sufficient attention. The former may trigger the distance concentration problem, in which the distances between observations in high-dimensional space tend to become indiscernible, whereas the latter requires appropriate parameter values, making models highly complex and more sensitive. To partially circumvent these problems, especially high dimensionality, we introduce a concept called the local projection score (LPS) to represent the degree to which an observation deviates from its neighbors. The LPS is obtained from neighborhood information by low-rank approximation. An observation with a high LPS is, with high probability, a promising outlier candidate. Based on this notion, we propose an efficient and effective outlier detection algorithm that is also robust to the parameter k of k nearest neighbors. Extensive experiments on twelve public real-world data sets, in comparison with five popular outlier detection algorithms, show that the performance of the proposed method is competitive and promising.
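The abstract does not give the formula for the LPS, so the following is only a minimal sketch of the general idea it describes: score each observation by how poorly a low-rank approximation of its k-nearest-neighbor neighborhood explains it. The function name `local_projection_scores`, the `rank` parameter, and the use of a truncated SVD residual are assumptions made for illustration and should not be read as the authors' definition.

```python
import numpy as np

def local_projection_scores(X, k=10, rank=2):
    """Illustrative neighborhood-residual score (NOT the paper's exact LPS).

    For each point: take its k nearest neighbors, fit a rank-`rank` subspace
    to the centered neighborhood via SVD, and return the norm of the part of
    the point that the subspace cannot explain.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances (brute force; fine for small n).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor

    scores = np.empty(n)
    for i in range(n):
        nbrs = X[np.argsort(d2[i])[:k]]
        mu = nbrs.mean(axis=0)
        # Low-rank approximation of the centered neighborhood.
        _, _, vt = np.linalg.svd(nbrs - mu, full_matrices=False)
        basis = vt[:rank]                    # top right singular vectors
        diff = X[i] - mu
        residual = diff - basis.T @ (basis @ diff)
        scores[i] = np.linalg.norm(residual)
    return scores

# Synthetic usage (assumed data): 200 inliers near a 2-D subspace of a
# 50-D space, plus one point far off that subspace at index 200.
rng = np.random.default_rng(0)
inliers = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 50))
inliers += 0.01 * rng.normal(size=inliers.shape)
outlier = 5.0 * rng.normal(size=(1, 50))
data = np.vstack([inliers, outlier])
scores = local_projection_scores(data, k=10, rank=2)
ranking = np.argsort(scores)[::-1]  # highest score first
```

In this sketch the score depends on k only through which neighbors define the local subspace, which is one plausible reading of why such a score could be less sensitive to k than raw distance-based scores; the paper itself should be consulted for the actual construction.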
