Article

A tourist walk approach for internal and external outlier detection

Journal

NEUROCOMPUTING
Volume 393, Pages 203-213

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2018.10.113

Keywords

Outlier; Internal outlier; Tourist walk; Memory size; Critical memory size; Attractor; Crossing-attractor

Funding

  1. Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code [001]
  2. São Paulo State Research Foundation (FAPESP) [2015/50122-0]
  3. Brazilian National Council for Scientific and Technological Development (CNPq)

Outlier detection is a fundamental task for knowledge discovery in data mining, especially in the Big Data era. It aims to detect data items that deviate from the general pattern of a given data set. In this paper, we present a new outlier detection technique that uses tourist walks starting from each data sample with varying memory sizes. Specifically, a data sample receives a high outlier score if it participates in few tourist walk attractors and a low score if it participates in many. Experimental results on artificial and real data sets show good performance of the proposed method. Compared with classical outlier detection methods, the proposed one has the following salient features: (1) It finds outliers by identifying the structure of the input data set instead of considering only physical features such as distance, similarity, or density. (2) It detects not only external outliers, as classical methods do, but also internal outliers lying among various normal data groups. (3) By varying the memory size, the tourist walks can characterize both local and global structures of the data set. (4) A parallel implementation is convenient because the algorithm consists of a large number of independent walks. (5) The proposed method is deterministic, so a single run is sufficient, in contrast to stochastic techniques, which require many runs. Moreover, we find in this work, for the first time, that tourist walks can generate complex attractors in various crossing shapes. Such complex attractors reveal data structures in more detail and can consequently improve outlier detection performance. © 2019 Elsevier B.V. All rights reserved.
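The walk-and-score idea in the abstract can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the authors' implementation. The abstract gives no formulas, so the following are all assumptions: the walk rule (move to the nearest point not visited in the last `mu` steps, current point included), Euclidean distance, counting distinct attractors per memory size, and the normalization `1 - count / max(count)`. The function names `neighbor_order`, `tourist_walk`, `attractor_counts`, and `outlier_scores` are hypothetical.

```python
import numpy as np


def neighbor_order(X):
    """For each point, indices of all other points by increasing distance."""
    X = np.asarray(X, dtype=float)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # push each point to the end of its own row
    return np.argsort(d, axis=1, kind="stable")[:, :-1]  # drop self


def tourist_walk(order, start, mu):
    """Run one deterministic walk and return its attractor (final cycle).

    Assumed rule: step to the nearest point not among the last `mu`
    visited points (the memory window includes the current point).
    """
    n = len(order)
    if not 1 <= mu < n:
        raise ValueError("memory size must satisfy 1 <= mu < number of points")
    path = [start]
    seen = {}  # memory window -> index in path where it first occurred
    while True:
        state = tuple(path[-mu:])
        if state in seen:
            # Same window twice: everything in between repeats forever.
            return path[seen[state]:-1]
        seen[state] = len(path) - 1
        recent = set(state)
        nxt = next(int(j) for j in order[path[-1]] if int(j) not in recent)
        path.append(nxt)


def attractor_counts(X, memory_sizes):
    """Count, per point, the distinct attractors it belongs to."""
    order = neighbor_order(X)
    n = len(order)
    found = set()
    for mu in memory_sizes:
        for start in range(n):  # walks are independent of each other
            found.add((mu, frozenset(tourist_walk(order, start, mu))))
    counts = np.zeros(n, dtype=int)
    for _, members in found:
        counts[list(members)] += 1
    return counts


def outlier_scores(X, memory_sizes):
    """Assumed score: fewer attractor memberships give a higher score."""
    counts = attractor_counts(X, memory_sizes)
    return 1.0 - counts / counts.max()
```

For example, `outlier_scores(X, memory_sizes=[1, 2, 3])` on an `(n, d)` array `X` returns one score per row, with values near 1 for points that appear in few attractors. The loop over starting points in `attractor_counts` has no shared state between walks, which is the property the abstract points to when it mentions parallel implementation.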
