4.7 Article

Spectral clustering algorithm using density-sensitive distance measure with global and local consistencies

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 170, Issue -, Pages 26-42

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2019.01.026

Keywords

Spectral clustering; Euclidean distance; Relative density sensitive term; Global and local consistencies; Robustness

Funding

  1. Fundamental Research Funds for the Central Universities [2572017EB02, 2572017CB07]
  2. Innovative talent fund of Harbin science and technology Bureau [2017RAXXJ018]
  3. Double first-class scientific research foundation of Northeast Forestry University [411112438]

Ask authors/readers for more resources

Spectral clustering algorithm (SC) has recently received great attention for its high performance in large-scale data clustering and simplicity in implementation. However, previous studies have demonstrated that the commonly used distance measures in SC cannot simultaneously consider global and local consistencies and are sensitive to various noises. As a result, the obtained similarity matrices are unable to capture the actual data structure and thus produce poor clustering results, especially when the data exhibits nonlinear and local manifold structures characteristics. In order to address those limitations, we present a spectral clustering algorithm using density-sensitive distance measure with global and local consistencies in this paper. in the presented algorithm, a novel manifold distance with exponential term and scaling factor is introduced as the pairwise similarity measure. By modifying its exponential term and scaling factor, we can flexibly adjust the ratio of the similarities between the data within the same manifold to the similarities between the data across different manifolds. This flexible adjustment is beneficial to the obtained similarity matrix more accurately reflecting the global and local consistencies of data structures. In addition, to eliminate the effect of noises on the clustering performance, we also incorporate the relative density sensitive term into the distance measure to take into account the local distribution characteristics of the data. Finally, to further improve clustering performance, we provide the SC-based k value determination method far k nearest neighbors (KNN) graph. in the experimental part, the effect of parameters on the performance of the proposed technique is discussed and some suggestions about the determination of the parameters are given. Theoretical analysis and experimental results on several synthetic datasets. UCI benchmark datasets and generated large MNIST handwritten digits datasets demonstrate that the proposed approach is superior to other existing spectral clustering techniques with good robustness. (C) 2019 Elsevier B.V. All rights reserved.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available