Self-adapted mixture distance measure for clustering uncertain data

KNOWLEDGE-BASED SYSTEMS (2017)

Journal

KNOWLEDGE-BASED SYSTEMS

Volume 126, Pages 33-47

Publisher

ELSEVIER

DOI: 10.1016/j.knosys.2017.04.002

Keywords

Clustering; Uncertain data; Induced kernel distance; Jensen-Shannon divergence; Self-adapted mixture distance measure

Funding

National Science Foundation of China [61272374, 61300190, 61428202]
National High Technology Research and Development Program (863 Program) of China [2015AA015403]

Abstract

Distance measures play an important role in clustering uncertain data. However, existing distance measures for clustering uncertain data have two shortcomings. Geometric distance measures cannot distinguish uncertain objects whose distributions differ but overlap heavily in location. Probability distribution distance measures cannot distinguish between different pairs of completely separated uncertain objects. In this paper, we propose a self-adapted mixture distance measure for clustering uncertain data that considers the geometric distance and the probability distribution distance simultaneously, thus overcoming the shortcomings of previous distance measures. The proposed distance measure consists of three parts: (1) the induced kernel distance, which measures the geometric distance between uncertain objects; (2) the Jensen-Shannon divergence, which measures the probability distribution distance between uncertain objects; and (3) the self-adapted weight parameter, which adjusts the relative importance of the induced kernel distance and the Jensen-Shannon divergence according to the location overlapping information of the dataset. The proposed distance measure is symmetric, finite and parameter adaptive. Furthermore, we integrate the self-adapted mixture distance measure into partition-based and density-based algorithms for clustering uncertain data. Extensive experimental results on synthetic datasets, real benchmark datasets and real-world uncertain datasets show that the proposed distance measure outperforms existing distance measures for clustering uncertain data. © 2017 Elsevier B.V. All rights reserved.
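The three-part structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' formulation: the abstract gives no formulas, so the code assumes one-dimensional uncertain objects represented as lists of samples, a Gaussian kernel for the induced kernel distance, histogram estimates for the Jensen-Shannon divergence, and a convex combination as the mixing rule. The names `kernel_distance`, `histogram`, `js_divergence` and `mixture_distance` are hypothetical, and `weight` is passed in by hand; the paper's rule for adapting it from location overlap is not reproduced here.

```python
import math


def kernel_distance(xs, ys, bandwidth=1.0):
    """Distance induced by a Gaussian kernel between two 1-D sample sets (assumed form)."""
    def mean_kernel(a, b):
        total = 0.0
        for u in a:
            for v in b:
                total += math.exp(-((u - v) ** 2) / (2.0 * bandwidth ** 2))
        return total / (len(a) * len(b))

    squared = mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
    return math.sqrt(max(squared, 0.0))  # clamp tiny negative rounding errors


def histogram(xs, lo, hi, bins):
    """Normalised histogram of xs over [lo, hi] with equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in xs:
        idx = min(int((x - lo) / width), bins - 1)  # x == hi goes in the last bin
        counts[idx] += 1
    return [c / len(xs) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def mixture_distance(xs, ys, weight, bins=10, bandwidth=1.0):
    """Convex combination of the two distances; weight in [0, 1] is supplied, not adapted."""
    lo = min(min(xs), min(ys))
    hi = max(max(xs), max(ys))
    if hi == lo:
        hi = lo + 1.0  # avoid zero-width bins when all samples coincide
    geometric = kernel_distance(xs, ys, bandwidth)
    distributional = js_divergence(histogram(xs, lo, hi, bins),
                                   histogram(ys, lo, hi, bins))
    return weight * geometric + (1.0 - weight) * distributional


# Two uncertain objects centred at the same location but with different spread.
narrow = [-0.2, -0.1, 0.0, 0.1, 0.2]
wide = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(mixture_distance(narrow, wide, weight=0.5))
```

Under these assumptions both components are bounded and symmetric, so the combined value is as well, which is consistent with the symmetric and finite properties the abstract claims. A lower `weight` shifts emphasis toward the distribution term, which is the direction the abstract suggests for heavily overlapping objects.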
