Article

An automatic three-way clustering method based on sample similarity

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s13042-020-01255-8

Keywords

Three-way decisions; Three-way clustering; Sample similarity

Funding

  1. National Natural Science Foundation of China [61773208, 61906090, 61876027, 71671086]
  2. Natural Science Foundation of Jiangsu Province [BK20191287]
  3. Natural Science Foundation of Anhui Province of China [1808085MF178]
  4. Fundamental Research Funds for the Central Universities [30920021131]
  5. China Postdoctoral Science Foundation [2018M632304]

Abstract

In this paper, an improved three-way clustering method is proposed, which introduces automatic selection of the optimal number of clusters and partition thresholds while retaining the advantages of traditional three-way clustering. Experimental results demonstrate the effectiveness of the proposed method.
Three-way clustering extends traditional clustering with the concept of a fringe region, which mitigates the inaccurate decisions that traditional two-way clustering makes when information is inaccurate or data are insufficient. Existing three-way clustering methods usually choose the number of clusters and the thresholds of the three-way partition by subjective tuning. Fixing these values in advance, however, cannot yield the optimal number of clusters and partition thresholds for data sets of different sizes and densities. To address this problem, this paper proposes an improved three-way clustering method. First, we define a roughness degree, based on sample similarity, to measure the uncertainty of the fringe region. Based on the roughness degree, we then define a novel partitioning validity index to evaluate clustering partitions and propose an automatic threshold selection method. Second, again building on sample similarity, we introduce the intra-class similarity and the inter-class similarity to describe quantitatively how the relationship between a sample and the clusters changes, and we integrate the two into a novel clustering validity index that measures clustering performance under different numbers of clusters. From this index we derive an automatic cluster number selection method. Finally, we combine the threshold selection method and the cluster number selection method into an automatic three-way clustering approach. Comparison experiments demonstrate the effectiveness of the proposed approach.
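
To make the core/fringe idea concrete, the following is a minimal sketch of a three-way partition driven by sample similarity. It is not the paper's algorithm: the Gaussian similarity, the use of mean similarity to a cluster's members, the threshold names `alpha` and `beta`, and the function names are all assumptions made for illustration, and the thresholds are fixed by hand here rather than selected automatically.

```python
import math


def similarity(x, y, sigma=1.0):
    """Gaussian similarity in (0, 1]. An assumed choice, not the paper's definition."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * sigma ** 2))


def three_way_partition(samples, labels, alpha, beta, sigma=1.0):
    """Split each hard cluster into a core region and a fringe region.

    Hypothetical rule: a sample joins core(C) if its mean similarity to the
    other members of C is >= alpha, joins fringe(C) if that mean lies in
    [beta, alpha), and is left out of C otherwise. A sample may therefore
    sit in the fringe of several clusters at once.
    """
    if not 0 <= beta < alpha <= 1:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")

    # Group sample indices by their hard cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    regions = {lab: {"core": [], "fringe": []} for lab in clusters}
    for i, x in enumerate(samples):
        for lab, members in clusters.items():
            others = [j for j in members if j != i]
            if not others:
                # Singleton cluster: its only member is trivially in the core.
                regions[lab]["core"].append(i)
                continue
            mean_sim = sum(similarity(x, samples[j], sigma) for j in others) / len(others)
            if mean_sim >= alpha:
                regions[lab]["core"].append(i)
            elif mean_sim >= beta:
                regions[lab]["fringe"].append(i)
    return regions


# Two tight groups plus one point halfway between them (hard-assigned to cluster 0).
samples = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (2.5, 2.5)]
labels = [0, 0, 0, 1, 1, 0]
regions = three_way_partition(samples, labels, alpha=0.7, beta=0.15, sigma=2.0)
print(regions)
```

With these hand-picked thresholds, the point midway between the two groups lands in the fringe of both clusters instead of being forced into one, which is the behavior the fringe region is meant to capture. Choosing `alpha`, `beta`, and the number of clusters automatically is the part the paper contributes and is not attempted in this sketch.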
