Article

Weighted samples based semi-supervised classification

Journal

APPLIED SOFT COMPUTING
Volume 79, Pages 46-58

Publisher

ELSEVIER
DOI: 10.1016/j.asoc.2019.03.005

Keywords

Semi-supervised classification; Graph optimization; Weighted samples; Hard-to-cluster index

Funding

  1. Natural Science Foundation of China [61872300, 61741217, 61873214]
  2. Fundamental Research Funds for the Central Universities [XDJK2019B024]
  3. Natural Science Foundation of CQ CSTC [cstc2018jcyjAX0228, cstc2016jcyjA0351]
  4. Open Research Project of The Hubei Key Laboratory of Intelligent Geo-Information Processing, China [KLIGIP-2017A05]
  5. Chongqing Graduate Student Research Innovation Project, China [CYS18089]


Graph-based semi-supervised classification (GSSC) treats labeled and unlabeled samples as vertices of a graph and uses edge weights to encode the similarity between samples. Most GSSC methods treat every labeled sample as equally important and focus mainly on optimizing the graph to improve performance. In practice, samples are not always evenly distributed: labeled samples close to the decision boundary between classes are generally more important than labeled samples far from it. To account for these differences in importance, we propose an approach called Weighted Samples based Semi-Supervised Classification (WS3C for short). First, WS3C runs multiple clusterings on the dataset to explore the structure of the samples and summarizes the clustering results. Second, it quantifies the hard-to-cluster index of each labeled sample with respect to the other samples based on the summarized results and uses the index to weight that sample. Third, it constructs a graph whose edge weights equal the frequency with which two samples are grouped into the same cluster across the multiple clusterings. Finally, it performs semi-supervised classification based on the constructed graph and the weighted samples. An empirical study on synthetic and real datasets demonstrates that assigning different weights to labeled samples significantly improves accuracy over treating all labeled samples equally. WS3C not only outperforms other related methods but is also robust to its input parameters. © 2019 Elsevier B.V. All rights reserved.
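
The four steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: plain k-means as the base clusterer, the quantity 4p(1-p) averaged over co-clustering frequencies p as a stand-in for the hard-to-cluster index, the weight 1 + index, and a weighted label-propagation iteration as the classifier. The function names (`kmeans_labels`, `co_association`, `hard_to_cluster`, `classify`) are hypothetical and do not come from the paper.

```python
import random


def kmeans_labels(points, k, rng, iters=20):
    """Plain Lloyd's k-means (assumed base clusterer); returns a cluster id per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def co_association(points, n_runs=20, ks=(2, 3, 4), seed=0):
    """Edge weight W[i][j] = fraction of clusterings that put i and j together."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for run in range(n_runs):
        labels = kmeans_labels(points, ks[run % len(ks)], rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[counts[i][j] / n_runs for j in range(n)] for i in range(n)]


def hard_to_cluster(W, i):
    """ASSUMED stand-in index: mean of 4*p*(1-p) over frequencies p to other samples.

    It is 0 when sample i is always or never grouped with each other sample,
    and approaches 1 when its co-clustering is maximally ambiguous.
    """
    n = len(W)
    return sum(4 * W[i][j] * (1 - W[i][j]) for j in range(n) if j != i) / (n - 1)


def classify(W, labeled, n_classes, iters=200):
    """ASSUMED classifier: weighted label propagation on the graph W.

    `labeled` maps sample index -> class id. Each labeled sample pulls its own
    score toward its label with strength 1 + hard_to_cluster index.
    """
    n = len(W)
    weight = {i: 1.0 + hard_to_cluster(W, i) for i in labeled}
    F = [[0.0] * n_classes for _ in range(n)]
    for _ in range(iters):
        new_F = []
        for i in range(n):
            # Weighted average of neighbours' scores, plus the label term if labeled.
            num = [sum(W[i][j] * F[j][c] for j in range(n) if j != i) for c in range(n_classes)]
            den = sum(W[i][j] for j in range(n) if j != i)
            if i in labeled:
                num[labeled[i]] += weight[i]
                den += weight[i]
            new_F.append([v / den if den > 0 else 0.0 for v in num])
        F = new_F
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]
```

A call such as `classify(co_association(points), {0: 0, 5: 1}, 2)` would label every sample from two labeled ones. In this sketch, labeled samples whose cluster membership varies across runs receive larger weights, which is one plausible reading of "close to the decision boundary"; the paper's actual index and objective may differ.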

