4.6 Article

Performance Evaluation of Missing-Value Imputation Clustering Based on a Multivariate Gaussian Mixture Model

Journal

PLOS ONE
Volume 11, Issue 8, Pages -

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pone.0161112

Keywords

-

Funding

  1. National Natural Science Foundation of China [31000539, 31391632, 91535103]
  2. Priority Academic Program Development of Jiangsu Higher Education Institutions
  3. National High-tech R&D Program (863 Program) [2014AA10A601-5]
  4. Natural Science Foundations of Jiangsu Province [BK20150010]
  5. Natural Science Foundation of the Jiangsu Higher Education Institutions [14KJA210005]

Background

It is challenging to deal with mixture models when missing values occur in clustering datasets.

Methods and Results

We propose a dynamic clustering algorithm based on a multivariate Gaussian mixture model that efficiently imputes missing values to generate a pseudo-complete dataset. Parameters from different clusters and missing values are estimated according to the maximum likelihood implemented with an expectation-maximization algorithm, and multivariate individuals are clustered with Bayesian posterior probability. A simulation showed that our proposed method has a fast convergence speed and it accurately estimates missing values. Our proposed algorithm was further validated with Fisher's Iris dataset, the Yeast Cell-cycle Gene-expression dataset, and the CIFAR-10 images dataset. The results indicate that our algorithm offers highly accurate clustering, comparable to that using a complete dataset without missing values. Furthermore, our algorithm resulted in a lower misjudgment rate than both clustering algorithms with missing data deleted and with missing-value imputation by mean replacement.

Conclusion

We demonstrate that our missing-value imputation clustering algorithm is feasible and superior to both of these other clustering algorithms in certain situations.
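
The general idea summarized in the abstract (expectation-maximization for a Gaussian mixture, imputation of missing entries, and assignment by posterior probability) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the textbook EM treatment of missing values, in which each missing block is replaced by its conditional mean given the observed block, and it assumes every row has at least one observed value. The function name `fit_gmm_impute` and the parameters `n_iter`, `ridge`, and `seed` are hypothetical, as is the farthest-point initialization.

```python
import numpy as np

def fit_gmm_impute(X, k, n_iter=50, ridge=1e-6, seed=0):
    """EM for a Gaussian mixture on data containing NaNs (illustrative sketch).

    Returns hard cluster labels and a pseudo-complete copy of X.
    Assumes each row has at least one observed value.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    miss = np.isnan(X)
    # Initialization (an assumption): mean-filled data, farthest-point means,
    # one shared covariance, uniform mixing weights.
    filled = np.where(miss, np.nanmean(X, axis=0), X)
    centers = [filled[rng.integers(n)]]
    for _ in range(k - 1):
        d2 = np.min([((filled - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(filled[d2.argmax()])
    mu = np.array(centers)
    sigma = np.stack([np.cov(filled.T) + ridge * np.eye(d)] * k)
    pi = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: log posterior from the observed coordinates only, plus the
        # conditional mean and conditional covariance of the missing block.
        log_r = np.zeros((n, k))
        xhat = np.repeat(filled[:, None, :], k, axis=1)  # shape (n, k, d)
        corr = np.zeros((n, k, d, d))
        for i in range(n):
            o, m = ~miss[i], miss[i]
            for j in range(k):
                s_oo = sigma[j][np.ix_(o, o)]
                diff = X[i, o] - mu[j, o]
                sol = np.linalg.solve(s_oo, diff)
                _, logdet = np.linalg.slogdet(s_oo)
                log_r[i, j] = np.log(pi[j]) - 0.5 * (
                    o.sum() * np.log(2 * np.pi) + logdet + diff @ sol)
                if m.any():
                    s_mo = sigma[j][np.ix_(m, o)]
                    xhat[i, j, m] = mu[j, m] + s_mo @ sol
                    corr[i, j][np.ix_(m, m)] = (
                        sigma[j][np.ix_(m, m)]
                        - s_mo @ np.linalg.solve(s_oo, s_mo.T))
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: weighted updates; the conditional covariance term accounts
        # for the uncertainty in the imputed entries.
        nk = r.sum(axis=0) + 1e-12
        pi = nk / nk.sum()
        for j in range(k):
            mu[j] = r[:, j] @ xhat[:, j] / nk[j]
            dev = xhat[:, j] - mu[j]
            scatter = (r[:, j, None] * dev).T @ dev
            scatter += np.einsum("i,iab->ab", r[:, j], corr[:, j])
            sigma[j] = scatter / nk[j] + ridge * np.eye(d)

    labels = r.argmax(axis=1)  # assignment by largest posterior probability
    # Missing entries become the posterior-weighted conditional means.
    blended = np.einsum("ij,ijd->id", r, xhat)
    imputed = np.where(miss, blended, X)
    return labels, imputed
```

Calling `labels, imputed = fit_gmm_impute(X, k=2)` on an array `X` with `np.nan` in the missing positions returns one label per row and a copy of `X` with the gaps filled in. The nested Python loops keep the sketch readable but scale poorly; a practical version would group rows by missingness pattern.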
