4.7 Article

An Effective and Efficient Algorithm for K-Means Clustering With New Formulation

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Volume 35, Issue 4, Pages 3433-3443

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2022.3155450

Keywords

Clustering; K-means; optimization; re-weighted

Ask authors/readers for more resources

K-means is a simple and popular clustering algorithm, widely used in machine learning research. It aims to find cluster centers and minimize the squared distances between samples and their nearest centers. In this paper, a novel K-means algorithm is proposed, which reformulates the objective function and introduces an efficient iterative re-weighted algorithm for optimization. The algorithm shows faster convergence rate compared to the Lloyd's algorithm, while maintaining the same computational complexity. Experimental results on benchmark datasets demonstrate the effectiveness and efficiency of the proposed algorithm.
K-means is one of the most simple and popular clustering algorithms, which implemented as a standard clustering method in most of machine learning researches. The goal of K-means clustering is finding a set of cluster centers and minimizing the sum of squared distances between each sample and its nearest clustering center. In this paper, we proposed a novel K-means clustering algorithm, which reformulate the classical K-Means objective function as a trace maximization problem and then replace it with a new formulation. The proposed algorithm does not need to calculate the cluster centers in each iteration and requires fewer additional intermediate variables during the optimization process. In addition, we proposed an efficient iterative re-weighted algorithm to solve the involved optimization problem and provided the corresponding convergence analysis. The proposed algorithm keeps a consistent computational complexity as Lloyd's algorithm, $\mathcal {O}(ndk)$O(ndk), but shows a faster convergence rate in experiments. Extensive experimental results on real world benchmark datasets show the effectiveness and efficiency of the proposed algorithm.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available