Article
Computer Science, Artificial Intelligence
Mustafa Jahangoshai Rezaee, Milad Eshkevari, Morteza Saberi, Omar Hussain
Summary: This paper introduces a game-based k-means (GBK-means) algorithm in which cluster centers compete to attract data points, yielding more accurate clustering. Experimental results demonstrate the superiority of GBK-means over traditional clustering algorithms.
KNOWLEDGE-BASED SYSTEMS
(2021)
Article
Computer Science, Artificial Intelligence
Shuyin Xia, Daowan Peng, Deyu Meng, Changqing Zhang, Guoyin Wang, Elisabeth Giem, Wei Wei, Zizhong Chen
Summary: This paper presents a novel accelerated exact k-means algorithm called Ball k-means, which uses a ball to describe each cluster. The algorithm reduces the number of point-centroid distance computations by finding the neighbor clusters of each cluster. It divides each cluster into a stable area and an active area, with the latter further divided into annular areas. Points in the stable area keep their assignment, while points in each annular area are reassigned only among a few neighbor clusters. Ball k-means achieves higher performance and requires fewer distance calculations than other state-of-the-art accelerated exact bounded methods, making it a versatile replacement for the naive k-means algorithm.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2022)
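The stable-area idea in the Ball k-means summary above can be illustrated with the triangle inequality. The following is a minimal sketch, not the authors' implementation: the function name `stable_mask`, its arguments, and the specific thresholds (a neighbor is any centroid closer than twice the cluster radius; a point is stable if it lies closer to its centroid than half the distance to the nearest neighbor centroid) are assumptions chosen for illustration. The annular subdivision of the active area is omitted, and the cluster is assumed to be non-empty.

```python
import numpy as np

def stable_mask(points, centroids, labels, cluster):
    """Mark members of `cluster` that provably cannot change cluster.

    Hypothetical helper for illustration; assumes the cluster is non-empty.
    Returns (stable, neighbors): a boolean mask over the cluster's members
    and the indices of the neighbor clusters.
    """
    members = points[labels == cluster]
    c = centroids[cluster]
    dist_to_c = np.linalg.norm(members - c, axis=1)
    radius = dist_to_c.max()  # radius of the ball describing the cluster

    # Distance from this centroid to every other centroid.
    d = np.linalg.norm(centroids - c, axis=1)
    d[cluster] = np.inf

    # Only centroids closer than 2 * radius can be nearer to some member.
    neighbors = np.flatnonzero(d < 2 * radius)
    if neighbors.size == 0:
        return np.ones(len(members), dtype=bool), neighbors

    # By the triangle inequality, a member closer to c than half the gap
    # to the nearest neighbor centroid is closer to c than to any other.
    half_gap = d[neighbors].min() / 2
    return dist_to_c < half_gap, neighbors
```

Only the members where the mask is false would need their distances to the neighbor centroids recomputed in an assignment step.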
Article
Computer Science, Artificial Intelligence
Imran Khan, Zongwei Luo, Abdul Khalique Shaikh, Rachid Hedjam
Summary: This paper proposes a new ensemble clustering method that adds two steps to the standard fuzzy k-means algorithm to determine the optimal number of input clusterings and the optimal number of clusters in each clustering. Experiments show that the proposed algorithm outperforms well-known clustering algorithms on real cancer gene expression profiles.
EXPERT SYSTEMS WITH APPLICATIONS
(2021)
Article
Computer Science, Artificial Intelligence
Carlo Baldassi
Summary: We introduce an evolutionary algorithm called recombinator-k-means for optimizing the highly nonconvex k-means problem. Its defining feature is that its crossover step involves all the members of the current generation, stochastically recombining them with a repurposed variant of the k-means++ seeding algorithm. The recombination also uses a reweighting mechanism that realizes a progressively sharper stochastic selection policy and ensures that the population eventually coalesces into a single solution. We compare this scheme with a state-of-the-art alternative, a more standard genetic algorithm with deterministic pairwise-nearest-neighbor crossover and an elitist selection policy, of which we also provide an augmented and efficient implementation. Extensive tests on large and challenging datasets (both synthetic and real-world) show that for fixed population sizes recombinator-k-means is generally superior in terms of the optimization objective, at the cost of a more expensive crossover step. When adjusting the population sizes of the two algorithms to match their running times, we find that for short times the (augmented) pairwise-nearest-neighbor method is always superior, while at longer times recombinator-k-means will match it and, on the most difficult examples, take over. We conclude that the reweighted whole-population recombination is more costly but generally better at escaping local minima. Moreover, it is algorithmically simpler and more general (it could be applied even to k-medians or k-medoids, for example).
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION
(2022)
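The recombinator-k-means summary above builds on k-means++ seeding. As background, here is a minimal sketch of standard k-means++ seeding with optional per-point weights. It is not the paper's repurposed variant or its reweighting mechanism; the function name `kmeanspp_seed` and the `weights` parameter are assumptions for illustration, and the sketch assumes the data contain at least `k` distinct points with positive weight.

```python
import numpy as np

def kmeanspp_seed(points, k, weights=None, rng=None):
    """Pick k initial centers from `points` using D^2-weighted sampling.

    Hypothetical helper; `rng` may be None, an int seed, or a Generator.
    """
    rng = np.random.default_rng(rng)
    n = len(points)
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)

    # First center: sampled in proportion to the weights alone.
    first = rng.choice(n, p=w / w.sum())
    centers = [points[first]]

    # Squared distance from every point to its nearest chosen center.
    d2 = np.sum((points - points[first]) ** 2, axis=1)
    for _ in range(1, k):
        # Later centers: sampled in proportion to weight * squared distance,
        # so points far from all current centers are favored.
        probs = w * d2
        idx = rng.choice(n, p=probs / probs.sum())
        centers.append(points[idx])
        d2 = np.minimum(d2, np.sum((points - points[idx]) ** 2, axis=1))
    return np.array(centers)
```

Because an already chosen point has squared distance zero, it has zero probability of being drawn again, so the returned centers are distinct whenever the stated assumption holds.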
Article
Multidisciplinary Sciences
Yilin Wan, Qi Xiong, Zhiwei Qiu, Yaohan Xie
Summary: This paper proposes a data clustering approach based on MCSSA, which initializes the centroid positions using a memristive chaotic system and then applies the K-means algorithm for data clustering. Empirical results confirm the effectiveness and feasibility of the method.
Article
Engineering, Mechanical
Ricardo Manuel Arias Velasquez
Summary: The incorporation of Phasor Measurement Units (PMUs) has allowed the development of new tools for planning, failure analysis, and operation in Power Systems (PS), which increase safety and reliability. A methodology based on the k-means grouping technique has been developed in this research, dividing PS into control areas and identifying critical areas to avoid contingency stages.
ENGINEERING FAILURE ANALYSIS
(2021)
Article
Chemistry, Multidisciplinary
Hang Xin, Jingyun Zhang, Cuihong Yang, Yunyun Chen
Summary: This paper proposes an efficient method to analyze the inhomogeneity of 2D materials by combining Raman spectroscopy and unsupervised k-means clustering analysis. The method can reveal both the inhomogeneity and spatial distribution of 2D materials, and has been successfully applied to MoS2, WS2, and WSe2 samples.
Article
Multidisciplinary Sciences
Xuan Wang, Anyang Shen, Xin Hou, Lifeng Tan
Summary: This study focuses on the traditional fort-type settlements in Shaanxi, using quantitative research methods such as the K-means clustering algorithm and correlation analysis to study their spatial distribution and cluster characteristics. The results show that these settlements can be divided into three categories and that their overall distribution exhibits multi-point aggregation. The research approach can be applied to other settlement-related studies and also has implications for heritage conservation.
Article
Computer Science, Artificial Intelligence
Tianjiao Ni, Minghao Qiao, Zhili Chen, Shun Zhang, Hong Zhong
Summary: The paper introduces DP-KCCM, a novel differentially private k-means clustering algorithm that significantly improves clustering utility. The algorithm first generates initial centroids, then adds adaptive noise, and finally merges clusters to further improve utility.
Review
Computer Science, Information Systems
Abiodun M. Ikotun, Absalom E. Ezugwu, Laith Abualigah, Belal Abuhaija, Jia Heming
Summary: Advances in data collection techniques have enabled the accumulation of large quantities of data. The K-means algorithm, while popular, has challenges such as determining the number of clusters and detecting non-Euclidean shapes. Research efforts have been made to improve its performance and robustness.
INFORMATION SCIENCES
(2023)
Article
Computer Science, Information Systems
Ying Zhou
Summary: This paper studies news text clustering and proposes a news clustering algorithm based on improved K-Means. The algorithm is parallelized using the MapReduce programming model, and the results show that the parallelized TIM-K-Means algorithm has a good acceleration ratio and can meet the needs of processing massive data in the context of big data. The news clustering algorithm is of great significance in multidocument automatic summarization.
WIRELESS COMMUNICATIONS & MOBILE COMPUTING
(2022)
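The MapReduce parallelization mentioned in the summary above follows a common pattern for k-means: the map phase assigns each point to its nearest centroid, and the reduce phase averages the points of each cluster. The sketch below shows one iteration of plain k-means in that style, run in a single process. It is not the TIM-K-Means algorithm from the paper, and the function names (`nearest`, `map_phase`, `reduce_phase`, `kmeans_iteration`) are assumptions for illustration.

```python
from collections import defaultdict

def nearest(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )

def map_phase(chunk, centroids):
    """Emit (cluster_id, (point, 1)) for every point in one data chunk."""
    for point in chunk:
        yield nearest(point, centroids), (point, 1)

def reduce_phase(pairs, centroids):
    """Sum points and counts per cluster, then average into new centroids."""
    sums, counts = {}, defaultdict(int)
    for cid, (point, n) in pairs:
        acc = sums.setdefault(cid, [0.0] * len(point))
        for i, v in enumerate(point):
            acc[i] += v
        counts[cid] += n
    # Clusters that received no points keep their previous centroid.
    return [
        [s / counts[j] for s in sums[j]] if j in sums else list(centroids[j])
        for j in range(len(centroids))
    ]

def kmeans_iteration(chunks, centroids):
    """One k-means iteration over data split into chunks (one per mapper)."""
    pairs = (pair for chunk in chunks for pair in map_phase(chunk, centroids))
    return reduce_phase(pairs, centroids)
```

In a real MapReduce deployment each chunk would be processed by a separate mapper and the pairs shuffled by cluster id to the reducers; here the chunks are simply iterated in sequence.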
Article
Computer Science, Artificial Intelligence
Haize Hu, Jianxun Liu, Xiangping Zhang, Mengge Fang
Summary: This paper proposes a novel k-means clustering algorithm based on Lévy flight trajectories (Lk-means) to address the shortcomings of the traditional k-means algorithm. Experimental results show that Lk-means outperforms other algorithms in terms of search results and the distribution of cluster centroids, significantly improving the global search ability, big-data processing capacity, and evenness of the centroid distribution of the k-means algorithm.
PATTERN RECOGNITION
(2023)
Article
Computer Science, Information Systems
Qiumei Pu, Jingkai Gan, Lirong Qiu, Jiaxin Duan, Hui Wang
Summary: This article proposes an optimization algorithm, HPSO, that makes comprehensive improvements to the PSO algorithm. In experiments, it shows good stability, clustering effectiveness, robustness, and global search ability.
MULTIMEDIA TOOLS AND APPLICATIONS
(2022)
Article
Energy & Fuels
Jian Sun, Jian Xu, Deping Ke, Siyang Liao, Zaixun Ling
Summary: This paper proposes an improved two-step k-means algorithm for the cluster partitioning of a Regional Integrated Energy System, aiming to make effective use of distributed energy resources.
Article
Chemistry, Analytical
Mei Wu, Zhengliang Li, Jing Chen, Qiusha Min, Tao Lu
Summary: This paper proposes a dual cluster-head energy-efficient algorithm (DCK-LEACH) based on K-means and Canopy optimization to reduce the energy consumption of wireless sensor networks (WSNs) and extend network lifetime. The algorithm achieves energy and load balance by clustering with a dynamic Canopy algorithm and the K-means algorithm, and by selecting cluster heads using a hierarchy.