Article
Multidisciplinary Sciences
Juan J. Morrone, Tania Escalante, Gerardo Rodriguez-Tapia, Aylin Carmona, Marcelo Arana, Jorge D. Mercado-Gomez
Summary: This study provides a map and shapefile of the 57 biogeographic provinces in the Neotropical region. The provinces are recognized based on their endemic species, and their delimitation on the map takes into account climatic, geological, and biotic criteria. The provinces belong to different subregions and transition zones in the region.
ANAIS DA ACADEMIA BRASILEIRA DE CIENCIAS
(2022)
Article
Computer Science, Artificial Intelligence
Carlo Baldassi
Summary: We introduce an evolutionary algorithm called recombinator-k-means for optimizing the highly nonconvex k-means problem. Its defining feature is that its crossover step involves all the members of the current generation, stochastically recombining them with a repurposed variant of the k-means++ seeding algorithm. The recombination also uses a reweighting mechanism that realizes a progressively sharper stochastic selection policy and ensures that the population eventually coalesces into a single solution. We compare this scheme with a state-of-the-art alternative, a more standard genetic algorithm with deterministic pairwise-nearest-neighbor crossover and an elitist selection policy, of which we also provide an augmented and efficient implementation. Extensive tests on large and challenging datasets (both synthetic and real world) show that for fixed population sizes recombinator-k-means is generally superior in terms of the optimization objective, at the cost of a more expensive crossover step. When adjusting the population sizes of the two algorithms to match their running times, we find that for short times the (augmented) pairwise-nearest-neighbor method is always superior, while at longer times recombinator-k-means will match it and, on the most difficult examples, take over. We conclude that the reweighted whole-population recombination is more costly but generally better at escaping local minima. Moreover, it is algorithmically simpler and more general (it could be applied even to k-medians or k-medoids, for example).
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION
(2022)
Article
Computer Science, Artificial Intelligence
Yi-Cheng Chen, Yen-Liang Chen, Jyun-Yun Lu
Summary: The K-Means algorithm is one of the most popular clustering algorithms, known for its simple structure, easy implementation, high efficiency, and fast convergence. This article improves on past variants of K-Means used in evolutionary clustering by considering both past and future clustering results and by extending K-Means to multiple cycles, which yields more consistent, stable, and smooth clustering results.
EXPERT SYSTEMS WITH APPLICATIONS
(2021)
Article
Automation & Control Systems
Uri Stemmer
Summary: This research presents a new algorithm operating in the local model of differential privacy for solving the Euclidean k-means problem, significantly reducing additive error while maintaining multiplicative error. The study shows that the obtained additive error in handling the k-means objective is almost optimal in terms of its dependency on the database size.
JOURNAL OF MACHINE LEARNING RESEARCH
(2021)
Article
Environmental Sciences
Wenhao Zhao, Jin Ma, Qiyuan Liu, Jing Song, Mats Tysklind, Chengshuai Liu, Dong Wang, Yajing Qu, Yihang Wu, Fengchang Wu
Summary: The study found that soil attributes and their environmental drivers show different patterns and distinct regional characteristics in different geographical directions, which have important effects on substance migration and transformation and the environmental impacts of pollutants. However, there is no comprehensive evaluation or systematic classification of the natural soil environment in China.
ENVIRONMENTAL RESEARCH
(2023)
Article
Computer Science, Information Systems
Lin Yang, Xinming Li, Qinye Yang, Lei Zhang, Shujie Zhang, Shaohong Wu, Chenghu Zhou
Summary: This study proposes a method to extract knowledge from legacy area-class maps to formulate fuzzy membership functions for regionalization. The buffer zone approach effectively reduces uncertainties of boundaries between eco-region units on area-class maps, recommending a buffer distance of 10-15 km. The climatic zone map generated based on extracted fuzzy membership functions shows higher spatial stratification heterogeneity compared to the original map.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE
(2021)
Article
Public, Environmental & Occupational Health
Peter Lowenberg-Neto, Stephanie Winkelmann, Agatha K. Verzotto
Summary: The study aimed to determine a biogeographic regionalization for human infectious diseases in Brazil and investigate the hypotheses predicting the observed regions. Through analysis and model evaluation, the study found a discernible latitudinal pattern in the turnover of diseases in Brazil, which is associated with a complex interplay between contemporary climate, population activity, and land cover.
TROPICAL MEDICINE & INTERNATIONAL HEALTH
(2023)
Article
Computer Science, Interdisciplinary Applications
Ahmed Fahim
Summary: The k-means method divides N objects into k clusters based on mean values. It runs in linear time but depends on knowing the number of clusters and the initial centers in advance. This research introduces a method that detects near-optimal values for k and the initial centers without prior knowledge, improving the quality of the final result. The proposed method combines DBSCAN and k-means to converge to global minima and has a time complexity of O(n log n).
JOURNAL OF COMPUTATIONAL SCIENCE
(2021)
Article
Remote Sensing
Ruojing Zhang, Yuehong Chen, Xiaoxiang Zhang, Qiang Ma, Liliang Ren
Summary: This paper proposes a flash flood regionalization approach using machine learning algorithms and conducts a case study in Jiangxi province, China. The generated flash flood regionalization map consists of eighteen homogeneous regions. The results show that the map can provide a 77.31% determinant power for the spatial distribution of historical flash flood events, which is beneficial for future flash flood mitigation and prevention in Jiangxi province.
INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION
(2022)
Article
Computer Science, Information Systems
Jing Liu, Fuyuan Cao, Jiye Liang
Summary: In this paper, a centroids-guided deep multi-view k-means clustering method is proposed, which incorporates deep representation learning into the multi-view k-means objective. The method produces more k-means-friendly representations by reducing the loss between each representation and its assigned cluster centroid.
INFORMATION SCIENCES
(2022)
Article
Computer Science, Artificial Intelligence
Hongfu Liu, Junxiang Chen, Jennifer Dy, Yun Fu
Summary: K-means is a widely used clustering algorithm known for its simplicity and efficiency. This review paper focuses on generalizing K-means to solve challenging and complex problems. It unifies the available approaches in terms of data representation, distance measure, label assignment, and centroid updating. Concrete applications of modified K-means formulations are reviewed, including iterative subspace projection and clustering, consensus clustering, constrained clustering, domain adaptation, and outlier detection.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Computer Science, Artificial Intelligence
Avgoustinos Vouros, Stephen Langdell, Mike Croucher, Eleni Vasilaki
Summary: K-Means is a widely used algorithm for data clustering, but it has limitations such as only finding local minima and being sensitive to initial centroid positions. Various K-Means variations and initialization techniques have been proposed, with more sophisticated techniques reducing the need for complex clustering methods. Deterministic methods generally outperform stochastic methods, but there is a trade-off where simpler stochastic methods run multiple times can result in better clustering.
Article
Computer Science, Artificial Intelligence
Luc Giffon, Valentin Emiya, Hachem Kadri, Liva Ralaivola
Summary: The K-means algorithm and Lloyd's algorithm have expanded beyond their original clustering purposes to play pivotal roles in various machine learning and data analysis techniques. QuicK-means is an efficient extension of K-means that reduces computational complexity through sparse matrix products; experimental results demonstrate its benefits.
Article
Computer Science, Artificial Intelligence
Peter Olukanmi, Fulufhelo Nelwamondo, Tshilidzi Marwala
Summary: A key drawback of the k-means algorithm is its susceptibility to local minima. The authors propose a technique for comparing initializations directly and selecting the best one based on the maximum minimum inter-center distance. The experiments and mathematical analysis show significant efficiency gains and improved accuracy compared to repeated k-means.
NEURAL COMPUTING & APPLICATIONS
(2022)
Article
Computer Science, Artificial Intelligence
Marco Capo, Aritz Perez, Jose A. Antonio
Summary: The K-means algorithm is a popular clustering method, but its performance depends heavily on the initialization phase. Researchers have developed various initialization techniques to address this issue. This article introduces a cost-effective Split-Merge step that can restart the K-means algorithm after reaching a fixed point, reducing error and computing fewer distances.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2022)
Article
Ecology
Tiago S. Vasconcelos, Bruno T. M. do Nascimento, Vitor H. M. Prado
ECOLOGY AND EVOLUTION
(2018)
Article
Multidisciplinary Sciences
Rafael Molina-Venegas, Sonia Llorente-Culebras, Paloma Ruiz-Benito, Miguel A. Rodriguez
Article
Biodiversity Conservation
Tiago S. Vasconcelos, Vitor H. M. Prado
JOURNAL FOR NATURE CONSERVATION
(2019)
Article
Biology
Joaquin Calatayud, Miguel Angel Rodriguez, Rafael Molina-Venegas, Maria Leo, Jose Luis Horreo, Joaquin Hortal
PROCEEDINGS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES
(2019)
Article
Ecology
Ignacio Morales-Castilla, T. Jonathan Davies, Miguel A. Rodriguez
JOURNAL OF BIOGEOGRAPHY
(2020)
Article
Multidisciplinary Sciences
Helen R. P. Phillips, Carlos A. Guerra, Marie L. C. Bartz, Maria J. I. Briones, George Brown, Thomas W. Crowther, Olga Ferlian, Konstantin B. Gongalsky, Johan van den Hoogen, Julia Krebs, Alberto Orgiazzi, Devin Routh, Benjamin Schwarz, Elizabeth M. Bach, Joanne Bennett, Ulrich Brose, Thibaud Decaens, Birgitta Koenig-Ries, Michel Loreau, Jerome Mathieu, Christian Mulder, Wim H. van der Putten, Kelly S. Ramirez, Matthias C. Rillig, David Russell, Michiel Rutgers, Madhav P. Thakur, Franciska T. de Vries, Diana H. Wall, David A. Wardle, Miwa Arai, Fredrick O. Ayuke, Geoff H. Baker, Robin Beausejour, Jose C. Bedano, Klaus Birkhofer, Eric Blanchart, Bernd Blossey, Thomas Bolger, Robert L. Bradley, Mac A. Callaham, Yvan Capowiez, Mark E. Caulfield, Amy Choi, Felicity V. Crotty, Andrea Davalos, Dario J. Diaz Cosin, Anahi Dominguez, Andres Esteban Duhour, Nick van Eekeren, Christoph Emmerling, Liliana B. Falco, Rosa Fernandez, Steven J. Fonte, Carlos Fragoso, Andre L. C. Franco, Martine Fugere, Abegail T. Fusilero, Shaieste Gholami, Michael J. Gundale, Monica Gutierrez Lopez, Davorka K. Hackenberger, Luis M. Hernandez, Takuo Hishi, Andrew R. Holdsworth, Martin Holmstrup, Kristine N. Hopfensperger, Esperanza Huerta Lwanga, Veikko Huhta, Tunsisa T. Hurisso, Basil V. Iannone, Madalina Iordache, Monika Joschko, Nobuhiro Kaneko, Radoslava Kanianska, Aidan M. Keith, Courtland A. Kelly, Maria L. Kernecker, Jonatan Klaminder, Armand W. Kone, Yahya Kooch, Sanna T. Kukkonen, H. Lalthanzara, Daniel R. Lammel, Iurii M. Lebedev, Yiqing Li, Juan B. Jesus Lidon, Noa K. Lincoln, Scott R. Loss, Raphael Marichal, Radim Matula, Jan Hendrik Moos, Gerardo Moreno, Alejandro Moron-Rios, Bart Muys, Johan Neirynck, Lindsey Norgrove, Marta Novo, Visa Nuutinen, Victoria Nuzzo, Mujeeb P. Rahman, Johan Pansu, Shishir Paudel, Guenola Peres, Lorenzo Perez-Camacho, Raul Pineiro, Jean-Francois Ponge, Muhammad Imtiaz Rashid, Salvador Rebollo, Javier Rodeiro-Iglesias, Miguel A. Rodriguez, Alexander M. Roth, Guillaume X. Rousseau, Anna Rozen, Ehsan Sayad, Loes van Schaik, Bryant C. Scharenbroch, Michael Schirrmann, Olaf Schmidt, Boris Schroeder, Julia Seeber, Maxim P. Shashkov, Jaswinder Singh, Sandy M. Smith, Michael Steinwandter, Jose A. Talavera, Dolores Trigo, Jiro Tsukamoto, Anne W. de Valenca, Steven J. Vanek, Inigo Virto, Adrian A. Wackett, Matthew W. Warren, Nathaniel H. Wehr, Joann K. Whalen, Michael B. Wironen, Volkmar Wolters, Irina V. Zenkova, Weixin Zhang, Erin K. Cameron, Nico Eisenhauer
Article
Biodiversity Conservation
Luciano Pataro, Rafael Molina-Venegas, Joaquin Calatayud, Juan Carlos Moreno-Saiz, Miguel A. Rodriguez
Summary: Classical bioregionalizations may have underestimated the importance of historical factors in shaping biogeographic regions, particularly for lineages with long evolutionary histories like ferns. A new method based on phylogenetic relatedness was used to define six distinct fern phyloregions in Europe, revealing a primary divide between northeastern and southwestern Europe. The study highlights the preference of ancient fern lineages for northern latitudes and sheds light on the evolutionary history of the group, providing a fresh regional delineation.
BIODIVERSITY AND CONSERVATION
(2021)
Article
Multidisciplinary Sciences
Alje van Dam, Mark Dekker, Ignacio Morales-Castilla, Miguel A. Rodriguez, David Wichmann, Mara Baudena
Summary: This article reexamines correspondence analysis (CA), a classical method for revealing structures in high-dimensional data, from a network perspective. The little-known equivalence of CA to spectral clustering and graph embedding techniques is presented. The multiple interpretations of CA results, beyond its traditional interpretation as an ordination technique, are discussed, with emphasis on their relation to the underlying structure of networks.
SCIENTIFIC REPORTS
(2021)
Article
Ecology
Rafael Molina-Venegas, Miguel A. Rodriguez, Manuel Pardo-de-Santayana, Cristina Ronquillo, David J. Mabberley
Summary: The divergent nature of evolution may require relying on different lineages of the Tree of Life to secure the human benefits directly provided by biodiversity, but quantitative evidence for this has been lacking. A global review of plant-use records shows that maximum levels of phylogenetic diversity capture significantly more plant-use records than random selection of taxa. This study establishes an empirical foundation linking evolutionary history to human wellbeing, serving as a basis for promoting well-grounded discussions on services directly provided by biodiversity.
NATURE ECOLOGY & EVOLUTION
(2021)
Article
Ecology
Ignacio Ramos-Gutierrez, Herlander Lima, Santiago Pajaron, Carlos Romero-Zarco, Llorenc Saez, Luciano Pataro, Rafael Molina-Venegas, Miguel A. Rodriguez, Juan Carlos Moreno-Saiz
Summary: The study compiled a comprehensive species list of the Iberian-Balearic terrestrial vascular flora and generated AFLIBER, an accurate floristic database of more than 1.8 million georeferenced plant occurrence records. The spatial scope covers the western Mediterranean at a resolution of 10 km UTM grid cells, including the inland territories of Spain, Portugal, and Andorra as well as adjacent archipelagos. The database holds reliable distribution records dating mostly from the 1960s onwards and focuses on terrestrial vascular plant species and subspecies.
GLOBAL ECOLOGY AND BIOGEOGRAPHY
(2021)
Article
Multidisciplinary Sciences
Rafael Molina-Venegas, Miguel Angel Rodriguez, Manuel Pardo-de-Santayana, David J. Mabberley
Summary: This study compiled plant-use records for 13489 genera based on information from Mabberley's plant-book. Plant uses were classified into 28 standard categories including human and animal nutrition, materials, fuels, medicine, poisons, social, and environmental uses. Of the taxa included, 33% were assigned to at least one category, with ornamental use being the most common (26%), followed by medicine (16%), human food (13%), and timber (8%).
Article
Multidisciplinary Sciences
Helen R. P. Phillips, Elizabeth M. Bach, Marie L. C. Bartz, Joanne M. Bennett, Remy Beugnon, Maria J. I. Briones, George G. Brown, Olga Ferlian, Konstantin B. Gongalsky, Carlos A. Guerra, Birgitta Koenig-Ries, Julia J. Krebs, Alberto Orgiazzi, Kelly S. Ramirez, David J. Russell, Benjamin Schwarz, Diana H. Wall, Ulrich Brose, Thibaud Decaens, Patrick Lavelle, Michel Loreau, Jerome Mathieu, Christian Mulder, Wim H. van der Putten, Matthias C. Rillig, Madhav P. Thakur, Franciska T. de Vries, David A. Wardle, Christian Ammer, Sabine Ammer, Miwa Arai, Fredrick O. Ayuke, Geoff H. Baker, Dilmar Baretta, Dietmar Barkusky, Robin Beausejour, Jose C. Bedano, Klaus Birkhofer, Eric Blanchart, Bernd Blossey, Thomas Bolger, Robert L. Bradley, Michel Brossard, James C. Burtis, Yvan Capowiez, Timothy R. Cavagnaro, Amy Choi, Julia Clause, Daniel Cluzeau, Anja Coors, Felicity V. Crotty, Jasmine M. Crumsey, Andrea Davalos, Dario J. Diaz Cosin, Annise M. Dobson, Anahi Dominguez, Andres Esteban Duhour, Nick van Eekeren, Christoph Emmerling, Liliana B. Falco, Rosa Fernandez, Steven J. Fonte, Carlos Fragoso, Andre L. C. Franco, Abegail Fusilero, Anna P. Geraskina, Shaieste Gholami, Grizelle Gonzalez, Michael J. Gundale, Monica Gutierrez Lopez, Branimir K. Hackenberger, Davorka K. Hackenberger, Luis M. Hernandez, Jeff R. Hirth, Takuo Hishi, Andrew R. Holdsworth, Martin Holmstrup, Kristine N. Hopfensperger, Esperanza Huerta Lwanga, Veikko Huhta, Tunsisa T. Hurisso, Basil V. Iannone, Madalina Iordache, Ulrich Irmler, Mari Ivask, Juan B. Jesus, Jodi L. Johnson-Maynard, Monika Joschko, Nobuhiro Kaneko, Radoslava Kanianska, Aidan M. Keith, Maria L. Kernecker, Armand W. Kone, Yahya Kooch, Sanna T. Kukkonen, H. Lalthanzara, Daniel R. Lammel, Iurii M. Lebedev, Edith Le Cadre, Noa K. Lincoln, Danilo Lopez-Hernandez, Scott R. Loss, Raphael Marichal, Radim Matula, Yukio Minamiya, Jan Hendrik Moos, Gerardo Moreno, Alejandro Moron-Rios, Hasegawa Motohiro, Bart Muys, Johan Neirynck, Lindsey Norgrove, Marta Novo, Visa Nuutinen, Victoria Nuzzo, P. Mujeeb Rahman, Johan Pansu, Shishir Paudel, Guenola Peres, Lorenzo Perez-Camacho, Jean-Francois Ponge, Joerg Prietzel, Irina B. Rapoport, Muhammad Imtiaz Rashid, Salvador Rebollo, Miguel A. Rodriguez, Alexander M. Roth, Guillaume X. Rousseau, Anna Rozen, Ehsan Sayad, Loes van Schaik, Bryant Scharenbroch, Michael Schirrmann, Olaf Schmidt, Boris Schroeder, Julia Seeber, Maxim P. Shashkov, Jaswinder Singh, Sandy M. Smith, Michael Steinwandter, Katalin Szlavecz, Jose Antonio Talavera, Dolores Trigo, Jiro Tsukamoto, Sheila Uribe-Lopez, Anne W. de Valenca, Inigo Virto, Adrian A. Wackett, Matthew W. Warren, Emily R. Webster, Nathaniel H. Wehr, Joann K. Whalen, Michael B. Wironen, Volkmar Wolters, Pengfei Wu, Irina V. Zenkova, Weixin Zhang, Erin K. Cameron, Nico Eisenhauer
Summary: Earthworms are important ecosystem engineers, but their diversity and distribution are not well known at large spatial scales. A global dataset with information on 10,840 sites and 184 species from 60 countries has been created to assist researchers in investigating a wide variety of pressing questions related to biodiversity.
Article
Ecology
Sonia Llorente-Culebras, Rafael Molina-Venegas, A. Marcia Barbosa, Silvia B. Carvalho, Miguel A. Rodriguez, Ana M. C. Santos
Summary: Protected areas are created to preserve biodiversity and act as refuges from human activities. This study focused on evaluating the representation of functional, phylogenetic, and taxonomic diversity of tetrapod assemblages in national and natural parks of the Iberian Peninsula. The results suggest that these parks effectively capture the diversity components of most tetrapod assemblages present at the regional level.
FRONTIERS IN ECOLOGY AND EVOLUTION
(2021)
Letter
Plant Sciences
Rafael Molina-Venegas, Ignacio Morales-Castilla, Miguel A. Rodriguez
Article
Multidisciplinary Sciences
Bruno S. Souza, Bruna B. Della Coletta, Tiago S. Vasconcelos
Summary: This study compared the distribution patterns of amphibian beta diversity in the Atlantic Forest and Cerrado hotspots generated by three mapping methods, finding that the point-to-grid method showed the most divergent LCBD values. The extent-of-occurrence and ecological niche modelling methods produced similar beta diversity estimates for both hotspots. The turnover component was found to be more important than the nestedness component in all three mapping methods.
ANAIS DA ACADEMIA BRASILEIRA DE CIENCIAS
(2022)