Article
Computer Science, Artificial Intelligence
Bo Li, Wei Liang, Shengmei Yang, Lixin Zhang
Summary: This paper proposes a novel method combining stochastic subspace identification (SSI), the sparrow search algorithm (SSA), and the K-means algorithm to automatically identify the modal parameters of high arch dams. The proposed method effectively eliminates the influence of unstable damping ratios and false modes through an improved stabilization diagram and outlier detection techniques. The results demonstrate that the proposed method outperforms other methods in terms of accuracy and efficiency.
APPLIED SOFT COMPUTING
(2023)
Article
Biochemistry & Molecular Biology
Valentina Rudenko, Eugene Korotkov
Summary: The study presents a method called MSHDTR for searching for highly divergent tandem repeats in protein sequences; it can detect repeats that have diverged significantly over time. Applying this method to the Swiss-Prot database identified and classified over 15,000 TR-containing amino acid sequences, and the results are available through a web-accessible database.
INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES
(2021)
Article
Computer Science, Artificial Intelligence
Hoang-Le Minh, Thanh Sang-To, Magd Abdel Wahab, Thanh Cuong-Le
Summary: This paper introduces a new metaheuristic optimization algorithm called K-means Optimizer (KO), which can effectively solve various optimization problems. The algorithm utilizes the K-means algorithm to establish centroid vectors and proposes two movement strategies for a balance between exploitation and exploration. The effectiveness and reliability of KO are demonstrated through experimental comparisons on benchmark functions and engineering problems.
KNOWLEDGE-BASED SYSTEMS
(2022)
Article
Computer Science, Artificial Intelligence
Carlo Baldassi
Summary: We introduce an evolutionary algorithm called recombinator-k-means for optimizing the highly nonconvex k-means problem. Its defining feature is that its crossover step involves all the members of the current generation, stochastically recombining them with a repurposed variant of the k-means++ seeding algorithm. The recombination also uses a reweighting mechanism that realizes a progressively sharper stochastic selection policy and ensures that the population eventually coalesces into a single solution. We compare this scheme with a state-of-the-art alternative, a more standard genetic algorithm with deterministic pairwise-nearest-neighbor crossover and an elitist selection policy, of which we also provide an augmented and efficient implementation. Extensive tests on large and challenging datasets (both synthetic and real-world) show that for fixed population sizes recombinator-k-means is generally superior in terms of the optimization objective, at the cost of a more expensive crossover step. When adjusting the population sizes of the two algorithms to match their running times, we find that for short times the (augmented) pairwise-nearest-neighbor method is always superior, while at longer times recombinator-k-means will match it and, on the most difficult examples, take over. We conclude that the reweighted whole-population recombination is more costly but generally better at escaping local minima. Moreover, it is algorithmically simpler and more general (it could be applied even to k-medians or k-medoids, for example).
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION
(2022)
Article
Biochemistry & Molecular Biology
Owen K. Smith, Charles Limouse, Kelsey A. Fryer, Nicole A. Teran, Kousik Sundararajan, Rebecca Heald, Aaron F. Straight
Summary: The study successfully identified the repetitive sequences of Xenopus laevis centromeres using a combination of Cenpa ChIP-seq and k-mer analysis, and mapped the centromere positions on each chromosome through in situ hybridization and analysis of centromere-enriched k-mers distribution. This approach enables previously unapproachable centromere genomic studies and could be broadly applicable for analyzing repetitive sequences in any organism.
Article
Engineering, Marine
Shanshan Jin, Xunwei Nie, Guanlin Wang, Fei Teng, Tengfei Xu
Summary: Influenced by local mixing and coastal runoff, water masses in the South China Sea degenerate significantly. The K-means algorithm is used to classify the water masses based on WOD13 temperature and salinity observations from 1966 to 2013. The results show that there are ten water masses in the South China Sea, and their properties and seasonal variability are analyzed.
JOURNAL OF MARINE SCIENCE AND ENGINEERING
(2023)
Article
Biochemistry & Molecular Biology
Mohamed Kamel, Kristina Kastano, Pablo Mier, Miguel A. Andrade-Navarro
Summary: This study presents a new web server REP2 for analyzing tandem repeats (TRs) in protein sequences, providing precomputed analyses for 78 UniProt reference proteomes. The data can be used for studying the evolution of TRs using comparative genomics.
JOURNAL OF MOLECULAR BIOLOGY
(2021)
Article
Biology
Reza Behboudi, Mostafa Nouri-Baygi, Mahmoud Naghibzadeh
Summary: The sequencing of eukaryotic genomes has revealed the prevalence of tandem repeats, which not only affect certain cellular processes but may also be associated with specific diseases and aid in solving criminal cases. The Rapid Perfect Tandem Repeat Finder (RPTRF) is a proposed method that efficiently detects perfect tandem repeats in genomic sequences by minimizing unnecessary character comparisons and using an interval tree for filtering. Experiments have shown that RPTRF is highly efficient in discovering perfect tandem repeats and even outperforms some tools designed for detecting imperfect tandem repeats. The tool and its usage instructions are available on GitHub.
Article
Computer Science, Artificial Intelligence
Mustafa Jahangoshai Rezaee, Milad Eshkevari, Morteza Saberi, Omar Hussain
Summary: This paper introduces a game-based k-means (GBK-means) algorithm in which cluster centers compete to attract data, yielding more accurate clustering. Experimental results demonstrate the superiority of GBK-means over traditional clustering algorithms.
KNOWLEDGE-BASED SYSTEMS
(2021)
Article
Genetics & Heredity
Valentina Rudenko, Eugene Korotkov
Summary: In this study, the modified multiple alignment method based on random position weight matrices (RPWMs) was used to search for tandem repeats (TRs) in the Capsicum annuum genome. The application of the modified (m)RPWM method identified 908,072 TR regions with repeat lengths from 2 to 200 bp, accounting for approximately 29% of the genome. The mRPWM method outperformed other TR search methods in terms of detecting more TRs at similar false discovery rates.
Article
Automation & Control Systems
Dengxiu Yu, Hao Xu, C. L. Philip Chen, Wenjie Bai, Zhen Wang
Summary: This article proposes a dynamic coverage control method based on K-means, which relaxes the requirements on the coverage objects and calculates the optimal coverage positions of intelligent units. A control law based on discrete sliding mode control is designed to drive the units to the optimal positions for specified targets and areas.
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS
(2022)
Article
Multidisciplinary Sciences
Yijia Li, Zhengfang Wang, Jing Wang, Qingmei Sui, Shufan Li, Hanpeng Wang, Zhiguo Cao
Summary: This study proposed a first arrival picking method for microseismic data with low SNR, consisting of feature selection and clustering. Comparative study results indicated the superiority of this method over traditional ones, with the smallest and most stable positioning errors for microseismic localization.
Article
Computer Science, Information Systems
Qiumei Pu, Jingkai Gan, Lirong Qiu, Jiaxin Duan, Hui Wang
Summary: This article proposes HPSO, an optimization algorithm that makes comprehensive improvements to the PSO algorithm. In experiments it shows good stability, clustering effectiveness, robustness, and global search ability.
MULTIMEDIA TOOLS AND APPLICATIONS
(2022)
Article
Energy & Fuels
Quan Ren, Hongbing Zhang, Dailu Zhang, Xiang Zhao, Lizhi Yan, Jianwen Rui
Summary: This study proposes a novel hybrid technique for lithology identification, combining fuzzy theory, a decision tree, and the K-means++ algorithm, to better overcome the ambiguity and uncertainty of logging data. The model achieved a prediction accuracy of 93.92% and outperformed other machine learning algorithms. This new approach provides a practical and effective model for complex lithology identification.
JOURNAL OF PETROLEUM SCIENCE AND ENGINEERING
(2022)
Article
Computer Science, Artificial Intelligence
Ioan-Daniel Borlea, Radu-Emil Precup, Alexandra-Bianca Borlea, Daniel Iercan
Summary: This paper introduces the novel Unified Form (UF) clustering algorithm and the Partitional Implementation of Unified Form (PIUF) algorithm, aiming to address the challenges of processing large datasets and sequential data processing. These algorithms are implemented and validated in the BigTim platform and can be applied to other data processing platforms.
KNOWLEDGE-BASED SYSTEMS
(2021)
Article
Biochemical Research Methods
Pablo Mier, Lisanna Paladin, Stella Tamana, Sophia Petrosian, Borbala Hajdu-Soltesz, Annika Urbanek, Aleksandra Gruca, Dariusz Plewczynski, Marcin Grynberg, Pau Bernado, Zoltan Gaspari, Christos A. Ouzounis, Vasilis J. Promponas, Andrey V. Kajava, John M. Hancock, Silvio C. E. Tosatto, Zsuzsanna Dosztanyi, Miguel A. Andrade-Navarro
BRIEFINGS IN BIOINFORMATICS
(2020)
Article
Biochemistry & Molecular Biology
Andras Hatos, Borbala Hajdu-Soltesz, Alexander M. Monzon, Nicolas Palopoli, Lucia Alvarez, Burcu Aykac-Fas, Claudio Bassot, Guillermo Benitez, Martina Bevilacqua, Anastasia Chasapi, Lucia Chemes, Norman E. Davey, Radoslav Davidovic, A. Keith Dunker, Arne Elofsson, Julien Gobeill, Nicolas S. Gonzalez Foutel, Govindarajan Sudha, Mainak Guharoy, Tamas Horvath, Valentin Iglesias, Andrey Kajava, Orsolya P. Kovacs, John Lamb, Matteo Lambrughi, Tamas Lazar, Jeremy Y. Leclercq, Emanuela Leonardi, Sandra Macedo-Ribeiro, Mauricio Macossay-Castillo, Emiliano Maiani, Jose A. Manso, Cristina Marino-Buslje, Elizabeth Martinez-Perez, Balint Meszaros, Ivan Micetic, Giovanni Minervini, Nikoletta Murvai, Marco Necci, Christos A. Ouzounis, Matyas Pajkos, Lisanna Paladin, Rita Pancsa, Elena Papaleo, Gustavo Parisi, Emilie Pasche, Pedro J. Barbosa Pereira, Vasilis J. Promponas, Jordi Pujols, Federica Quaglia, Patrick Ruch, Marco Salvatore, Eva Schad, Beata Szabo, Tamas Szaniszlo, Stella Tamana, Agnes Tantos, Nevena Veljkovic, Salvador Ventura, Wim Vranken, Zsuzsanna Dosztanyi, Peter Tompa, Silvio C. E. Tosatto, Damiano Piovesan
NUCLEIC ACIDS RESEARCH
(2020)
Article
Genetics & Heredity
Sonia Barbosa, Stephanie Greville-Heygate, Maxime Bonnet, Annie Godwin, Christine Fagotto-Kaufmann, Andrey Kajava, Damien Laouteouet, Rebecca Mawby, Htoo Aung Wai, Alexander J. M. Dingemans, Jayne Hehir-Kwa, Marjorlaine Willems, Yline Capri, Sarju G. Mehta, Helen Cox, David Goudie, Fleur Vansenne, Peter Turnpenny, Marie Vincent, Benjamin Cogne, Gaetan Lesca, Jozef Hertecant, Diana Rodriguez, Boris Keren, Lydie Burglen, Marion Gerard, Audrey Putoux, Vincent Cantagrel, Karine Siquier-Pernet, Marlene Rio, Siddharth Banka, Ajoy Sarkar, Marcie Steeves, Michael Parker, Emma Clement, Sebastien Moutton, Frederic Tran Mau-Them, Amelie Piton, Bert B. A. de Vries, Matthew Guille, Anne Debant, Susanne Schmidt, Diana Baralle
AMERICAN JOURNAL OF HUMAN GENETICS
(2020)
Article
Chemistry, Physical
Anna Sulatskaya, Stanislav A. Bondarev, Maksim Sulatsky, Nina P. Trubitsina, Mikhail Belousov, Galina A. Zhouravleva, Manuel A. Llanos, Andrey Kajava, Irina M. Kuznetsova, Konstantin K. Turoverov
JOURNAL OF MOLECULAR LIQUIDS
(2020)
Article
Biochemistry & Molecular Biology
Rafayel A. Azizyan, Weiqiang Wang, Alexey Anikeenko, Zinaida Radkova, Anastasia Bakulina, Adriana Garro, Landry Charlier, Christian Dumas, Salvador Ventura, Andrey Kajava
JOURNAL OF STRUCTURAL BIOLOGY
(2020)
Article
Medicine, General & Internal
Yvonne Sleiman, Monia Souidi, Ritu Kumar, Ellen Yang, Fabrice Jaffre, Ting Zhou, Albin Bernardin, Steve Reiken, Olivier Cazorla, Andrey V. Kajava, Adrien Morea, Jean-Luc Pasquie, Andrew R. Marks, Bruce B. Lerman, Shuibing Chen, Jim W. Cheung, Todd Evans, Alain Lacampagne, Albano C. Meli
Article
Biochemistry & Molecular Biology
Lisanna Paladin, Martina Bevilacqua, Sara Errigo, Damiano Piovesan, Ivan Micetic, Marco Necci, Alexander Miguel Monzon, Maria Laura Fabre, Jose Luis Lopez, Juliet F. Nilsson, Javier Rios, Pablo Lorenzano Menna, Maia Cabrera, Martin Gonzalez Buitron, Mariane Goncalves Kulik, Sebastian Fernandez-Alberti, Maria Silvina Fornasari, Gustavo Parisi, Antonio Lagares, Layla Hirsh, Miguel A. Andrade-Navarro, Andrey Kajava, Silvio C. E. Tosatto
Summary: RepeatsDB database provides annotations and classification for protein tandem repeat structures from PDB. The new version 3.0 addresses challenges of data growth and annotation needs by introducing a hierarchical classification scheme.
NUCLEIC ACIDS RESEARCH
(2021)
Article
Engineering, Chemical
Paco Pino, Joeri Kint, Divor Kiseljak, Valentina Agnolon, Giampietro Corradin, Andrey V. Kajava, Paolo Rovero, Ronald Dijkman, Gerco den Hartog, Jason S. McLellan, Patrick O. Byrne, Maria J. Wurm, Florian M. Wurm
Article
Biochemical Research Methods
Marco Necci, Damiano Piovesan, Silvio C. E. Tosatto
Summary: Intrinsically disordered proteins present a challenge to traditional protein structure-function analysis, with computational methods, particularly deep learning techniques, showing superior performance in predicting disorder. However, predicting disordered binding regions remains difficult, and there is a significant variation in computational times among methods.
Article
Multidisciplinary Sciences
Nikola Arsic, Tania Slatter, Gilles Gadea, Etienne Villain, Aurelie Fournet, Marina Kazantseva, Frederic Allemand, Nathalie Sibille, Martial Seveno, Sylvain de Rossi, Sunali Mehta, Serge Urbach, Jean-Christophe Bourdon, Pau Bernado, Andrey Kajava, Antony Braithwaite, Pierre Roux
Summary: The p53 isoform Δ133p53β promotes intrinsic oncogenic functions, with its activity regulated through an aggregation-dependent mechanism. Interaction with partners like p63 family members or the CCT chaperone complex influences cancer cell features such as migration and invasion by modulating Δ133p53β activity.
NATURE COMMUNICATIONS
(2021)
Article
Immunology
Myriam Arevalo-Herrera, Kazutoyo Miura, Nora Cespedes, Carlos Echeverry, Eduardo Solano, Angelica Castellanos, Juan Sebastian Ramirez, Adolfo Miranda, Andrey V. Kajava, Carole Long, Giampietro Corradin, Socrates Herrera
Summary: The P48/45 antigen, a crucial factor in Plasmodium parasite fertilization, was found to be more immunoreactive when expressed in Chinese Hamster Ovary (CHO) cells compared to Escherichia coli, showing potential for use in a protein vaccine. While there was an age-dependent increase in response to both antigens, specific IgG antibodies to CHO-rPvs48/45 demonstrated functional activity in inhibiting parasite transmission, suggesting promising prospects for further research.
FRONTIERS IN IMMUNOLOGY
(2021)
Article
Biochemistry & Molecular Biology
Edoardo Salladini, Frank Gondelaud, Juliet F. Nilsson, Giulia Pesce, Christophe Bignon, Maria Grazia Murrali, Roxane Fabre, Roberta Pierattelli, Andrey Kajava, Branka Horvat, Denis Gerlier, Cyrille Mathieu, Sonia Longhi
Summary: Henipaviruses are zoonotic pathogens responsible for severe encephalitis in humans. Their V protein plays a key role in immune evasion and has been shown to undergo a liquid-to-hydrogel phase transition. A specific region within the Hendra virus V protein, referred to as PNT3, forms amyloid-like fibrils, highlighting the potential importance of phase separation and fibrillation in the functional role of Henipavirus V proteins.
Article
Biochemistry & Molecular Biology
Zarifa Osmanli, Theo Falgarone, Turkan Samadova, Gudrun Aldrian, Jeremy Leclercq, Ilham Shahmuradov, Andrey Kajava
Summary: Alternative splicing is an important mechanism for generating protein diversity in cells. However, there is still a lack of structural data on alternative protein isoforms, as experimental studies typically focus on canonical proteins. In recent years, advances in bioinformatics tools and the development of the AlphaFold program have allowed for the modeling of high-confidence structures of isoforms. In this study, in silico analysis of 58 eukaryotic proteomes was performed, revealing differences in signal peptides, transmembrane regions, and tandem repeat regions between isoforms and canonical counterparts, potentially impacting protein function and cellular localization.
Article
Biochemistry & Molecular Biology
Juliet F. Nilsson, Hakima Baroudi, Frank Gondelaud, Giulia Pesce, Christophe Bignon, Denis Ptchelkine, Joseph Chamieh, Herve Cottet, Andrey V. Kajava, Sonia Longhi
Summary: The Nipah and Hendra viruses, classified as Henipaviruses, are highly dangerous human pathogens that counteract the host immune response. A recent study found that a short region within the shared N-terminal domain (PNT3) of the Phosphoprotein (P protein) can form amyloid-like structures. This study evaluated the role of specific tyrosine residues within this region in fibrillation. Results showed that removal of a single tyrosine significantly decreases fibril formation, mainly affecting the elongation phase, and the C-terminal half of PNT3 inhibits fibril formation. The study sheds light on the molecular mechanisms of fibril formation.
INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES
(2023)
Article
Biochemistry & Molecular Biology
Polina B. Drozdova, Yury A. Barbitoff, Mikhail V. Belousov, Rostislav K. Skitchenko, Tatyana M. Rogoza, Jeremy Y. Leclercq, Andrey V. Kajava, Andrew G. Matveenko, Galina A. Zhouravleva, Stanislav A. Bondarev