Article
Multidisciplinary Sciences
Mohamed Awad, Xiangchao Gan
Summary: This paper introduces GALA, a computational framework for chromosome-based sequencing data separation and gap-free de novo assembly. It allows integration of different data sources and addresses the challenge of achieving gap-free chromosome-scale assemblies using current workflows for long-read platforms. The method is demonstrated through the assembly of various genomes.
NATURE COMMUNICATIONS
(2023)
Article
Biochemistry & Molecular Biology
Shunhua Han, Guilherme B. Dias, Preston J. Basting, Raghuvir Viswanatha, Norbert Perrimon, Casey M. Bergman
Summary: Animal cell lines often undergo extreme genome restructuring events that hinder de novo whole-genome assembly. This study used long-read and linked-read technologies to sequence the genome of a tetraploid Drosophila cell line and developed a novel method called TELR for TE analysis. The results shed light on the role and mechanism of transposable elements in animal cell culture genome evolution.
NUCLEIC ACIDS RESEARCH
(2022)
Article
Plant Sciences
Ante Turudic, Zlatko Liber, Martina Grdisa, Jernej Jakse, Filip Varga, Zlatko Satovic
Summary: The development of bioinformatic solutions requires biological knowledge and often makes assumptions. In this study, we investigated the relationship between chloroplast sequence lengths and taxonomic proximity of species using RefSeq sequences from the asterid and rosid clades. We found that chloroplast length distributions are narrow at the family and genus levels, with outliers indicating possible inaccuracies in sequence assembly.
Article
Biochemical Research Methods
Nicholas J. Dimonaco, Wayne Aubrey, Kim Kenobi, Amanda Clare, Christopher J. Creevey
Summary: This article presents an evaluation framework for assessing the performance of CDS prediction tools based on a comprehensive set of primary and secondary metrics. The research found that no individual tool ranked as the most accurate across all genomes or metrics analyzed, and even top-ranked tools produced conflicting gene collections.
Article
Biotechnology & Applied Microbiology
Marketa Nykrynova, Vojtech Barton, Matej Bezdicek, Martina Lengerova, Helena Skutkova
Summary: The study introduces a pipeline for identifying highly variable genomic fragments in unmapped reads through a modified hybrid assembly approach. These variable regions can be used in efficient laboratory methods for bacterial typing with high discriminatory power, such as mini-MLST, replacing expensive methods like MLST. Through this approach, infection monitoring can be carried out more rapidly.
Article
Multidisciplinary Sciences
Amelia D. Wallace, Thomas A. Sasani, Jordan Swanier, Brooke L. Gates, Jeff Greenland, Brent S. Pedersen, Katherine E. Varley, Aaron R. Quinlan
Summary: The study introduces a method called CaBagE for efficient and rapid target enrichment of large, structurally complex DNA targets. By leveraging the stable binding of Cas9 to its DNA target, desired fragments are protected from digestion, allowing for enrichment. Testing on five genomic targets showed that enrichment with CaBagE resulted in high coverage of target loci.
Article
Biotechnology & Applied Microbiology
Yannis Nevers, Natasha M. Glover, Christophe Dessimoz, Odile Lecompte
Summary: A comparison of protein length distributions across 2326 species shows that proteins are slightly longer on average in eukaryotes than in bacteria or archaea, but that the variation in length distribution across species is low, especially compared to other genomic features. Moreover, most cases of atypical protein length distribution appear to be due to artifactual gene annotation, suggesting that the actual variation in protein length distribution across species is even smaller.
Article
Veterinary Sciences
Wei Zhao, Jialei Shi, Yongxiu Yao, Hongxia Shao, Aijian Qin, Kun Qian
Summary: This study successfully isolated and molecularly characterized two strains of Chicken astrovirus (CAstV), and observed their effect on hatchability. The genetic analysis showed that these two strains had typical characteristics of avian astroviruses, with high similarity among Chinese strains and a common origin with strains from the UK.
FRONTIERS IN VETERINARY SCIENCE
(2022)
Article
Microbiology
Mengqi Sun, Feng Chen
Summary: The relative abundance of N4-like viruses in two temperate estuaries was assessed using four different methods, and it was found that N4-like viruses were of low abundance in these environments. The study also identified locally isolated N4-like virus species, and indicated that N4-like viruses may be more abundant in colder water. The importance of including local viral sequences in reference databases was highlighted.
ENVIRONMENTAL MICROBIOLOGY
(2022)
Article
Biochemical Research Methods
Alejandro Rubio, Juan Jimenez, Antonio J. Perez-Pulido
Summary: Bacterial genomes provide valuable data for understanding the complete set of genes of a species. By analyzing multiple bacterial strains, shared genes and strain-specific genes can be identified. However, current computational gene finders may miss some existing genes. This study estimated the selective pressure on genes in the Acinetobacter baumannii pangenome and found that most genes are under negative selection, but a subset showed values compatible with positive selection, which may be related to acquisition of new functions.
BRIEFINGS IN BIOINFORMATICS
(2022)
Article
Multidisciplinary Sciences
Leonor Oliveira, Nicolas Chevrollier, Jean-Felix Dallery, Richard J. O'Connell, Marc-Henri Lebrun, Muriel Viaud, Olivier Lespinet
Summary: In this article, we introduce a new application called CustomProteinSearch (CusProSe), which is designed to help users search for proteins based on their domain composition. The application consists of two customizable tools, IterHMMBuild and ProSeCDA. IterHMMBuild allows for the iterative construction of Hidden Markov Model (HMM) profiles for specific protein sequences, while ProSeCDA scans a proteome using an HMM profile database and annotates identified proteins using user-defined rules. We successfully used CusProSe to identify genes encoding key enzyme families involved in secondary metabolism in fungal genomes, as well as characterize different sub-families of terpene synthases.
SCIENTIFIC REPORTS
(2023)
Article
Genetics & Heredity
Xiao Zhang, Wen Zhu, Huimin Sun, Yijie Ding, Li Liu
Summary: In this study, a comparative analysis was conducted to investigate the sequence preference and binding strength of anchor and non-anchor CTCF binding sites. A machine learning model based on CTCF binding intensity and DNA sequence was proposed to predict the formation of chromatin loop anchors. The accuracy of this model reached 0.8646, and it was found that the formation of loop anchors is mainly influenced by CTCF binding strength and binding pattern.
FRONTIERS IN GENETICS
(2023)
Article
Genetics & Heredity
Yinqiao Jian, Wenyuan Yan, Jianfei Xu, Shaoguang Duan, Guangcun Li, Liping Jin
Summary: This study analyzed the abundance and distribution of SSRs in four potato genomes, identifying a large number of polymorphic markers, with a focus on intergenic regions. The high-density potato SSR markers developed will facilitate genetic research and marker-pyramiding in potato breeding.
Article
Genetics & Heredity
Christine Tranchant-Dubreuil, Clothilde Chenal, Mathieu Blaison, Laurence Albar, Valentin Klein, Cedric Mariac, Rod A. Wing, Yves Vigouroux, Francois Sabot
Summary: FrangiPANe is a pipeline developed to build a panreference from short reads through a map-then-assemble strategy. Applying it to 248 African rice genomes using an improved CG14 reference genome, we identified an average of 8 Mb of new sequences and 5290 new contigs per individual. The pipeline allows for the anchoring of new contigs within the reference genome and the annotation of new genes. It simplifies the construction of a panreference and can be used for pangenome studies and selection detection.
NAR GENOMICS AND BIOINFORMATICS
(2023)
Article
Agriculture, Dairy & Animal Science
Zhanwei Zhuang, Jie Wu, Yibin Qiu, Donglin Ruan, Rongrong Ding, Cineng Xu, Shenping Zhou, Yuling Zhang, Yiyi Liu, Fucai Ma, Jifei Yang, Ying Sun, Enqin Zheng, Ming Yang, Gengyuan Cai, Jie Yang, Zhenfang Wu
Summary: In this study, whole-genome sequence (WGS) data were used to evaluate the prediction accuracy of genomic best linear unbiased prediction (GBLUP) for meat quality in large-scale crossbred commercial pigs. Genomic prediction with WGS data gave accuracies ranging from 0.08 to 0.47, depending on the meat quality trait. The study also found that MultiBLUP outperformed GBLUP and yielded accuracy increases ranging from 17.39% to 75%. Furthermore, genotype imputation from 50K chip to WGS level showed a high concordance rate and correlation coefficient.
JOURNAL OF ANIMAL SCIENCE AND BIOTECHNOLOGY
(2023)
Article
Computer Science, Hardware & Architecture
Bastien Cazaux, Thierry Lecroq, Eric Rivals
JOURNAL OF COMPUTER AND SYSTEM SCIENCES
(2019)
Article
Mathematics, Applied
Sukhpal Singh Ghuman, Jorma Tarhio, Tamanna Chhabra
DISCRETE APPLIED MATHEMATICS
(2020)
Article
Biochemical Research Methods
Bastien Cazaux, Guillaume Castel, Eric Rivals
Article
Multidisciplinary Sciences
Andrew J. Oldfield, Telmo Henriques, Dhirendra Kumar, Adam B. Burkholder, Senthilkumar Cinghu, Damien Paulet, Brian D. Bennett, Pengyi Yang, Benjamin S. Scruggs, Christopher A. Lavender, Eric Rivals, Karen Adelman, Raja Jothi
NATURE COMMUNICATIONS
(2019)
Article
Computer Science, Information Systems
Bastien Cazaux, Eric Rivals
INFORMATION PROCESSING LETTERS
(2020)
Article
Biochemical Research Methods
Benjamin Linard, Nikolai Romashchenko, Fabio Pardi, Eric Rivals
Article
Biochemical Research Methods
Guillaume E. Scholz, Benjamin Linard, Nikolai Romashchenko, Eric Rivals, Fabio Pardi
Article
Multidisciplinary Sciences
Sebastien Relier, Julie Ripoll, Helene Guillorit, Amandine Amalric, Cyrinne Achour, Florence Boissiere, Jerome Vialaret, Aurore Attina, Francoise Debart, Armelle Choquet, Francoise Macari, Virginie Marchand, Yuri Motorin, Emmanuelle Samalin, Jean-Jacques Vasseur, Julie Pannequin, Francesca Aguilo, Evelyne Lopez-Crapez, Christophe Hirtz, Eric Rivals, Amandine Bastide, Alexandre David
Summary: The demethylase FTO was shown to remove N6-methyladenosine (m6A) and N6,2'-O-dimethyladenosine (m6Am) modifications on RNAs. Here the authors show that FTO impedes cancer stem cell-like abilities in colorectal cancer cells through its m6Am demethylase activity, not through internal m6A demethylase activity.
NATURE COMMUNICATIONS
(2021)
Review
Biochemistry & Molecular Biology
Sebastien Relier, Eric Rivals, Alexandre David
Summary: Over the past decade, mRNA modification has emerged as a new layer of gene expression regulation. FTO, as the first identified eraser of N6-methyladenosine (m6A) adducts, has attracted much attention in the field of epitranscriptomics. The contradictory studies on the regulatory role of FTO in gene expression may be attributed to its wide spectrum of substrates and RNA sequence preferences. This review focuses on current knowledge related to FTO function in healthy and cancer cells, emphasizing its divergent roles in different tissues and subcellular and molecular contexts.
Article
Chemistry, Analytical
S. Relier, A. Amalric, A. Attina, I. B. Koumare, V. Rigau, F. Burel Vandenbos, D. Fontaine, M. Baroncini, J. P. Hugnot, H. Duffau, L. Bauchet, C. Hirtz, E. Rivals, A. David
Summary: One of the main challenges in cancer management is the discovery of reliable biomarkers for decision-making and treatment outcome prediction. This study combines high-throughput molecular profiling technologies with statistical multivariate analysis to design a pipeline for identifying biomarker signatures that can guide precision medicine and improve disease diagnosis.
ANALYTICAL CHEMISTRY
(2022)
Article
Infectious Diseases
Lena M. Sauer, Rodrigo Canovas, Daniel Roche, Hosam Shams-Eldin, Patrice Ravel, Jacques Colinge, Ralph T. Schwarz, Choukri Ben Mamoun, Eric Rivals, Emmanuel Cornillot
Summary: Protozoan parasites attach specific and diverse proteins to their plasma membrane via a GPI anchor. The FT-GPI software can detect GPI-anchored proteins and identify new candidates for vaccines against malaria and other parasitic diseases.
Article
Biochemical Research Methods
Carole Chevalier, Jerome Dorignac, Yahaya Ibrahim, Armelle Choquet, Alexandre David, Julie Ripoll, Eric Rivals, Frederic Geniet, Nils-Ole Walliser, John Palmeri, Andrea Parmeggiani, Jean-Charles Walter
Summary: Gene expression involves the synthesis of proteins from the information encoded in DNA, and the translation of mRNA into amino acid sequences is one of its main steps. Understanding the motion of ribosomes along mRNA is crucial for studying gene expression. In this study, a new experimental and theoretical approach is proposed to obtain kinetic rates with better accuracy by categorizing mRNA based on the number of ribosomes and using ribo-sequencing techniques.
PLOS COMPUTATIONAL BIOLOGY
(2023)
Article
Biochemical Research Methods
Marie Mille, Julie Ripoll, Bastien Cazaux, Eric Rivals
Summary: This study proposes a Python package called dipwmsearch, which offers an original and efficient algorithm for searching for occurrences of dinucleotide PWMs in sequences. The package allows the enumeration of matching words and simultaneous searching in the sequence, even if it contains IUPAC codes. Users can easily install dipwmsearch via PyPI or conda, and they also have access to comprehensive documentation and executable scripts for using dinucleotide PWMs.
Article
Biochemical Research Methods
Nikolai Romashchenko, Benjamin Linard, Fabio Pardi, Eric Rivals
Summary: Phylogenetic placement is a method for analyzing massive collections of newly sequenced DNA using a high-quality reference tree. Alignment-free approaches based on phylo-k-mers have emerged to simplify the process, but are limited by data preprocessing and the large number of k-mers to consider. The authors propose a filtering method based on mutual information to select informative phylo-k-mers, improving efficiency at the cost of a slight loss in accuracy. They develop the tools IPK and EPIK, which outperform previous software and provide fast and accurate phylogenetic placement.
Article
Biochemical Research Methods
Nikolai Romashchenko, Benjamin Linard, Eric Rivals, Fabio Pardi
Summary: The paper introduces a method for computing phylo-k-mers based on the concept of phylogenetically-informative k-mers and proposes algorithms to solve the computational problem. In practice, this method can efficiently find k-mers with probabilities above a given threshold in a phylogenetic tree.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
(2023)