4.8 Article

Using reads to annotate the genome: influence of length, background distribution, and sequence errors on prediction capacity

Journal

NUCLEIC ACIDS RESEARCH
Volume 37, Issue 15, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkp492

Keywords

-

Funding

  1. French 'Ministere de l'Enseignement superieur et de la Recherche'
  2. 'La ligue regionale contre le Cancer' Languedoc Roussillon
  3. Universite de Montpellier 2
  4. ANR [BLAN07-1_185484]

Ultra high-throughput sequencing is used to analyse the transcriptome or interactome at unprecedented depth on a genome-wide scale. These techniques yield short sequence reads that are then mapped on a genome sequence to predict putatively transcribed or protein-interacting regions. We argue that factors such as background distribution, sequence errors, and read length impact on the prediction capacity of sequence census experiments. Here we suggest a computational approach to measure these factors and analyse their influence on both transcriptomic and epigenomic assays. This investigation provides new clues on both methodological and biological issues. For instance, by analysing chromatin immunoprecipitation read sets, we estimate that 4.6% of reads are affected by SNPs. We show that, although the nucleotide error probability is low, it significantly increases with the position in the sequence. Choosing a read length above 19 bp practically eliminates the risk of finding irrelevant positions, while above 20 bp the number of uniquely mapped reads decreases. With our procedure, we obtain 0.6% false positives among genomic locations. Hence, even rare signatures should identify biologically relevant regions, if they are mapped on the genome. This indicates that digital transcriptomics may help to characterize the wealth of yet undiscovered, low-abundance transcripts.
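The link between read length and the risk of irrelevant positions can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' procedure: it assumes independent, uniformly distributed nucleotides (the abstract itself argues that the real background distribution matters) and a hypothetical genome size of about 3 Gbp. The function name `expected_random_hits` is introduced here for illustration only.

```python
def expected_random_hits(read_length: int, genome_size: int,
                         both_strands: bool = True) -> float:
    """Expected number of chance occurrences of one fixed read in a random genome.

    Simplifying assumption (not taken from the paper): nucleotides are
    independent and uniformly distributed, so a given read of length k
    matches any given position with probability 1 / 4**k.
    """
    positions = genome_size - read_length + 1
    if positions <= 0:
        return 0.0  # the read is longer than the genome
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** read_length


if __name__ == "__main__":
    genome_size = 3_000_000_000  # assumed value, roughly a human-sized genome
    for k in (14, 16, 18, 19, 20, 22):
        print(f"read length {k:2d} bp: "
              f"{expected_random_hits(k, genome_size):.4g} expected chance hits")
```

Under these assumptions the expected number of chance hits drops by a factor of four with each added base, which is consistent in spirit with the abstract's observation that short reads carry a much higher risk of mapping to irrelevant positions. A real genome is far from uniform, so the actual thresholds have to be measured empirically, as the paper does.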

Recommended

Article Computer Science, Hardware & Architecture

Linking indexing data structures to de Bruijn graphs: Construction and update

Bastien Cazaux, Thierry Lecroq, Eric Rivals

JOURNAL OF COMPUTER AND SYSTEM SCIENCES (2019)

Article Mathematics, Applied

Improved online algorithms for jumbled matching

Sukhpal Singh Ghuman, Jorma Tarhio, Tamanna Chhabra

DISCRETE APPLIED MATHEMATICS (2020)

Article Biochemical Research Methods

AQUAPONY: visualization and interpretation of phylogeographic information on phylogenetic trees

Bastien Cazaux, Guillaume Castel, Eric Rivals

BIOINFORMATICS (2019)

Article Multidisciplinary Sciences

NF-Y controls fidelity of transcription initiation at gene promoters through maintenance of the nucleosome-depleted region

Andrew J. Oldfield, Telmo Henriques, Dhirendra Kumar, Adam B. Burkholder, Senthilkumar Cinghu, Damien Paulet, Brian D. Bennett, Pengyi Yang, Benjamin S. Scruggs, Christopher A. Lavender, Eric Rivals, Karen Adelman, Raja Jothi

NATURE COMMUNICATIONS (2019)

Article Computer Science, Information Systems

Hierarchical Overlap Graph

Bastien Cazaux, Eric Rivals

INFORMATION PROCESSING LETTERS (2020)

Article Biochemical Research Methods

PEWO: a collection of workflows to benchmark phylogenetic placement

Benjamin Linard, Nikolai Romashchenko, Fabio Pardi, Eric Rivals

BIOINFORMATICS (2020)

Article Biochemical Research Methods

Rapid screening and detection of inter-type viral recombinants using phylo-k-mers

Guillaume E. Scholz, Benjamin Linard, Nikolai Romashchenko, Eric Rivals, Fabio Pardi

BIOINFORMATICS (2020)

Article Multidisciplinary Sciences

FTO-mediated cytoplasmic m6Am demethylation adjusts stem-like properties in colorectal cancer cell

Sebastien Relier, Julie Ripoll, Helene Guillorit, Amandine Amalric, Cyrinne Achour, Florence Boissiere, Jerome Vialaret, Aurore Attina, Francoise Debart, Armelle Choquet, Francoise Macari, Virginie Marchand, Yuri Motorin, Emmanuelle Samalin, Jean-Jacques Vasseur, Julie Pannequin, Francesca Aguilo, Evelyne Lopez-Crapez, Christophe Hirtz, Eric Rivals, Amandine Bastide, Alexandre David

Summary: The demethylase FTO was shown to remove N6-methyladenosine (m6A) and N6, 2'-O-dimethyladenosine (m6A(m)) modifications on RNAs. Here the authors show that FTO impedes cancer stem cell-like abilities in colorectal cancer cells through its m6A(m) demethylase activity, not through internal m6A demethylase activity.

NATURE COMMUNICATIONS (2021)

Review Biochemistry & Molecular Biology

The multifaceted functions of the Fat mass and Obesity-associated protein (FTO) in normal and cancer cells

Sebastien Relier, Eric Rivals, Alexandre David

Summary: Over the past decade, mRNA modification has emerged as a new layer of gene expression regulation. FTO, as the first identified eraser of N6-methyladenosine (m6A) adducts, has attracted much attention in the field of epitranscriptomics. The contradictory studies on the regulatory role of FTO in gene expression may be attributed to its wide spectrum of substrates and RNA sequence preferences. This review focuses on current knowledge related to FTO function in healthy and cancer cells, emphasizing its divergent roles in different tissues and subcellular and molecular contexts.

RNA BIOLOGY (2022)

Article Chemistry, Analytical

Multivariate Analysis of RNA Chemistry Marks Uncovers Epitranscriptomics-Based Biomarker Signature for Adult Diffuse Glioma Diagnostics

S. Relier, A. Amalric, A. Attina, I. B. Koumare, V. Rigau, F. Burel Vandenbos, D. Fontaine, M. Baroncini, J. P. Hugnot, H. Duffau, L. Bauchet, C. Hirtz, E. Rivals, A. David

Summary: One of the main challenges in cancer management is the discovery of reliable biomarkers for decision-making and treatment outcome prediction. This study combines high-throughput molecular profiling technologies with statistical multivariate analysis to design a pipeline for identifying biomarker signatures that can guide precision medicine and improve disease diagnosis.

ANALYTICAL CHEMISTRY (2022)

Article Infectious Diseases

FT-GPI, a highly sensitive and accurate predictor of GPI-anchored proteins, reveals the composition and evolution of the GPI proteome in Plasmodium species

Lena M. M. Sauer, Rodrigo Canovas, Daniel Roche, Hosam Shams-Eldin, Patrice Ravel, Jacques Colinge, Ralph T. T. Schwarz, Choukri Ben Mamoun, Eric Rivals, Emmanuel Cornillot

Summary: Protozoan parasites attach specific and diverse proteins to their plasma membrane via a GPI anchor. The FT-GPI software can detect GPI-anchored proteins and identify new candidates for vaccines against malaria and other parasitic diseases.

MALARIA JOURNAL (2023)

Article Biochemical Research Methods

Physical modeling of ribosomes along messenger RNA: Estimating kinetic parameters from ribosome profiling experiments using a ballistic model

Carole Chevalier, Jerome Dorignac, Yahaya Ibrahim, Armelle Choquet, Alexandre David, Julie Ripoll, Eric Rivals, Frederic Geniet, Nils-Ole Walliser, John Palmeri, Andrea Parmeggiani, Jean-Charles Walter

Summary: Gene expression involves the synthesis of proteins from the information encoded on DNA, and translation of mRNA into amino acid sequences is one of the main steps in gene expression. Understanding the motion of ribosomes along mRNA is crucial for studying genetic expression. In this study, a new experimental and theoretical approach is proposed to obtain kinetic rates with better accuracy by categorizing mRNA based on the number of ribosomes and using ribo-sequencing techniques.

PLOS COMPUTATIONAL BIOLOGY (2023)

Article Biochemical Research Methods

dipwmsearch: a Python package for searching di-PWM motifs

Marie Mille, Julie Ripoll, Bastien Cazaux, Eric Rivals

Summary: This study proposes a Python package called dipwmsearch, which offers an original and efficient algorithm for searching for occurrences of dinucleotide PWMs in sequences. The package allows the enumeration of matching words and simultaneous searching in the sequence, even if it contains IUPAC codes. Users can easily install dipwmsearch via Pypi or conda, and they also have access to comprehensive documentation and executable scripts for using dinucleotide PWMs.

BIOINFORMATICS (2023)

Article Biochemical Research Methods

EPIK: precise and scalable evolutionary placement with informative k-mers

Nikolai Romashchenko, Benjamin Linard, Fabio Pardi, Eric Rivals

Summary: Motivation Phylogenetic placement is a method for analyzing massive collections of newly sequenced DNA using a high-quality reference tree. Alignment-free approaches based on phylo-k-mers have emerged to simplify the process, but are limited by data preprocessing and the large number of k-mers to consider. The authors propose a filtering method based on mutual information to select informative phylo-k-mers, improving efficiency at the cost of a slight loss in accuracy. They develop the tools IPK and EPIK, which outperform previous software and provide fast and accurate phylogenetic placement.

BIOINFORMATICS (2023)

Article Biochemical Research Methods

Computing Phylo-k-Mers

Nikolai Romashchenko, Benjamin Linard, Eric Rivals, Fabio Pardi

Summary: The paper introduces a method for computing phylo-k-mers based on the concept of phylogenetically-informative k-mers and proposes algorithms to solve the computational problem. In practice, this method can efficiently find k-mers with probabilities above a given threshold in a phylogenetic tree.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2023)
