Article

A common set of distinct features that characterize noncoding RNAs across multiple species

Journal

NUCLEIC ACIDS RESEARCH
Volume 43, Issue 1, Pages 104-114

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gku1316

Funding

  1. National Key Basic Research Program [2012CB316503]
  2. National High-tech Research and Development Program of China [2014AA021103]
  3. National Natural Science Foundation of China [31271402, 31100601, 91019016, 31361163004]
  4. National Institutes of Health [HG001696, ES017166]
  5. Hong Kong Research Grants Council Early Career Scheme [419612]
  6. NATIONAL HUMAN GENOME RESEARCH INSTITUTE [R01HG001696] Funding Source: NIH RePORTER
  7. NATIONAL INSTITUTE OF ENVIRONMENTAL HEALTH SCIENCES [U01ES017166] Funding Source: NIH RePORTER

To find signature features shared by various ncRNA subtypes and to characterize novel ncRNAs, we developed a method, RNAfeature, that investigates >600 sets of genomic and epigenomic data together with various evolutionary and biophysical scores. RNAfeature uses a fine-tuned intra-species wrapper algorithm followed by a novel cross-species feature selection strategy. It considers the long-distance effects of certain features (e.g. histone modification at the promoter region). We finally narrow the candidates down to 10 informative features (including sequences, structures, expression profiles and epigenetic signals). These features are complementary to each other and, taken as a whole, accurately distinguish canonical ncRNAs from CDSs and UTRs (accuracies: >92% in human, mouse, worm and fly). Moreover, the feature pattern is conserved across multiple species. For instance, the supervised 10-feature model derived from animal species can predict ncRNAs in Arabidopsis (accuracy: 82%). Subsequently, we integrate the 10 features to define a set of noncoding potential scores, which can identify, evaluate and characterize novel noncoding RNAs. The scores cover all transcribed regions (including unconserved ncRNAs) without requiring assembly of full-length transcripts. Importantly, the noncoding potential allows us to identify and characterize potential functional domains with feature patterns similar to canonical ncRNAs (e.g. tRNA, snRNA, miRNA, etc.) on ~70% of human long ncRNAs (lncRNAs).
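
The two-stage selection described in the abstract (a wrapper search within each species, then a filter across species) can be illustrated with a minimal sketch. This is not the RNAfeature implementation: the nearest-centroid classifier, the leave-one-out scoring, the greedy forward search, the "selected in every species" rule, and all names and toy values below are assumptions chosen only to make the idea concrete.

```python
from statistics import mean

def centroid_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier
    that only looks at the columns listed in `features`."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in sorted(set(y)):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [mean(r[f] for r in rows) for f in features]
        point = [X[i][f] for f in features]
        pred = min(
            centroids,
            key=lambda lab: sum((p - c) ** 2 for p, c in zip(point, centroids[lab])),
        )
        correct += pred == y[i]
    return correct / len(X)

def forward_wrapper(X, y, n_features, min_gain=1e-9):
    """Greedy forward wrapper: repeatedly add the feature that most improves
    the classifier's accuracy, and stop when nothing improves it."""
    selected, best = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        scored = [(centroid_accuracy(X, y, selected + [f]), f) for f in remaining]
        # Ties are broken in favour of the lower feature index.
        score, f = max(scored, key=lambda t: (t[0], -t[1]))
        if score - best <= min_gain:
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected, best

def shared_features(per_species):
    """Assumed cross-species rule: keep features selected in every species."""
    return sorted(set.intersection(*(set(s) for s in per_species.values())))

# Hypothetical toy data: 3 features per region, label 1 = ncRNA, 0 = CDS/UTR.
TOY = {
    "human": (
        [[0.9, 0.5, 0.1], [0.8, 0.4, 0.9], [0.85, 0.6, 0.5],
         [0.1, 0.5, 0.2], [0.2, 0.4, 0.8], [0.15, 0.6, 0.5]],
        [1, 1, 1, 0, 0, 0],
    ),
    "mouse": (
        [[0.7, 0.2, 0.3], [0.75, 0.9, 0.6], [0.95, 0.5, 0.4],
         [0.3, 0.3, 0.5], [0.05, 0.8, 0.35], [0.25, 0.5, 0.6]],
        [1, 1, 1, 0, 0, 0],
    ),
}

# Stage 1: wrapper selection within each species.
PER_SPECIES = {name: forward_wrapper(X, y, 3)[0] for name, (X, y) in TOY.items()}
# Stage 2: keep only the features that survive in all species.
SHARED = shared_features(PER_SPECIES)
```

In this toy setting only feature 0 separates the two classes, so the wrapper stops after selecting it in both species and it is the single shared feature. A real analysis would start from hundreds of candidate features and use a stronger classifier with proper cross-validation.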
