Article

Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts

Journal

NUCLEIC ACIDS RESEARCH
Volume 41, Issue 17

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkt646

Funding

  1. Training Program of the Major Research Plan of the National Natural Science Foundation of China [91229120]
  2. International Science and Technology Cooperation Projects [2010DFA31840, 2010DFB33720]

Classifying transcripts as protein-coding or non-coding is a challenge, especially for transcripts reconstructed from high-throughput sequencing data of poorly annotated species. This study developed and evaluated a signature tool, the Coding-Non-Coding Index (CNCI), which profiles adjoining nucleotide triplets to distinguish protein-coding from non-coding sequences independently of known annotations. CNCI is effective for classifying incomplete transcripts and sense-antisense pairs. Applied in a cross-species manner, CNCI gave highly accurate classification of transcripts assembled from whole-transcriptome sequencing data, demonstrated gene evolutionary divergence between vertebrates and invertebrates, or between plants, and provided a long non-coding RNA catalog of orangutan. CNCI software is available at http://www.bioinfo.org/software/cnci.
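To make "profiling adjoining nucleotide triplets" concrete, the sketch below counts how often each pair of adjacent, non-overlapping triplets occurs in one reading frame of a sequence. It is an illustration only, not the CNCI implementation: the function names, the `frame` parameter, and the normalization to relative frequencies are assumptions made for this example, and the abstract does not describe how CNCI turns such profiles into a classification score.

```python
from collections import Counter


def adjoining_triplet_counts(seq, frame=0):
    """Count pairs of adjacent, non-overlapping triplets in one reading frame.

    Hypothetical helper for illustration; not part of the CNCI software.
    """
    # Normalize case and treat RNA input (U) as DNA (T).
    seq = seq.upper().replace("U", "T")
    # Split into consecutive triplets starting at offset `frame`;
    # a trailing partial triplet is dropped.
    triplets = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # Pair every triplet with the one that immediately follows it.
    return Counter(zip(triplets, triplets[1:]))


def adjoining_triplet_frequencies(seq, frame=0):
    """Relative frequency of each adjoining triplet pair (assumed normalization)."""
    counts = adjoining_triplet_counts(seq, frame)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


if __name__ == "__main__":
    # Toy sequence, chosen only to show the shape of the output.
    for pair, freq in adjoining_triplet_frequencies("ATGGCCATGGCC").items():
        print(pair, round(freq, 3))
```

A profile like this can be computed for each of the three forward reading frames (and for the reverse complement) by varying `frame`. How such profiles are compared between coding and non-coding training sets is left out here because the abstract does not specify it.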
