Article; Proceedings Paper

Computationally predicting protein-RNA interactions using only positive and unlabeled examples

JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY (2015)

Journal

JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY

Volume 13, Issue 3, Pages -

Publisher

IMPERIAL COLLEGE PRESS

DOI: 10.1142/S021972001541005X

Keywords

Protein-RNA interactions; biased-SVM; prediction

Categories

Biochemical Research Methods; Computer Science, Interdisciplinary Applications; Mathematical & Computational Biology

Funding

China 863 Program [2012AA020403]
National Natural Science Foundation of China (NSFC) [61173118, 61272380]

Abstract

Protein-RNA interactions (PRIs) play important roles in a wide variety of cellular processes, ranging from transcriptional and post-transcriptional regulation of gene expression to the host's active defense against viruses. With the development of high-throughput technology, a large amount of PRI information is now available for computationally predicting unknown PRIs. In recent years, a number of computational methods for predicting PRIs have been developed; to train classifiers, these methods usually construct negative samples artificially from verified nonredundant PRI datasets. However, such samples are not true negatives, and some may even be unknown positives. Consequently, classifiers trained on such datasets cannot achieve satisfactory prediction performance. In this paper, we propose a novel method, PRIPU, that employs a biased support vector machine (biased-SVM) for predicting Protein-RNA Interactions using only Positive and Unlabeled examples. To the best of our knowledge, this is the first work that predicts PRIs using only positive and unlabeled samples. We first collect known PRIs as our benchmark datasets and extract sequence-based features to represent each PRI. To reduce the dimension of the feature vectors and thereby lower the computational cost, we select a subset of features with a filter-based feature selection method. Then, biased-SVM is employed to train prediction models on the different PRI datasets. To evaluate the new method, we also propose a new performance measure called explicit positive recall (EPR), which is specifically suited to the task of learning from positive and unlabeled data. Experimental results over three datasets show that our method not only outperforms four existing methods but is also able to predict unknown PRIs. Source code, datasets and related documents of PRIPU are available at: http://admis.fudan.edu.cn/projects/pripu.htm.
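To illustrate the idea behind biased-SVM as it is commonly formulated for positive-unlabeled learning, the following is a minimal sketch, not the PRIPU implementation. Unlabeled examples are treated as noisy negatives, and hinge-loss errors on known positives are penalized more heavily (`c_pos`) than errors on unlabeled examples (`c_neg`). The linear kernel, the subgradient-descent training loop, the parameter values, and the 2-D toy vectors are all assumptions made for illustration; the abstract does not specify the kernel, the features, or the penalty settings used in the paper.

```python
import random

def train_biased_svm(X, y, c_pos=10.0, c_neg=1.0, lr=0.005, epochs=300, seed=0):
    """Linear biased-SVM sketch trained by stochastic subgradient descent.

    y[i] = +1 for a known positive, -1 for an unlabeled example.
    Objective (assumed form):
        0.5 * ||w||^2 + c_pos * sum(hinge on positives)
                      + c_neg * sum(hinge on unlabeled)
    """
    rng = random.Random(seed)
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            xi, yi = X[i], y[i]
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            # Asymmetric penalty: mistakes on known positives cost more.
            cost = c_pos if yi > 0 else c_neg
            # Regularization step, spread evenly over the n examples.
            w = [wj * (1.0 - lr / n) for wj in w]
            # Hinge-loss subgradient step when the margin is violated.
            if yi * score < 1.0:
                w = [wj + lr * cost * yi * xj for wj, xj in zip(w, xi)]
                b += lr * cost * yi
    return w, b

def decision(w, b, x):
    """Signed score; a positive value means 'predicted interaction'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

# Hypothetical toy data standing in for feature vectors of protein-RNA pairs.
labeled_pos = [[2, 2], [2, 3], [3, 2], [3, 3], [4, 3], [3, 4]]
hidden_pos = [[3, 3.5], [3.5, 3]]   # true positives hiding in the unlabeled set
true_neg = [[-2, -2], [-2, -3], [-3, -2], [-3, -3], [-4, -3], [-3, -4]]

X = labeled_pos + hidden_pos + true_neg
y = [1] * len(labeled_pos) + [-1] * (len(hidden_pos) + len(true_neg))

w, b = train_biased_svm(X, y)
for x in hidden_pos:
    print(x, "score:", round(decision(w, b, x), 3))
```

Because `c_pos` is larger than `c_neg`, the model is reluctant to give up margin on known positives just to fit the "negative" label of an unlabeled example that resembles them, which is how positives hidden in the unlabeled set can still receive a positive score. In practice the two penalties would be tuned on held-out data; the values above are arbitrary.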
