4.0 Article Proceedings Paper

Prediction of protein-RNA residue-base contacts using two-dimensional conditional random field with the lasso

Journal

BMC SYSTEMS BIOLOGY
Volume 7, Issue -, Pages -

Publisher

BMC
DOI: 10.1186/1752-0509-7-S2-S15

Keywords

-

Funding

  1. Grants-in-Aid for Scientific Research [24500361] Funding Source: KAKEN

Ask authors/readers for more resources

Background: To uncover molecular functions and networks in biological cellular systems, it is important to dissect interactions between proteins and RNAs. Many studies have been performed to investigate and analyze interactions between protein amino acid residues and RNA bases. In terms of interactions between residues in proteins, it is generally accepted that an amino acid residue at interacting sites has coevolved together with the partner residue in order to keep the interaction between residues in proteins. Based on this hypothesis, in our previous study to identify residue-residue contact pairs in interacting proteins, we made calculations of mutual information (M I) between amino acid residues from some multiple sequence alignment of homologous proteins, and combined it with a discriminative random field (DRF) approach, which is a special type of conditional random fields (CRFs) and has been proved useful for the purpose of extracting distinguishing areas from a photograph in the image processing field. Recently, the evolutionary correlation of interactions between residues and DNA bases has also been found in certain transcription factors and the DNA-binding sites. Results: In this paper, we employ more generic two-dimensional CRFs than such DRFs to predict interactions between protein amino acid residues and RNA bases. In addition, we introduce labels representing kinds of amino acids and bases as local features of a CRF. Furthermore, we examine the utility of L-1-norm regularization (lasso) for the CRF. For evaluation of our method, we use residue-base interactions between several Pfam domains and Rfam entries, conduct cross-validation, and calculate the average AUC (Area under ROC Curve) score. The results suggest that our CRF-based method using mutual information and labels with the lasso is useful for further improving the performance, especially provided that the features of CRF are successfully reduced by the lasso approach. Conclusions: We propose simple and generic two-dimensional CRF models using labels and mutual information with the lasso. Use of the CRF-based method in combination with the lasso is particularly useful for predicting the residue-base contacts in protein-RNA interactions.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.0
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available