Article

Using Random Forest Algorithm to Predict β-Hairpin Motifs

Journal

PROTEIN AND PEPTIDE LETTERS
Volume 18, Issue 6, Pages 609-617

Publisher

BENTHAM SCIENCE PUBL LTD
DOI: 10.2174/092986611795222777

Keywords

Amino acids component of position; auto-correlation function; beta-hairpin motif; hydropathy component of position; predicted secondary structure information; random forest algorithm

Funding

  1. National Natural Science Foundation of China [30960090]
  2. Natural Science Foundation of the Inner Mongolia of China [2009MS0111]
  3. university of Inner Mongolia of China

A novel method is presented for predicting beta-hairpin motifs in protein sequences. The method applies the Random Forest algorithm to multiple characteristic parameters: the amino acids component of position, the hydropathy component of position, predicted secondary structure information, and the value of the auto-correlation function. First, the method is trained and tested on a set of 8,291 beta-hairpin motifs and 6,865 non-beta-hairpin motifs. The overall accuracy and Matthews correlation coefficient reach 82.2% and 0.64 under 5-fold cross-validation, and 81.7% and 0.63 on the independent test. Second, the method is tested on a set of 4,884 beta-hairpin motifs and 4,310 non-beta-hairpin motifs used in previous studies. The overall accuracy and Matthews correlation coefficient reach 80.9% and 0.61 under 5-fold cross-validation, and 80.6% and 0.60 on the independent test, which improves on the previously reported results. Third, with the 4,884 beta-hairpin motifs and 4,310 non-beta-hairpin motifs as the training set and the 8,291 beta-hairpin motifs and 6,865 non-beta-hairpin motifs as the independent test set, the overall accuracy and Matthews correlation coefficient reach 81.5% and 0.63.
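The two evaluation measures reported in the abstract, overall accuracy and the Matthews correlation coefficient, together with a 5-fold split, can be sketched in plain Python as follows. This is a minimal illustration and not the authors' code: the function names (`accuracy`, `mcc`, `k_fold_splits`) and the confusion counts in the usage lines are hypothetical, and the folds are taken contiguously without shuffling, which a real experiment would normally add.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Overall accuracy: fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def k_fold_splits(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Assumption: samples are already in random order; no shuffling is done here.
    """
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Hypothetical confusion counts, chosen only to exercise the functions.
example_acc = accuracy(tp=80, tn=70, fp=30, fn=20)
example_mcc = mcc(tp=80, tn=70, fp=30, fn=20)
folds = list(k_fold_splits(12, k=5))
```

In a cross-validation run, each fold's test indices would be scored with a classifier trained on the matching train indices, and the confusion counts accumulated over folds before computing the two measures.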
