Journal
PROTEIN AND PEPTIDE LETTERS
Volume 18, Issue 6, Pages 609-617
Publisher
BENTHAM SCIENCE PUBL LTD
DOI: 10.2174/092986611795222777
Keywords
Amino acids component of position; auto-correlation function; beta-hairpin motif; hydropathy component of position; predicted secondary structure information; random forest algorithm
Funding
- National Natural Science Foundation of China [30960090]
- Natural Science Foundation of the Inner Mongolia of China [2009MS0111]
- University of Inner Mongolia of China
A novel method is presented for predicting beta-hairpin motifs in protein sequences. It applies the Random Forest algorithm to multiple characteristic parameters: amino acids component of position, hydropathy component of position, predicted secondary structure information, and the value of the auto-correlation function. Firstly, the method is trained and tested on a set of 8,291 beta-hairpin motifs and 6,865 non-beta-hairpin motifs. The overall accuracy and Matthews correlation coefficient reach 82.2% and 0.64 with 5-fold cross-validation, and 81.7% and 0.63 on the independent test. Secondly, the method is tested on a set of 4,884 beta-hairpin motifs and 4,310 non-beta-hairpin motifs that was used in previous studies. The overall accuracy and Matthews correlation coefficient reach 80.9% and 0.61 with 5-fold cross-validation, and 80.6% and 0.60 on the independent test. These results are better than those reported in the previous studies. Thirdly, with the 4,884 beta-hairpin motifs and 4,310 non-beta-hairpin motifs as the training set and the 8,291 beta-hairpin motifs and 6,865 non-beta-hairpin motifs as the independent test set, the overall accuracy and Matthews correlation coefficient reach 81.5% and 0.63.
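The abstract reports two evaluation measures, overall accuracy and the Matthews correlation coefficient, without giving their formulas. The sketch below assumes the standard binary-classification definitions computed from confusion-matrix counts. The function name `accuracy_and_mcc` and the example counts are illustrative assumptions, not values or code from the paper.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (overall accuracy, Matthews correlation coefficient).

    tp/fn: hairpin motifs predicted correctly / incorrectly.
    tn/fp: non-hairpin motifs predicted correctly / incorrectly.
    Standard definitions are assumed; the abstract does not state them.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    acc = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


# Hypothetical counts, chosen only to illustrate the calculation.
acc, mcc = accuracy_and_mcc(tp=80, tn=70, fp=30, fn=20)
print(f"accuracy = {acc:.3f}, MCC = {mcc:.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four confusion-matrix counts, so it stays informative when the positive and negative sets differ in size, as they do in the datasets described above.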