Article Proceedings Paper

Classification of Small GTPases with Hybrid Protein Features and Advanced Machine Learning Techniques

Journal

CURRENT BIOINFORMATICS
Volume 13, Issue 5, Pages 492-500

Publisher

BENTHAM SCIENCE PUBL LTD
DOI: 10.2174/1574893612666171121162552

Keywords

Small GTPase; binary-class classification; feature vector; gradient boosting decision tree (GBDT); scikit-learn method; motif

Funding

  1. Natural Science Foundation of Fujian Province of China [2016J01152]
  2. National Natural Science Foundation of China [61573235, 61272315, 61302139, 61370010]

Objective: Small GTPases are important molecular switches that play a key role in numerous signal transduction pathways. The aim of this study is to explore features for their binary classification using machine learning algorithms. Methods: Small GTPase and non-small GTPase sequences were clustered separately to remove similar entries. They were then divided into 10 datasets, each containing equal numbers of small GTPases and non-small GTPases. Three feature vectors were extracted from these datasets: 188-dimensional (188D), 400D, and motif-based (608D) features. Classification was then performed with the easy-classify.py software, which is based on scikit-learn and integrates 12 classifiers. Finally, conserved motifs were discovered with the MEME suite. Results: The three best-performing classifiers were logistic regression (LR), gradient boosting decision tree (GBDT), and bagging for the 400D features; LibSVM, GBDT, and bagging for the 188D features; and GBDT, bagging, and AdaBoost for the 608D features. Taking the commonly used evaluation indices as a whole, the top four classifiers were GBDT, bagging, LR, and AdaBoost. GBDT obtained the highest area under the curve (AUC) value, 88.61%. The 400D features performed better than the 188D and 608D features. Five conserved G-box motifs were discovered in the sequences of human small GTPases. Conclusion: This study provides the first report that the GBDT algorithm performs best for small GTPase classification.
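
The dataset construction and feature extraction steps in the Methods can be illustrated with a minimal Python sketch. It rests on two assumptions that the abstract does not confirm: that the 400D vector is the dipeptide composition (20 x 20 ordered amino acid pairs), and that each balanced dataset pairs all positive sequences with an equally sized random sample of negatives. The function names and the short sequences in the usage note are hypothetical and are not taken from easy-classify.py or the study's data.

```python
import random
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 ordered residue pairs: one feature per dipeptide.
# Assumption: this is the layout of the 400D feature vector.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
DIPEPTIDE_INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide frequencies for one sequence."""
    counts = [0] * len(DIPEPTIDES)
    total = 0
    seq = sequence.upper()
    for i in range(len(seq) - 1):
        idx = DIPEPTIDE_INDEX.get(seq[i:i + 2])
        if idx is not None:  # skip pairs with non-standard residues such as X
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [c / total for c in counts]


def balanced_datasets(positives, negatives, n_sets=10, seed=0):
    """Build n_sets datasets of (feature_vector, label) rows.

    Assumption: each dataset holds every positive sequence (label 1) plus
    an equally sized random sample of negatives (label 0). Requires
    len(negatives) >= len(positives); random.sample raises ValueError otherwise.
    """
    rng = random.Random(seed)
    datasets = []
    for _ in range(n_sets):
        sampled = rng.sample(negatives, len(positives))
        rows = [(dipeptide_composition(s), 1) for s in positives]
        rows += [(dipeptide_composition(s), 0) for s in sampled]
        datasets.append(rows)
    return datasets
```

For example, `balanced_datasets(["MKKL", "GAGA"], ["AAAA", "CCCC", "DDDD", "EEEE"])` yields 10 datasets of four rows each, two per class. The feature vectors and labels from each dataset could then be passed to a scikit-learn estimator such as `sklearn.ensemble.GradientBoostingClassifier`, although the exact settings used in the study are not stated in the abstract.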
