Article

Meta-iPVP: a sequence-based meta-predictor for improving the prediction of phage virion proteins using effective feature representation

Journal

JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN
Volume 34, Issue 10, Pages 1105-1116

Publisher

SPRINGER
DOI: 10.1007/s10822-020-00323-z

Keywords

Phage virion protein; Machine learning; Classification; Feature selection; Support vector machine; Meta-predictor

Funding

  1. TRF Research Grant for New Scholar [MRG6180226]
  2. College of Arts, Media and Technology, Chiang Mai University
  3. TRF Research Career Development Grant from the Thailand Research Fund [RSA6280075]
  4. Office of Higher Education Commission
  5. Mahidol University

Abstract

Phage virion proteins (PVPs) perforate the host cell membrane, eventually culminating in cell rupture and the release of replicated phages. Accurate identification of PVPs is thus a crucial step towards improving our understanding of their biological functions and mechanisms, so a computational method capable of fast and accurate identification of PVPs is desirable. To address this, we propose a novel sequence-based meta-predictor employing probabilistic information (referred to herein as Meta-iPVP) for the accurate identification of PVPs. In particular, an efficient feature representation approach was used to generate discriminative probabilistic features from four machine learning (ML) algorithms making use of seven feature encodings. To the best of our knowledge, Meta-iPVP is the first meta-based approach developed for PVP prediction. Independent test results indicated that Meta-iPVP could discern important characteristics distinguishing PVPs from non-PVPs and achieved the best accuracy and MCC of 0.817 and 0.642, respectively, corresponding to improvements of 6-10% and 14-21% over existing PVP predictors. These results demonstrate that the proposed Meta-iPVP is a more efficient, robust and promising tool for the identification of PVPs. The predictive model is deployed as a publicly accessible Meta-iPVP web server that is freely available online.
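The probabilistic feature representation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the two encodings, the three-group residue alphabet, the centroid-based stand-in learners, and the toy sequences are all assumptions made for illustration, and the abstract does not say which encodings or algorithms were actually paired. The idea shown is only that every (encoding, learner) pair contributes one probability-like score, and the concatenated scores form the feature vector that a meta-classifier would consume.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aac(seq):
    """Amino acid composition: 20 relative residue frequencies."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]

# Hypothetical three-group residue alphabet (illustrative, not from the paper).
GROUPS = {"hydrophobic": "AVLIMFWC", "polar": "STNQYGPH", "charged": "DEKR"}

def group_composition(seq):
    """Fraction of residues falling into each illustrative group."""
    return [sum(seq.count(a) for a in members) / len(seq)
            for members in GROUPS.values()]

def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def manhattan(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))

# Stand-ins for the paper's seven encodings and four ML algorithms.
ENCODINGS = {"AAC": aac, "GROUP": group_composition}
LEARNERS = {"centroid-L2": euclidean, "centroid-L1": manhattan}

def fit_centroids(X, y):
    """Return (positive-class centroid, negative-class centroid)."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    return mean(pos), mean(neg)

def predict_proba(centroids, dist, x):
    """Probability-like score in [0, 1]; higher means closer to the positives."""
    d_pos, d_neg = dist(x, centroids[0]), dist(x, centroids[1])
    total = d_pos + d_neg
    return 0.5 if total == 0 else d_neg / total

def train_base_models(seqs, labels):
    """Train one baseline model per (encoding, learner) pair."""
    models = {}
    for (e_name, enc), (l_name, dist) in product(ENCODINGS.items(), LEARNERS.items()):
        X = [enc(s) for s in seqs]
        models[(e_name, l_name)] = (enc, dist, fit_centroids(X, labels))
    return models

def probabilistic_features(models, seq):
    """Concatenate one score per baseline model into a meta-feature vector."""
    return [predict_proba(c, dist, enc(seq)) for enc, dist, c in models.values()]

# Toy data: made-up sequences, 1 = PVP-like, 0 = non-PVP-like.
train_seqs = ["KKEEKKAEKK", "EKKEKEKKAE", "LLVVLLAVLL", "VLLVLVLLAV"]
train_labels = [1, 1, 0, 0]

models = train_base_models(train_seqs, train_labels)
features = probabilistic_features(models, "KEKKEEKKEA")
print(len(features), [round(p, 3) for p in features])
```

With two encodings and two learners the vector has four entries. If every one of the four algorithms were paired with every one of the seven encodings, the same construction would give 28 features. In a real stacking setup the training-set features would normally come from out-of-fold predictions so that the meta-classifier does not see scores produced by models trained on the same samples.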

Recommended

Article Biochemical Research Methods

BERT6mA: prediction of DNA N6-methyladenine site using deep learning-based approaches

Sho Tsukiyama, Md Mehedi Hasan, Hong-Wen Deng, Hiroyuki Kurata

Summary: In this study, a novel approach called BERT6mA was proposed for detecting the 6mA modification in DNA, and its performance was evaluated and compared. Through pretraining and fine-tuning, BERT6mA showed high performance in prediction and achieved good results even in species with small sample sizes. Furthermore, the study analyzed the process of feature extraction by the BERT6mA model.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Biology

DeepDNAbP: A deep learning-based hybrid approach to improve the identification of deoxyribonucleic acid-binding proteins

Md Faruk Hosen, S. M. Hasan Mahmud, Kawsar Ahmed, Wenyu Chen, Mohammad Ali Moni, Hong-Wen Deng, Watshara Shoombuatong, Md Mehedi Hasan

Summary: In this paper, a novel predictor called DeepDNAbP has been developed to accurately predict DNA-binding proteins (DBPs) using a convolutional neural network model. The predictor achieves superior performance in cross-validation tests and outperforms existing methods, making it a powerful computational resource for predicting DBPs.

COMPUTERS IN BIOLOGY AND MEDICINE (2022)

Article Multidisciplinary Sciences

AMYPred-FRL is a novel approach for accurate prediction of amyloid proteins by using feature representation learning

Phasit Charoenkwan, Saeed Ahmed, Chanin Nantasenamat, Julian M. W. Quinn, Mohammad Ali Moni, Pietro Lio, Watshara Shoombuatong

Summary: This study presents a novel meta-predictor, AMYPred-FRL, which utilizes a feature representation learning approach to identify amyloid proteins more accurately. By combining multiple machine learning algorithms and sequence-based feature descriptors, AMYPred-FRL generates 60 probabilistic features and forms a hybrid model. Through cross-validation and independent tests, AMYPred-FRL outperforms existing methods in predictive performance.

SCIENTIFIC REPORTS (2022)

Article Biology

SAPPHIRE: A stacking-based ensemble learning framework for accurate prediction of thermophilic proteins

Phasit Charoenkwan, Nalini Schaduangrat, Mohammad Ali Moni, Pietro Lio, Balachandran Manavalan, Watshara Shoombuatong

Summary: This study presents a novel computational method, SAPPHIRE, for accurately identifying thermophilic proteins (TPPs) using sequence information. The method combines different feature encodings and machine learning algorithms to train baseline models and extract key information of TPPs. SAPPHIRE outperforms existing methods in terms of predictive performance and achieves higher accuracy and correlation coefficient.

COMPUTERS IN BIOLOGY AND MEDICINE (2022)

Article Biology

NEPTUNE: A novel computational approach for accurate and large-scale identification of tumor homing peptides

Phasit Charoenkwan, Nalini Schaduangrat, Pietro Lio, Mohammad Ali Moni, Balachandran Manavalan, Watshara Shoombuatong

Summary: This study proposes a novel computational approach, NEPTUNE, for the accurate and large-scale identification of Tumor Homing Peptides (THPs) from sequence information. The results demonstrate that NEPTUNE achieves superior performance in THP prediction and improves interpretability using the SHapley additive explanations method.

COMPUTERS IN BIOLOGY AND MEDICINE (2022)

Article Biology

PSRTTCA: A new approach for improving the prediction and characterization of tumor T cell antigens using propensity score representation learning

Phasit Charoenkwan, Chonlatip Pipattanaboon, Chanin Nantasenamat, Md Mehedi Hasan, Mohammad Ali Moni, Pietro Lio, Watshara Shoombuatong

Summary: Despite existing cancer therapies, the development of new and effective treatments is necessary to address the ongoing cancer recurrence and new cases. This study proposes a new machine learning-based approach, PSRTTCA, for improving the identification and characterization of tumor T cell antigens (TTCAs) based on their primary sequences.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Article Biochemistry & Molecular Biology

GPApred: The first computational predictor for identifying proteins with LPXTG-like motif using sequence-based optimal features

Adeel Malik, Watshara Shoombuatong, Chang-Bae Kim, Balachandran Manavalan

Summary: A machine learning-based predictor called GPApred was developed to identify LPXTG-like proteins from their primary sequences. This predictor can be utilized for functional characterization and drug targeting in further research.

INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES (2023)

Article Anatomy & Morphology

Morphometric analysis of dry atlas vertebrae in a northeastern Thai population and possible correlation with sex

Chanasorn Poodendan, Athikhun Suwannakhan, Tidarat Chawalchitiporn, Yuichi Kasai, Chanin Nantasenamat, Laphatrada Yurasakpong, Sitthichai Iamsaard, Arada Chaiyamoon

Summary: This study investigated the morphometric parameters of the C1 vertebra and evaluated its potential for sex prediction. The results showed that the C1 vertebra was longer in males compared to females. Evaluation of these parameters is important for preoperative assessment and treatment of atlas dislocation, and they can also be used for sex prediction.

SURGICAL AND RADIOLOGIC ANATOMY (2023)

Article Biology

PSRQSP: An effective approach for the interpretable prediction of quorum sensing peptide using propensity score representation learning

Phasit Charoenkwan, Pramote Chumnanpuen, Nalini Schaduangrat, Changmin Oh, Balachandran Manavalan, Watshara Shoombuatong

Summary: In this study, a novel computational approach called PSRQSP was developed to improve the prediction and analysis of quorum sensing peptides (QSPs). Experimental results showed that PSRQSP outperformed conventional methods in identifying QSPs. An easy-to-use web server was also constructed to accelerate the discovery of potential QSPs for drug development.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Article Biochemistry & Molecular Biology

Pretoria: An effective computational approach for accurate and high-throughput identification of CD8+t-cell epitopes of eukaryotic pathogens

Phasit Charoenkwan, Nalini Schaduangrat, Nhat Truong Pham, Balachandran Manavalan, Watshara Shoombuatong

Summary: This study proposes Pretoria, the first stack-based approach for accurate and large-scale identification of CD8+ T-cell epitopes (TCEs) of eukaryotic pathogens. A pool of 144 machine learning (ML)-based classifiers was constructed from 12 popular ML algorithms, and a feature selection method was used to determine the important classifiers for building the stacked model. Experimental results demonstrated that Pretoria outperformed several conventional ML classifiers and the existing method, with an accuracy of 0.866, MCC of 0.732, and AUC of 0.921 in the independent test.

INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES (2023)

Article Biochemistry & Molecular Biology

Exploring the Chemical Space of CYP17A1 Inhibitors Using Cheminformatics and Machine Learning

Tianshi Yu, Tianyang Huang, Leiye Yu, Chanin Nantasenamat, Nuttapat Anuwongcharoen, Theeraphon Piacham, Ruobing Ren, Ying-Chih Chiang

Summary: Researchers studied Cytochrome P450 17A1 (CYP17A1), a key enzyme in steroidogenesis, and its potential as a druggable target for anti-cancer molecule development. They used cheminformatic analyses and quantitative structure-activity relationship (QSAR) modeling on a dataset of CYP17A1 inhibitors. Different models were built for steroidal and nonsteroidal inhibitors, achieving good accuracy. The findings provide valuable insights for further drug discovery efforts targeting CYP17A1 inhibitors.

MOLECULES (2023)

Article Chemistry, Multidisciplinary

DeepAR: a novel deep learning-based hybrid framework for the interpretable prediction of androgen receptor antagonists

Nalini Schaduangrat, Nuttapat Anuwongcharoen, Phasit Charoenkwan, Watshara Shoombuatong

Summary: This study proposes a novel deep learning (DL)-based hybrid framework, named DeepAR, to accurately and rapidly identify AR antagonists by using only the SMILES notation. Experimental results indicate that DeepAR is a more accurate and stable approach for identifying AR antagonists, achieving an accuracy of 0.911 and MCC of 0.823 on an independent test dataset. In addition, the framework provides feature importance information and allows for characterization and analysis of potential AR antagonist candidates.

JOURNAL OF CHEMINFORMATICS (2023)

Article Chemistry, Multidisciplinary

Cheminformatic Analysis and Machine Learning Modeling to Investigate Androgen Receptor Antagonists to Combat Prostate Cancer

Tianshi Yu, Chanin Nantasenamat, Supicha Kachenton, Nuttapat Anuwongcharoen, Theeraphon Piacham

Summary: This study used cheminformatic analysis and machine learning modeling to investigate the chemical space, scaffolds, structure-activity relationship, and landscape of human androgen receptor antagonists. The findings revealed differences in physicochemical properties between potent/active class molecules and intermediate/inactive class molecules. Low scaffold diversity was observed, especially in the potent/active class molecules, indicating the need for developing molecules with novel scaffolds. The study also identified significant activity cliff generators and provided insights and guidelines for the development of novel androgen receptor antagonists.

ACS OMEGA (2023)

Article Multidisciplinary Sciences

TROLLOPE: A novel sequence-based stacked approach for the accelerated discovery of linear T-cell epitopes of hepatitis C virus

Phasit Charoenkwan, Sajee Waramit, Pramote Chumnanpuen, Nalini Schaduangrat, Watshara Shoombuatong

Summary: Hepatitis C virus (HCV) infection causes chronic liver diseases, and there is no effective vaccine available. This study proposes a novel approach called TROLLOPE to accurately identify linear T-cell epitopes of HCV (TCE-HCVs) from sequence information, with superior predictive performance.

PLOS ONE (2023)
