4.3 Article

MLACP: machine-learning-based prediction of anticancer peptides

Journal

ONCOTARGET
Volume 8, Issue 44, Pages 77121-77136

Publisher

IMPACT JOURNALS LLC
DOI: 10.18632/oncotarget.20365

Keywords

anticancer peptides; hybrid model; machine-learning parameters; random forest; support vector machine

Funding

  1. National Research Foundation (NRF) of Korea - Ministry of Education, Science and Technology [2015R1D1A1A09060192]
  2. National Research Foundation of Korea (NRF) - Ministry of Education, Science and Technology [2009-0093826]
  3. National Research Foundation of Korea (NRF) - Ministry of Science, ICT and Future Planning [2017R1A2B4010084]
  4. National Research Foundation of Korea (NRF) - Ministry of Science, ICT & Future Planning [2016M3C7A1904392]
  5. National Research Foundation of Korea [2017R1A2B4010084, 2016M3C7A1904392, 2015R1D1A1A09060192] Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)

Cancer is the second leading cause of death globally, and the use of therapeutic peptides to target and kill cancer cells has received considerable attention in recent years. Identification of anticancer peptides (ACPs) through wet-lab experimentation is expensive and often time-consuming; therefore, development of an efficient computational method is essential to identify potential ACP candidates prior to in vitro experimentation. In this study, we developed support vector machine- and random forest-based machine-learning methods for the prediction of ACPs using features calculated from the amino acid sequence, including amino acid composition, dipeptide composition, atomic composition, and physicochemical properties. We trained our methods using the Tyagi-B dataset and determined the machine-learning parameters by 10-fold cross-validation. Furthermore, we evaluated the performance of our methods on two benchmarking datasets; the random forest-based method outperformed the existing methods, with an average accuracy of 88.7% and an average Matthews correlation coefficient of 0.78. To assist the scientific community, we also developed a publicly accessible web server at www.thegleelab.org/MLACP.html.
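As a rough illustration of two of the sequence features and the two evaluation metrics named in the abstract, the following minimal sketch computes amino acid composition, dipeptide composition, accuracy, and the Matthews correlation coefficient in plain Python. It is not the authors' implementation: the function names, the handling of non-standard residues, and the zero-denominator convention for MCC are assumptions made for this example only.

```python
import math
from itertools import product

# The 20 standard amino acids, in a fixed order so feature vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return 20 values: the fraction of the sequence made up of each residue.

    Assumption: non-standard letters count toward the length but get no feature.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 values: the fraction of adjacent residue pairs of each type."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not pairs:
        return [0.0] * 400  # a single residue has no dipeptides
    return [pairs.count(a + b) / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]


def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions from confusion-matrix counts."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Assumption: returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

The two composition vectors can be concatenated into a single 420-dimensional input for a classifier such as a support vector machine or random forest; the atomic composition and physicochemical features mentioned in the abstract are not covered by this sketch.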

Recommended

Article Biochemical Research Methods

SiameseCPP: a sequence-based Siamese network to predict cell-penetrating peptides by contrastive learning

Xin Zhang, Lesong Wei, Xiucai Ye, Kai Zhang, Saisai Teng, Zhongshen Li, Junru Jin, Minjae Kim, Tetsuya Sakurai, Lizhen Cui, Balachandran Manavalan, Leyi Wei

Summary: A novel deep learning framework SiameseCPP is proposed for automated prediction of cell-penetrating peptides (CPPs). SiameseCPP learns discriminative representations of CPPs based on a well-pretrained model and a Siamese neural network comprising a transformer and gated recurrent units. Comprehensive experiments demonstrate that SiameseCPP outperforms existing baseline models for CPP prediction and exhibits satisfactory generalization ability on other functional peptide datasets.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biochemistry & Molecular Biology

GPApred: The first computational predictor for identifying proteins with LPXTG-like motif using sequence-based optimal features

Adeel Malik, Watshara Shoombuatong, Chang-Bae Kim, Balachandran Manavalan

Summary: A machine learning-based predictor called GPApred was developed to identify LPXTG-like proteins from their primary sequences. This predictor can be utilized for functional characterization and drug targeting in further research.

INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES (2023)

Article Biochemical Research Methods

VirPipe: an easy-to-use and customizable pipeline for detecting viral genomes from Nanopore sequencing

Kijin Kim, Kyungmin Park, Seonghyeon Lee, Seung-Hwan Baek, Tae-Hun Lim, Jongwoo Kim, Balachandran Manavalan, Jin-Won Song, Won-Keun Kim

Summary: VirPipe is a new pipeline for detecting viral genomes from Nanopore or Illumina sequencing, with streamlined installation and customization.

BIOINFORMATICS (2023)

Article Biology

Computational prediction of protein folding rate using structural parameters and network centrality measures

Saraswathy Nithiyanandam, Vinoth Kumar Sangaraju, Balachandran Manavalan, Gwang Lee

Summary: Protein folding is a complex process where a polymer of amino acids transitions from an unfolded state to a unique three-dimensional structure. Previous studies have identified structural parameters and examined their relationship with protein folding rate, but these parameters are only applicable to a limited set of proteins. Machine learning models have been proposed, but they fail to explain plausible folding mechanisms. In this study, ten different machine learning algorithms were evaluated using various structural parameters and network centrality measures, with support vector machine showing the best predictive capability.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Article Biology

PSRQSP: An effective approach for the interpretable prediction of quorum sensing peptide using propensity score representation learning

Phasit Charoenkwan, Pramote Chumnanpuen, Nalini Schaduangrat, Changmin Oh, Balachandran Manavalan, Watshara Shoombuatong

Summary: In this study, a novel computational approach called PSRQSP was developed to improve the prediction and analysis of quorum sensing peptides (QSPs). Experimental results showed that PSRQSP outperformed conventional methods in identifying QSPs. An easy-to-use web server was also constructed to accelerate the discovery of potential QSPs for drug development.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Article Biochemistry & Molecular Biology

Pretoria: An effective computational approach for accurate and high-throughput identification of CD8+ T-cell epitopes of eukaryotic pathogens

Phasit Charoenkwan, Nalini Schaduangrat, Nhat Truong Pham, Balachandran Manavalan, Watshara Shoombuatong

Summary: This study proposes the first stack-based approach, Pretoria, for accurate and large-scale identification of CD8+ T-cell epitopes (TCEs) of eukaryotic pathogens. A pool of 144 different machine learning (ML)-based classifiers was constructed from 12 popular ML algorithms, and a feature selection method was used to determine the important ML classifiers for building the stacked model. Experimental results demonstrated that Pretoria outperformed several conventional ML classifiers and the existing method, with an accuracy of 0.866, MCC of 0.732, and AUC of 0.921 in the independent test.

INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES (2023)

Article Biochemistry & Molecular Biology

PRR-HyPred: A two-layer hybrid framework to predict pattern recognition receptors and their families by employing sequence encoded optimal features

Ahmad Firoz, Adeel Malik, Hani Mohammed Ali, Yusuf Akhter, Balachandran Manavalan, Chang-Bae Kim

Summary: In this study, a new two-layer hybrid framework called PRR-HyPred was constructed to simultaneously predict and classify pattern recognition receptors (PRRs). Using support vector machine- and random forest-based classifiers, PRR-HyPred achieved accuracies of 83.4% and 95% in the first and second layers, respectively. This is the first study that can predict and classify PRRs into specific families, and it can be a valuable tool for large-scale PRR prediction and classification, facilitating future studies.

INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES (2023)

Article Computer Science, Artificial Intelligence

MonkeyNet: A robust deep convolutional neural network for monkeypox disease detection and classification

Diponkor Bala, Md. Shamim Hossain, Mohammad Alamgir Hossain, Md. Ibrahim Abdullah, Md. Mizanur Rahman, Balachandran Manavalan, Naijie Gu, Mohammad S. Islam, Zhangjin Huang

Summary: The monkeypox virus poses a new pandemic threat. However, there is currently no reliable monkeypox database available for training and testing deep learning models. The MSID dataset has been developed for this purpose, providing a collection of monkeypox patient images for building confident deep learning models. The proposed MonkeyNet model can accurately identify monkeypox disease and assist doctors in making early diagnoses.

NEURAL NETWORKS (2023)

Review Biochemical Research Methods

A comprehensive revisit of the machine-learning tools developed for the identification of enhancers in the human genome

Le Thi Phan, Changmin Oh, Tao He, Balachandran Manavalan

Summary: Enhancers are non-coding DNA elements that enhance the transcription rate of specific genes. Computational platforms have been developed to complement experimental methods in identifying enhancers. This review provides an overview of machine learning-based prediction methods and databases for enhancer identification and discusses the advantages and drawbacks of these methods, as well as guidelines for developing more efficient enhancer predictors.

PROTEOMICS (2023)

Editorial Material Medicine, Research & Experimental

How well does a data-driven prediction method distinguish dihydrouridine from tRNA and mRNA?

Shaherin Basith, Balachandran Manavalan

MOLECULAR THERAPY-NUCLEIC ACIDS (2023)

Article Toxicology

Reduced lysosomal activity and increased amyloid beta accumulation in silica-coated magnetic nanoparticles-treated microglia

Tae Hwan Shin, Gwang Lee

Summary: Nanoparticles have been widely used in neurological research, but their potential toxicity remains a concern. This study investigated the effects of silica-coated magnetic nanoparticles on BV2 microglial cells and found that the nanoparticles induced amyloid beta accumulation and changes in lysosomal function. By employing triple-omics analysis, it was revealed that the nanoparticles caused a reduction in proteasome activity and lysosomal swelling. However, co-treatment with glutathione and citrate alleviated these effects.

ARCHIVES OF TOXICOLOGY (2023)

Article Computer Science, Artificial Intelligence

Hybrid data augmentation and deep attention-based dilated convolutional-recurrent neural networks for speech emotion recognition

Nhat Truong Pham, Duc Ngoc Minh Dang, Ngoc Duy Nguyen, Thanh Thi Nguyen, Hai Nguyen, Balachandran Manavalan, Chee Peng Lim, Sy Dzung Nguyen

Summary: This paper proposes a deep learning framework for speech emotion recognition, which combines a hybrid data augmentation method and deep attention-based dilated convolutional-recurrent neural networks. The framework is able to extract high-level representations from three-dimensional log Mel spectrogram features. Experimental results show that the proposed framework outperforms other state-of-the-art methods on the EmoDB and ERC datasets.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Article Physics, Multidisciplinary

On the classical reaction rate and the first-time problems of Brownian motion

Aihua Zhang, Sun Choi

Summary: We have developed efficient techniques to solve the first-time problems of Brownian motion. Using a time-scale separation of recrossings, we have shown that Eyring's transmission coefficient (κ) equals the one (κ_V) corresponding to an absorbing boundary consistent with the transition state theory, which is greater than the one (κ_K) derived by Kramers. We have also proposed methods for reaction rate determination by analyzing short-time trajectories from the barrier maximum, and discussed the relation to the reactive flux method and the significance of reaction coordinates.

PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS (2023)

Article Oncology

AMPK-HIF-1α signaling enhances glucose-derived de novo serine biosynthesis to promote glioblastoma growth

Hye Jin Yun, Min Li, Dong Guo, So Mi Jeon, Su Hwan Park, Je Sun Lim, Su Bin Lee, Rui Liu, Linyong Du, Seok-Ho Kim, Tae Hwan Shin, Seong-il Eyun, Yun-Yong Park, Zhimin Lu, Jong-Ho Lee

Summary: This study reveals that enhanced glucose-derived de novo serine biosynthesis is a critical metabolic feature of glioblastoma (GBM) cells under metabolic stress, and highlights the potential of targeting the serine synthesis pathway (SSP) for treating human GBM.

JOURNAL OF EXPERIMENTAL & CLINICAL CANCER RESEARCH (2023)

Article Biology

Unveiling local and global conformational changes and allosteric communications in SOD1 systems using molecular dynamics simulation and network analyses

Shaherin Basith, Balachandran Manavalan, Gwang Lee

Summary: This study combined microsecond-scale unbiased molecular dynamics simulation with network analysis to elucidate the local and global conformational changes and allosteric communications in SOD1 systems. Structural analyses revealed significant variations in catalytic sites and stability due to unmetallated SOD1 systems and cysteine mutations. Dynamic motion analysis showed balanced atomic displacement and highly correlated motions in the Holo system.

COMPUTERS IN BIOLOGY AND MEDICINE (2024)
