4.7 Article

StackIL6: a stacking ensemble model for improving the prediction of IL-6 inducing peptides

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 22, Issue 6, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbab172

Keywords

interleukin 6; IL-6; bioinformatics; sequence analysis; machine learning; ensemble learning

Funding

  1. Chiang Mai University, Mahidol University [MRG6180226]
  2. TRF Research Career Development Grant [RSA6280075]
  3. National Research Foundation of Korea (NRF) - Korean government (MSIT) [2021R1A2C1014338, 2018R1D1A1B07049572]
  4. National Research Foundation of Korea [2018R1D1A1B07049572, 2021R1A2C1014338] Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)

IL-6 release is stimulated by antigenic peptides and immune cells, making IL-6 inducing peptides useful not only as diagnostic biomarkers, but also as inhibitors for aggressive immune responses. A novel stacking ensemble model, StackIL6, was developed using twelve feature descriptors and five machine learning algorithms, showing better performance than existing methods in identifying IL-6 inducing peptides. Accessible through a web server, StackIL6 has the potential to aid in the rapid screening of promising peptides for diagnostic and immunotherapeutic applications.
The release of interleukin (IL)-6 is stimulated by antigenic peptides from pathogens as well as by immune cells, and it activates aggressive inflammation. IL-6 inducing peptides are derived from pathogens and can be used as diagnostic biomarkers for predicting various stages of disease severity and as IL-6 inhibitors for suppressing aggressive multi-signaling immune responses. The accurate identification of IL-6 inducing peptides is therefore of great importance for investigating their mechanism of action and for developing diagnostic and immunotherapeutic applications. This study proposes a novel stacking ensemble model (termed StackIL6) for accurately identifying IL-6 inducing peptides. More specifically, StackIL6 was constructed from twelve different feature descriptors derived from three major groups of features (composition-based features, composition-transition-distribution-based features and physicochemical properties-based features) and five popular machine learning algorithms (extremely randomized trees, logistic regression, multi-layer perceptron, support vector machine and random forest). The resulting baseline models were systematically integrated through a stacking strategy to build the final meta-based model. Extensive benchmarking experiments demonstrated that StackIL6 achieved significantly better performance than the existing method (IL6PRED) and outperformed its constituent baseline models on both the training and independent test datasets, which supports its discrimination and generalization abilities. To facilitate easy access, StackIL6 is provided as a freely available web server at http://camt.pythonanywhere.com/StackIL6. It is anticipated that StackIL6 can facilitate rapid screening of promising IL-6 inducing peptides for the development of diagnostic and immunotherapeutic applications.
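
The stacking strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn is available, uses synthetic features in place of the twelve peptide descriptors, and picks a logistic regression meta-learner, hyperparameters, and the helper names `build_stacking_model` and `run_demo` purely for illustration. Only the five base algorithm families are taken from the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_stacking_model(seed=0):
    """Five base learners (the algorithm families named in the abstract)
    combined by a meta-learner. All hyperparameters are assumptions."""
    base_learners = [
        ("et", ExtraTreesClassifier(n_estimators=100, random_state=seed)),
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("mlp", make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=seed),
        )),
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=seed))),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=seed)),
    ]
    # The meta-learner is trained on out-of-fold class probabilities
    # produced by the base learners (5-fold cross-validation).
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
        cv=5,
        stack_method="predict_proba",
    )


def run_demo(seed=0):
    """Fit the stack on synthetic data standing in for peptide features
    and return the fitted model with its held-out accuracy."""
    X, y = make_classification(
        n_samples=300, n_features=20, n_informative=8,
        class_sep=2.0, random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    model = build_stacking_model(seed).fit(X_train, y_train)
    return model, model.score(X_test, y_test)


if __name__ == "__main__":
    fitted, accuracy = run_demo()
    print(f"base learners: {len(fitted.estimators_)}, test accuracy: {accuracy:.3f}")
```

Training the meta-learner on out-of-fold predictions rather than on in-sample predictions is what keeps the second layer from simply memorizing base learners that overfit the training set.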

Recommended

Article Biochemical Research Methods

SiameseCPP: a sequence-based Siamese network to predict cell-penetrating peptides by contrastive learning

Xin Zhang, Lesong Wei, Xiucai Ye, Kai Zhang, Saisai Teng, Zhongshen Li, Junru Jin, Minjae Kim, Tetsuya Sakurai, Lizhen Cui, Balachandran Manavalan, Leyi Wei

Summary: A novel deep learning framework SiameseCPP is proposed for automated prediction of cell-penetrating peptides (CPPs). SiameseCPP learns discriminative representations of CPPs based on a well-pretrained model and a Siamese neural network comprising a transformer and gated recurrent units. Comprehensive experiments demonstrate that SiameseCPP outperforms existing baseline models for CPP prediction and exhibits satisfactory generalization ability on other functional peptide datasets.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biology

PSRTTCA: A new approach for improving the prediction and characterization of tumor T cell antigens using propensity score representation learning

Phasit Charoenkwan, Chonlatip Pipattanaboon, Chanin Nantasenamat, Md Mehedi Hasan, Mohammad Ali Moni, Pietro Lio, Watshara Shoombuatong

Summary: Despite existing cancer therapies, the development of new and effective treatments is necessary to address the ongoing cancer recurrence and new cases. This study proposes a new machine learning-based approach, PSRTTCA, for improving the identification and characterization of tumor T cell antigens (TTCAs) based on their primary sequences.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Article Biochemical Research Methods

VirPipe: an easy-to-use and customizable pipeline for detecting viral genomes from Nanopore sequencing

Kijin Kim, Kyungmin Park, Seonghyeon Lee, Seung-Hwan Baek, Tae-Hun Lim, Jongwoo Kim, Balachandran Manavalan, Jin-Won Song, Won-Keun Kim

Summary: VirPipe is a new pipeline for detecting viral genomes from Nanopore or Illumina sequencing, with streamlined installation and customization.

BIOINFORMATICS (2023)

Article Biology

Computational prediction of protein folding rate using structural parameters and network centrality measures

Saraswathy Nithiyanandam, Vinoth Kumar Sangaraju, Balachandran Manavalan, Gwang Lee

Summary: Protein folding is a complex process where a polymer of amino acids transitions from an unfolded state to a unique three-dimensional structure. Previous studies have identified structural parameters and examined their relationship with protein folding rate, but these parameters are only applicable to a limited set of proteins. Machine learning models have been proposed, but they fail to explain plausible folding mechanisms. In this study, ten different machine learning algorithms were evaluated using various structural parameters and network centrality measures, with support vector machine showing the best predictive capability.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Article Biology

PSRQSP: An effective approach for the interpretable prediction of quorum sensing peptide using propensity score representation learning

Phasit Charoenkwan, Pramote Chumnanpuen, Nalini Schaduangrat, Changmin Oh, Balachandran Manavalan, Watshara Shoombuatong

Summary: In this study, a novel computational approach called PSRQSP was developed to improve the prediction and analysis of quorum sensing peptides (QSPs). Experimental results showed that PSRQSP outperformed conventional methods in identifying QSPs. An easy-to-use web server was also constructed to accelerate the discovery of potential QSPs for drug development.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Article Biochemistry & Molecular Biology

Pretoria: An effective computational approach for accurate and high-throughput identification of CD8+ T-cell epitopes of eukaryotic pathogens

Phasit Charoenkwan, Nalini Schaduangrat, Nhat Truong Pham, Balachandran Manavalan, Watshara Shoombuatong

Summary: This study proposes Pretoria, the first stack-based approach for accurate and large-scale identification of CD8+ T-cell epitopes (TCEs) of eukaryotic pathogens. A pool of 144 machine learning (ML)-based classifiers was constructed from 12 popular ML algorithms, and a feature selection method was used to determine the important classifiers for building the stacked model. Experimental results demonstrated that Pretoria outperformed several conventional ML classifiers and the existing method, with an accuracy of 0.866, MCC of 0.732, and AUC of 0.921 on the independent test.

INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES (2023)

Article Biochemistry & Molecular Biology

PRR-HyPred: A two-layer hybrid framework to predict pattern recognition receptors and their families by employing sequence encoded optimal features

Ahmad Firoz, Adeel Malik, Hani Mohammed Ali, Yusuf Akhter, Balachandran Manavalan, Chang-Bae Kim

Summary: In this study, a new two-layer hybrid framework called PRR-HyPred was constructed to simultaneously predict and classify pattern recognition receptors (PRRs). Using support vector machine and random forest-based classifiers, PRR-HyPred achieved accuracies of 83.4% and 95% in the first and second layers, respectively. It is the first method that can both predict PRRs and classify them into specific families, and it can be a valuable tool for large-scale PRR prediction and classification, facilitating future studies.

INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES (2023)

Article Biochemistry & Molecular Biology

Exploring the Chemical Space of CYP17A1 Inhibitors Using Cheminformatics and Machine Learning

Tianshi Yu, Tianyang Huang, Leiye Yu, Chanin Nantasenamat, Nuttapat Anuwongcharoen, Theeraphon Piacham, Ruobing Ren, Ying-Chih Chiang

Summary: Researchers studied Cytochrome P450 17A1 (CYP17A1), a key enzyme in steroidogenesis, and its potential as a druggable target for anti-cancer molecule development. They used cheminformatic analyses and quantitative structure-activity relationship (QSAR) modeling on a dataset of CYP17A1 inhibitors. Different models were built for steroidal and nonsteroidal inhibitors, achieving good accuracy. The findings provide valuable insights for further drug discovery efforts targeting CYP17A1 inhibitors.

MOLECULES (2023)

Article Computer Science, Artificial Intelligence

MonkeyNet: A robust deep convolutional neural network for monkeypox disease detection and classification

Diponkor Bala, Md. Shamim Hossain, Mohammad Alamgir Hossain, Md. Ibrahim Abdullah, Md. Mizanur Rahman, Balachandran Manavalan, Naijie Gu, Mohammad S. Islam, Zhangjin Huang

Summary: The monkeypox virus poses a new pandemic threat. However, there is currently no reliable monkeypox database available for training and testing deep learning models. The MSID dataset has been developed for this purpose, providing a collection of monkeypox patient images for building confident deep learning models. The proposed MonkeyNet model can accurately identify monkeypox disease and assist doctors in making early diagnoses.

NEURAL NETWORKS (2023)

Article Biology

Identification of SH2 domain-containing proteins and motifs prediction by a deep learning method

Duanzhi Wu, Xin Fang, Kai Luan, Qijin Xu, Shiqi Lin, Shiying Sun, Jiaying Yang, Bingying Dong, Balachandran Manavalan, Zhijun Liao

Summary: In this study, SH2 domain-containing proteins and non-SH2 domain-containing proteins were successfully identified using deep learning technology. The best performing 288-dimensional features were obtained. Additionally, a new motif, YKIR, in the SH2 domain was discovered and its function in signal transduction was analyzed.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Article Computer Science, Artificial Intelligence

Hybrid data augmentation and deep attention-based dilated convolutional-recurrent neural networks for speech emotion recognition

Nhat Truong Pham, Duc Ngoc Minh Dang, Ngoc Duy Nguyen, Thanh Thi Nguyen, Hai Nguyen, Balachandran Manavalan, Chee Peng Lim, Sy Dzung Nguyen

Summary: This paper proposes a deep learning framework for speech emotion recognition, which combines a hybrid data augmentation method and deep attention-based dilated convolutional-recurrent neural networks. The framework is able to extract high-level representations from three-dimensional log Mel spectrogram features. Experimental results show that the proposed framework outperforms other state-of-the-art methods on the EmoDB and ERC datasets.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Article Biology

Unveiling local and global conformational changes and allosteric communications in SOD1 systems using molecular dynamics simulation and network analyses

Shaherin Basith, Balachandran Manavalan, Gwang Lee

Summary: This study combined microsecond-scale unbiased molecular dynamics simulation with network analysis to elucidate the local and global conformational changes and allosteric communications in SOD1 systems. Structural analyses revealed significant variations in catalytic sites and stability due to unmetallated SOD1 systems and cysteine mutations. Dynamic motion analysis showed balanced atomic displacement and highly correlated motions in the Holo system.

COMPUTERS IN BIOLOGY AND MEDICINE (2024)

Article Multidisciplinary Sciences

TROLLOPE: A novel sequence-based stacked approach for the accelerated discovery of linear T-cell epitopes of hepatitis C virus

Phasit Charoenkwan, Sajee Waramit, Pramote Chumnanpuen, Nalini Schaduangrat, Watshara Shoombuatong

Summary: HCV infection causes chronic liver diseases, and there is no effective vaccine available. This study proposes a novel approach called TROLLOPE to accurately identify TCE-HCVs from sequence information, with superior predictive performance.

PLOS ONE (2023)
