4.7 Article

Identification of active molecules against Mycobacterium tuberculosis through machine learning

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 22, Issue 5, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbab068

Keywords

Mycobacterium tuberculosis; machine learning; extreme gradient boosting; deep learning; model fusion

Funding

  1. Key R&D Program of Zhejiang Province [2020C03010]
  2. National Natural Science Foundation of China [21575128, 81773632]
  3. Natural Science Foundation of Zhejiang Province [LZ19H300001]

This study developed classification models using machine learning algorithms to distinguish Mtb inhibitors from noninhibitors, with the XGBoost model showing the best prediction performance among the individual models. Two consensus strategies were then employed to integrate predictions from multiple models, and the stacked consensus model gave the best predictions overall. The association between important descriptors and bioactivities of molecules was interpreted, and an online tool called ChemTB was developed for detecting potential Mtb inhibitors.
Tuberculosis (TB) is an infectious disease caused by Mycobacterium tuberculosis (Mtb), and it has been one of the top 10 causes of death globally. Extensively drug-resistant tuberculosis (XDR-TB), which is extensively resistant to the commonly used first-line drugs, has emerged as a major challenge to TB treatment. Hence, novel drug candidates for TB treatment need to be discovered. In this study, based on different types of molecular representations, four machine learning (ML) algorithms, namely support vector machine, random forest (RF), extreme gradient boosting (XGBoost) and deep neural networks (DNN), were used to develop classification models to distinguish Mtb inhibitors from noninhibitors. The results demonstrate that the XGBoost model exhibits the best prediction performance. Then, two consensus strategies were employed to integrate the predictions from multiple models. The evaluation results illustrate that the consensus model built by stacking the RF, XGBoost and DNN predictions offers the best predictions, with an area under the receiver operating characteristic curve of 0.842 and 0.942 for the 10-fold cross-validated training set and the external test set, respectively. In addition, the association between the important descriptors and the bioactivities of molecules was interpreted using the Shapley additive explanations method. Finally, an online webserver called ChemTB (http://cadd.zju.edu.cn/chemtb/) was developed, and it offers a freely available computational tool to detect potential Mtb inhibitors.
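
To make the idea of a consensus prediction and its evaluation metric concrete, the following is a minimal sketch in plain Python. It is not the authors' pipeline: the abstract names stacking as the best consensus strategy, whereas this sketch uses simple probability averaging as an assumed, simpler stand-in. The function names `roc_auc` and `average_consensus` and all scores are hypothetical toy values, not data from the paper.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outranks a
    random negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_consensus(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-molecule probabilities from several models.

    Assumption: plain averaging, not the stacking used in the paper.
    """
    n_models = len(model_scores)
    return [sum(column) / n_models for column in zip(*model_scores)]


# Hypothetical toy data: 1 = inhibitor, 0 = noninhibitor.
labels = [1, 1, 1, 0, 0, 0]
rf_scores = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]
xgb_scores = [0.8, 0.6, 0.3, 0.4, 0.3, 0.2]
dnn_scores = [0.7, 0.7, 0.6, 0.6, 0.1, 0.3]

consensus = average_consensus([rf_scores, xgb_scores, dnn_scores])

for name, scores in [
    ("RF", rf_scores),
    ("XGBoost", xgb_scores),
    ("DNN", dnn_scores),
    ("consensus", consensus),
]:
    print(f"{name}: AUC = {roc_auc(labels, scores):.3f}")
```

In this toy case each model misranks a different molecule, so averaging cancels the individual errors. Whether a consensus helps on real data depends on how correlated the models' errors are, which is why the study evaluates it with cross-validation and an external test set.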

Recommended

Article Biochemistry & Molecular Biology

PROTAC-DB 2.0: an updated database of PROTACs

Gaoqi Weng, Xuanyan Cai, Dongsheng Cao, Hongyan Du, Chao Shen, Yafeng Deng, Qiaojun He, Bo Yang, Dan Li, Tingjun Hou

Summary: PROTAC-DB 2.0 is an updated online database that contains structural and experimental data about PROTACs. This second version expands the number of PROTACs to 3270 and provides additional information to aid in the understanding and design of PROTACs.

NUCLEIC ACIDS RESEARCH (2023)

Review Chemistry, Multidisciplinary

Recent advances in computational studies on voltage-gated sodium channels: Drug design and mechanism studies

Gaoang Wang, Lei Xu, Haiyi Chen, Yifei Liu, Peichen Pan, Tingjun Hou

Summary: This article summarizes recent and representative studies on voltage-gated sodium channels (VGSCs/Na(v)s) from the perspective of computer-aided drug design (CADD) and molecular modeling. It covers the structural biology of VGSCs, virtual screening and drug design based on CADD, and functional studies using molecular modeling technologies. The article closes by summarizing the achievements in the field of VGSCs and discussing the shortcomings found in previous studies.

WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE (2023)

Article Biochemical Research Methods

Can molecular dynamics simulations improve predictions of protein-ligand binding affinity with machine learning?

Shukai Gu, Chao Shen, Jiahui Yu, Hong Zhao, Huanxiang Liu, Liwei Liu, Rong Sheng, Lei Xu, Zhe Wang, Tingjun Hou, Yu Kang

Summary: This study evaluated the impact of structural dynamic information on binding affinity prediction and found that the optimized molecular dynamics protocol improved the predictive performance for the TAF1-BD2 target with high structural flexibility, but not for the less flexible JAK1 and DDR1 targets.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Chemistry, Multidisciplinary

Novel thieno[2,3-b]quinoline-procaine hybrid molecules: A new class of allosteric SHP-1 activators evolved from PTP1B inhibitors

Lei Xu, Xuyang Xuyan, Minmin Minmi, Zhiji Wang, Chao Shee, Qianwen Mu, Bo Feng, Yechun Xu, Tingjun Hou, Lixin Gao, Haini Jiang, Jia Li, Yubo Zhou, Wenlong Wang

Summary: In this study, a new class of thieno[2,3-b]quinoline-procaine hybrid molecules was reported as allosteric activators of SHP-1. The representative hybrid compound 3b displayed an SHP-1 activating effect with an EC50 of 5.48 ± 0.28 μmol/L. Further investigations confirmed that 3b allosterically interacted with SHP-1, switched it from the closed to the open conformation, blocked the SHP-1/p-STAT3 pathway, induced apoptosis and inhibited ABC-DLBCL cell proliferation.

CHINESE CHEMICAL LETTERS (2023)

Article Chemistry, Medicinal

Structural Analysis and Prediction of Hematotoxicity Using Deep Learning Approaches

Teng-Zhi Long, Shao-Hua Shi, Shao Liu, Ai-Ping Lu, Zhao-Qian Liu, Min Li, Ting-Jun Hou, Dong-Sheng Cao

Summary: This study constructed a high-quality dataset and established a series of classification models using machine learning algorithms to predict hematotoxicity. The best model based on Attentive FP showed excellent performance on both the validation and test sets. Additionally, the study utilized SHAP and atom heatmap methods to identify important features and structural fragments related to hematotoxicity, and employed MMPA and representative substructure derivation technique to further investigate the transformation principles and distinctive structural features of hematotoxic chemicals.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2023)

Article Chemistry, Physical

Sigmoid Accelerated Molecular Dynamics: An Efficient Enhanced Sampling Method for Biosystems

Yihao Zhao, Jintu Zhang, Haotian Zhang, Shukai Gu, Yafeng Deng, Yaoquan Tu, Tingjun Hou, Yu Kang

Summary: Inspired by GaMD, this work proposes a new accelerated molecular dynamics method called Sigmoid accelerated molecular dynamics (SaMD), which improves the balance between the highest acceleration and accurate reweighting by adding a Sigmoid boost potential. Compared with GaMD, SaMD extends the accessible time scale and improves computational efficiency, and it achieves better results in alanine dipeptide, chignolin folding, and protein-ligand binding tasks.

JOURNAL OF PHYSICAL CHEMISTRY LETTERS (2023)

Article Pharmacology & Pharmacy

MF-SuP-pKa: Multi-fidelity modeling with subgraph pooling mechanism for pKa prediction

Jialu Wu, Yue Wan, Zhenxing Wu, Shengyu Zhang, Dongsheng Cao, Chang-Yu Hsieh, Tingjun Hou

Summary: MF-SuP-pKa is a novel pKa prediction model that utilizes subgraph pooling, multi-fidelity learning, and data augmentation. The model captures the local and global environments around ionization sites for micro-pKa prediction using a knowledge-aware subgraph pooling strategy. By fitting low-fidelity data to high-fidelity data through transfer learning, MF-SuP-pKa achieves superior performance compared to state-of-the-art models with less high-fidelity training data.

ACTA PHARMACEUTICA SINICA B (2023)

Article Chemistry, Multidisciplinary

Uncovering the Kinetic Characteristics and Degradation Preference of PROTAC Systems with Advanced Theoretical Analyses

Rongfan Tang, Zhe Wang, Sutong Xiang, Lingling Wang, Yang Yu, Qinghua Wang, Qirui Deng, Tingjun Hou, Huiyong Sun

Summary: Proteolysis-targeting chimeras (PROTACs) selectively degrade target proteins and are an attractive technology in drug discovery. In this study, the kinetic mechanism of PROTAC MZ1 targeting the bromodomain (BD) of BET protein and von Hippel-Lindau E3 ligase (VHL) was characterized and analyzed using simulations and free energy calculations. The results showed that MZ1 prefers to bind with E3 ligase in the formation of the target-PROTAC-E3 ligase ternary complex. The binding characteristics revealed in this study may accelerate the rational design of PROTACs with higher degradation efficiency.

JACS AU (2023)

Article Biochemical Research Methods

ML-PLIC: a web platform for characterizing protein-ligand interactions and developing machine learning-based scoring functions

Xujun Zhang, Chao Shen, Tianyue Wang, Yafeng Deng, Yu Kang, Dan Li, Tingjun Hou, Peichen Pan

Summary: Cracking the code of protein-ligand interaction is crucial for drug design and discovery. The ML-based PLI capturer (ML-PLIC) is a web platform that automatically characterizes PLI and generates machine learning-based scoring functions to identify potential binders. It outperforms traditional docking tools and performs competitively with deep learning-based methods. ML-PLIC integrates physical and biological knowledge to design a structure-based virtual screening pipeline.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biochemical Research Methods

TransFoxMol: predicting molecular property with focused attention

Jian Gao, Zheyuan Shen, Yufeng Xie, Jialiang Lu, Yang Lu, Sikang Chen, Qingyu Bian, Yue Guo, Liteng Shen, Jian Wu, Binbin Zhou, Tingjun Hou, Qiaojun He, Jinxin Che, Xiaowu Dong

Summary: This study introduces a transformer-based framework, TransFoxMol, to improve artificial intelligence's understanding of molecular structure-property relationships. Experimental results show that TransFoxMol achieves state-of-the-art performance and outperforms baseline models on small-scale datasets.

BRIEFINGS IN BIOINFORMATICS (2023)

Review Pharmacology & Pharmacy

Artificial intelligence methods in kinase target profiling: Advances and challenges

Shukai Gu, Huanxiang Liu, Liwei Liu, Tingjun Hou, Yu Kang

Summary: Kinases play a crucial role in cellular processes and accurate kinase-profiling prediction is vital for drug discovery. This review provides an overview of the latest advancements in machine learning and deep learning models for kinase profiling, discussing the challenges and future directions in this field.

DRUG DISCOVERY TODAY (2023)

Article Chemistry, Medicinal

Small-Molecule Conformer Generators: Evaluation of Traditional Methods and AI Models on High-Quality Data Sets

Zhe Wang, Haiyang Zhong, Jintu Zhang, Peichen Pan, Dong Wang, Huanxiang Liu, Xiaojun Yao, Tingjun Hou, Yu Kang

Summary: This study systematically evaluates the performance of traditional methods and AI models in small-molecule conformer generation. The results show that traditional methods outperform AI models in reproducing bioactive conformations, while an AI model has an advantage in generating low-energy conformations.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2023)

Article Chemistry, Multidisciplinary

TB-IECS: an accurate machine learning-based scoring function for virtual screening

Xujun Zhang, Chao Shen, Dejun Jiang, Jintu Zhang, Qing Ye, Lei Xu, Tingjun Hou, Peichen Pan, Yu Kang

Summary: Machine learning-based scoring functions (MLSFs) have the potential to improve virtual screening capabilities compared to classical scoring functions (SFs). However, the high computational cost and limited descriptors used in MLSFs and protein-ligand interaction characterization may impact accuracy and efficiency. In this study, a new SF called TB-IECS was proposed, combining energy terms from Smina and NNScore version 2 using the XGBoost algorithm. TB-IECS outperformed classical SFs and balanced efficiency and accuracy for practical virtual screening.

JOURNAL OF CHEMINFORMATICS (2023)

Article Chemistry, Multidisciplinary

Recent advances in deep learning for retrosynthesis

Zipeng Zhong, Jie Song, Zunlei Feng, Tiantao Liu, Lingxiang Jia, Shaolun Yao, Tingjun Hou, Mingli Song

Summary: Retrosynthesis is the cornerstone of organic chemistry, and recent advances in deep learning and artificial intelligence have revolutionized the field. This comprehensive review provides a taxonomy and evaluation of existing methods, as well as an introduction to popular databases and platforms for retrosynthesis.

WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE (2023)

Article Computer Science, Artificial Intelligence

ResGen is a pocket-aware 3D molecular generation model based on parallel multiscale modelling

Odin Zhang, Jintu Zhang, Jieyu Jin, Xujun Zhang, Renling Hu, Chao Shen, Hanqun Cao, Hongyan Du, Yu Kang, Yafeng Deng, Furui Liu, Guangyong Chen, Chang-Yu Hsieh, Tingjun Hou

Summary: This article introduces a three-dimensional molecular generative model called ResGen, which is conditioned on protein pockets and can design organic molecules inside a given target. The ResGen model has a higher success rate in generating novel molecules that bind more tightly to unseen targets than existing approaches. It also successfully generates drug-like molecules with lower binding energy and higher diversity than state-of-the-art methods in real-world scenarios.

NATURE MACHINE INTELLIGENCE (2023)
