Article

Structure-Aware Multimodal Deep Learning for Drug-Protein Interaction Prediction

Journal

JOURNAL OF CHEMICAL INFORMATION AND MODELING
Volume 62, Issue 5, Pages 1308-1317

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jcim.2c00060

Funding

  1. National Key R&D Program of China [2020YFB0204803]
  2. National Natural Science Foundation of China [61772566]
  3. Guangdong Key Field RD Plan [2019B020228001, 2018B010109006]
  4. Guangzhou ST Research Plan [202007030010]

In this study, we propose a structure-aware multimodal deep DPI prediction model (STAMP-DPI), which predicts drug-protein interactions after training on a carefully curated industry-scale benchmark data set. The model combines feature representations of molecules and proteins, using graph neural networks and pretrained embeddings to capture the interaction features between them. Experimental results show that STAMP-DPI outperforms existing methods on multiple data sets and is interpretable.
Identifying drug-protein interactions (DPIs) is crucial in drug discovery, and a number of machine learning methods have been developed to predict them. Existing methods usually use unrealistic data sets with hidden bias, which limits the accuracy of virtual screening. Meanwhile, most DPI prediction methods focus on molecular representation and give little attention to protein representation or to high-level associations between different instances. To this end, we present a novel structure-aware multimodal deep DPI prediction model, STAMP-DPI, trained on a curated industry-scale benchmark data set. We built a high-quality benchmark data set named GalaxyDB for DPI prediction; this industry-scale data set, together with an unbiased training procedure, resulted in a more robust benchmark study. For informative protein representation, we constructed a structure-aware graph neural network from the protein sequence by combining predicted contact maps with graph neural networks. By further integrating structure-based representations with high-level pretrained embeddings for molecules and proteins, our model effectively captures the features of the interactions between them. As a result, STAMP-DPI outperformed state-of-the-art DPI prediction methods, decreasing mean square error (MSE) by 7.00% on the Davis data set and improving area under the curve (AUC) by 8.89% on the GalaxyDB data set. Moreover, the transformer-based interaction mechanism makes our model interpretable and able to accurately reveal the binding sites between molecules and proteins.
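
The idea of building a protein graph from a predicted contact map and passing residue features through a graph neural network can be illustrated with a minimal NumPy sketch. This is not the STAMP-DPI implementation: the function names, the 0.5 contact threshold, the single GCN-style layer, and the mean pooling are all assumptions chosen for illustration.

```python
import numpy as np


def contact_map_to_adjacency(contact_prob, threshold=0.5):
    """Binarize a predicted contact-probability map into a symmetric
    adjacency matrix with self-loops (threshold is an assumed value)."""
    adj = (contact_prob >= threshold).astype(float)
    adj = np.maximum(adj, adj.T)   # treat contacts as undirected edges
    np.fill_diagonal(adj, 1.0)     # self-loops keep each residue's own features
    return adj


def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(D^-1/2 A D^-1/2 X W)."""
    deg = adj.sum(axis=1)          # >= 1 for every node because of self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weight, 0.0)


def protein_embedding(contact_prob, residue_features, weight, threshold=0.5):
    """Hypothetical helper: contact map -> residue graph -> one GCN layer ->
    mean-pooled protein-level vector."""
    adj = contact_map_to_adjacency(contact_prob, threshold)
    hidden = gcn_layer(adj, residue_features, weight)
    return hidden.mean(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_residues, n_in, n_out = 6, 8, 4
    contact_prob = rng.random((n_residues, n_residues))  # stand-in for a predicted map
    residue_features = rng.random((n_residues, n_in))    # stand-in for per-residue embeddings
    weight = rng.standard_normal((n_in, n_out))          # untrained layer weights
    print(protein_embedding(contact_prob, residue_features, weight).shape)
```

In a full model the per-residue features would come from a pretrained protein embedding, and the pooled vector would be combined with a molecule representation before the interaction module; those parts are omitted here.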
