Article

An all-atom knowledge-based energy function for protein-DNA threading, docking decoy discrimination, and prediction of transcription-factor binding profiles

Journal

Publisher

WILEY
DOI: 10.1002/prot.22384

Keywords

transcription factor binding sites; statistical potential; protein-DNA docking

Funding

  1. NIH [GM066049, GM085003]
  2. China Outstanding Youth Fund [20525416]
  3. China Scholarship Council

Accurately representing protein-DNA interactions with an energy function is a long-standing unsolved problem in structural biology. Here, we modified a statistical potential based on the distance-scaled, finite ideal-gas reference state so that it is optimized for protein-DNA interactions. The changes include a volume-fraction correction to account for unmixable atom types in proteins and DNA, in addition to a low-count correction, residue/base-specific atom types, and a shorter cutoff distance for protein-DNA interactions. The new statistical energy functions are tested in threading and docking decoy discrimination and in the prediction of protein-DNA binding affinities and transcription-factor binding profiles. The results indicate that the newly proposed energy functions are among the best existing energy functions for protein-DNA interactions. The new energy functions are available as a web server called DDNA 2.0 at http://sparks.informatics.iupui.edu. The server version was trained on the entire set of 212 protein-DNA complexes.
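
For orientation, the general form of a pair potential built on a distance-scaled, finite ideal-gas reference state can be sketched as below. This is a minimal illustration of the generic form only, not the DDNA 2.0 implementation: the function name, the parameter names, the default exponent `alpha=1.61`, and the choice to return zero for empty bins are all assumptions made for this example, and the volume-fraction and low-count corrections described in the abstract are not reproduced.

```python
import math


def dfire_pair_energy(n_obs_r, n_obs_cut, r, r_cut, dr, dr_cut,
                      alpha=1.61, eta_rt=1.0):
    """Illustrative pair energy for one atom-type pair at distance r.

    n_obs_r   : observed pair count in the distance bin centred at r
    n_obs_cut : observed pair count in the bin at the cutoff distance r_cut
    dr, dr_cut: widths of the bin at r and of the bin at r_cut
    alpha     : distance-scaling exponent (assumed default)
    eta_rt    : overall energy scale (hypothetical unit choice)
    """
    # Beyond the cutoff the potential is defined to be zero.
    if r >= r_cut:
        return 0.0
    # Assumption for this sketch: empty bins contribute nothing. A real
    # potential would apply a low-count correction here instead.
    if n_obs_r <= 0 or n_obs_cut <= 0:
        return 0.0
    # Expected count under the reference state, scaled from the cutoff bin.
    expected = (r / r_cut) ** alpha * (dr / dr_cut) * n_obs_cut
    return -eta_rt * math.log(n_obs_r / expected)
```

With this form, a pair observed more often than the reference state predicts receives a negative (favorable) energy, and a pair observed less often receives a positive one.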

Recommended

Article Virology

Mendelian randomization suggests a potential causal effect of eosinophil count on influenza vaccination responsiveness

Hongwei Chen, Haoyang Zhang, Simin Wen, Xuehao Xiu, Danming You, Huiying Zhao, Dayan Wang, Yuedong Yang, Yuelong Shu

Summary: Currently, there is a lack of systematic exploration on the clinical factors influencing immune responses to influenza vaccines. The mechanism of low responsiveness to influenza vaccination (LRIV) is complex and not well understood. In this study, we combined our in-house genome-wide association studies (GWAS) analysis of LRIV with the GWAS summary of 10 blood-based biomarkers to investigate the genetics shared between LRIV and blood-based biomarkers using Mendelian randomization (MR). The results suggest a potential causal relationship between genetically instrumented LRIV and decreased eosinophil count.

JOURNAL OF MEDICAL VIROLOGY (2023)

Article Biochemical Research Methods

Identifying spatial domain by adapting transcriptomics with histology through contrastive learning

Yuansong Zeng, Rui Yin, Mai Luo, Jianing Chen, Zixiang Pan, Yutong Lu, Weijiang Yu, Yuedong Yang

Summary: Recent advances in spatial transcriptomics have allowed for gene expression measurement at cell/spot resolution, while retaining spatial information and histology images of the tissues. Accurately identifying the spatial domains of spots is crucial for downstream tasks in spatial transcriptomics analysis. In this study, a novel method called ConGI is proposed, which utilizes contrastive learning to accurately exploit spatial domains by combining gene expression with histopathological images. The method outperforms existing methods and the learned representations are useful for various downstream tasks.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biochemical Research Methods

Fast and accurate protein intrinsic disorder prediction by using a pretrained language model

Yidong Song, Qianmu Yuan, Sheng Chen, Ken Chen, Yaoqi Zhou, Yuedong Yang

Summary: Determining intrinsically disordered regions of proteins is crucial for understanding protein biological functions and associated diseases. This study proposes a fast and accurate protein disorder predictor, LMDisorder, which utilizes embedding generated by unsupervised pretrained language models as features. LMDisorder outperforms other single-sequence-based methods and compares favorably to another language-model-based technique in independent test sets. Additionally, LMDisorder shows equivalent or better performance than the state-of-the-art profile-based technique SPOT-Disorder2. The high computation efficiency of LMDisorder allows for proteome-scale analysis, revealing associations between proteins with high predicted disorder content and specific biological functions. The datasets, source codes, and trained model are available at https://github.com/biomed-AI/LMDisorder.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biochemical Research Methods

Fast and accurate protein function prediction from sequence through pretrained language model and homology-based label diffusion

Qianmu Yuan, Junjie Xie, Jiancong Xie, Huiying Zhao, Yuedong Yang

Summary: Protein function prediction is crucial in bioinformatics and has implications for disease mechanism elucidation and drug target discovery. However, accurately predicting protein functions solely from sequences remains challenging. This study introduces SPROF-GO, a sequence-based alignment-free predictor that utilizes a pretrained language model to extract informative sequence embeddings and implements self-attention pooling to focus on important residues. SPROF-GO outperforms state-of-the-art approaches in precision-recall curves and demonstrates generalization capabilities.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biochemical Research Methods

A Drug Combination Prediction Framework Based on Graph Convolutional Network and Heterogeneous Information

Hegang Chen, Yuyin Lu, Yuedong Yang, Yanghui Rao

Summary: Combination therapy plays an important role in treating complex diseases, but the large number of possible combinations limits our ability to identify effective ones. This study introduces a new computational pipeline, DCMGCN, which integrates diverse drug-related information to predict novel drug combinations. The tests show that DCMGCN outperforms existing methods and may help to clarify the understanding of drug mechanisms.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2023)

Article Biochemistry & Molecular Biology

Spatiotemporal regulation of cholangiocarcinoma growth and dissemination by peritumoral myofibroblasts in a Vcam1-dependent manner

Cheng Tian, Liyuan Li, Qingfei Pan, Beisi Xu, Yizhen Li, Li Fan, Anthony Brown, Michelle Morrison, Kaushik Dey, Jun J. Yang, Jiyang Yu, Evan S. Glazer, Liqin Zhu

Summary: Intrahepatic cholangiocarcinoma (iCCA) is characterized by highly desmoplastic stroma. Contact between tumor cells and peritumoral myofibroblasts (pMFs) initially suppresses tumor cell growth but promotes invasion and dissemination in the long term. Vascular cell adhesion molecule-1 (Vcam1) plays a significant role in this process by regulating epithelial-to-mesenchymal transition. Overall, this study reveals the spatiotemporal regulation of iCCA growth and dissemination by pMFs in a Vcam1-dependent manner.

ONCOGENE (2023)

Article Multidisciplinary Sciences

Proteasome inhibition targets the KMT2A transcriptional complex in acute lymphoblastic leukemia

Jennifer L. Kamens, Stephanie Nance, Cary Koss, Beisi Xu, Anitria Cotton, Jeannie W. Lam, Elizabeth A. R. Garfinkle, Pratima Nallagatla, Amelia M. R. Smith, Sharnise Mitchell, Jing Ma, Duane Currier, William C. Wright, Kanisha Kavdia, Vishwajeeth R. Pagala, Wonil Kim, LaShanale M. Wallace, Ji-Hoon Cho, Yiping Fan, Aman Seth, Nathaniel Twarog, John K. Choi, Esther A. Obeng, Mark E. Hatley, Monika L. Metzger, Hiroto Inaba, Sima Jeha, Jeffrey E. Rubnitz, Junmin Peng, Taosheng Chen, Anang A. Shelat, R. Kiplin Guy, Tanja A. Gruber

Summary: Proteasome inhibition is found to be effective in KMT2Ar infant acute lymphoblastic leukemia, leading to the depletion of histone modifications and downregulation of KMT2A gene expression signature. A cohort of relapsed/refractory KMT2Ar patients treated with this approach showed a high overall response rate. This innovative treatment approach is now being evaluated in a multi-institutional upfront trial for infants with newly diagnosed ALL.

NATURE COMMUNICATIONS (2023)

Article Engineering, Biomedical

ShockSurv: A machine learning model to accurately predict 28-day mortality for septic shock patients in the intensive care unit

Fudan Zheng, Luhao Wang, Yuxian Pang, Zhiguang Chen, Yutong Lu, Yuedong Yang, Jianfeng Wu

Summary: Septic shock has become the leading cause of morbidity and mortality in the ICU. However, currently there is no model to predict the mortality of septic shock patients. We aim to develop such a model.

BIOMEDICAL SIGNAL PROCESSING AND CONTROL (2023)

Article Neurosciences

Inferring the genetic relationship between brain imaging-derived phenotypes and risk of complex diseases by Mendelian randomization and genome-wide colocalization

Siying Lin, Haoyang Zhang, Mengling Qi, David N. Cooper, Yuedong Yang, Yuanhao Yang, Huiying Zhao

Summary: Observational studies consistently show that brain imaging-derived phenotypes (IDPs) are critical markers for the early diagnosis of brain disorders and cardiovascular diseases. However, the shared genetic landscape between brain IDPs and the risk of these diseases remains unclear, limiting the application of potential diagnostic techniques using brain IDPs.

NEUROIMAGE (2023)

Article Biochemistry & Molecular Biology

Characterizing RNA-binding ligands on structures, chemical information, binding affinity and drug-likeness

Cong Fan, Xin Wang, Tianze Ling, Yuedong Yang, Huiying Zhao

Summary: Recent studies suggest that RNAs have potential as drug targets, but progress in detecting RNA-ligand interactions is limited. To guide the discovery of RNA-binding ligands, it is necessary to comprehensively characterize them in terms of binding specificity, binding affinity, and drug-like properties. We established the RNALID database, which contains 358 validated RNA-ligand interactions. Comparisons with other databases show that the majority of ligands in RNALID are novel, and the analysis of ligand structure, binding affinity, and cheminformatic parameters reveals insights into the characteristics of different ligand types. Additionally, comparing RNALID ligands to FDA-approved drugs and ligands without bioactivity sheds light on their differences in chemical properties and drug-likeness.

RNA BIOLOGY (2023)

Article Multidisciplinary Sciences

SNIP1 and PRC2 coordinate cell fates of neural progenitors during brain development

Yurika Matsui, Mohamed Nadhir Djekidel, Katherine Lindsay, Parimal Samir, Nina Connolly, Gang Wu, Xiaoyang Yang, Yiping Fan, Beisi Xu, Jamy C. Peng

Summary: The study shows that SNIP1 is critical for the survival and differentiation of stem cells in the developing brain. It regulates PRC2 activities downstream of TGFb and NFkB, influencing cell fates. Understanding the role of SNIP1 in brain development can provide insights into cell survival and death during development.

NATURE COMMUNICATIONS (2023)

Article Computer Science, Artificial Intelligence

Subgraph-Aware Few-Shot Inductive Link Prediction Via Meta-Learning

Shuangjia Zheng, Sijie Mai, Ya Sun, Haifeng Hu, Yuedong Yang

Summary: Link prediction for knowledge graphs aims to predict missing connections between entities. Prevailing methods are limited to a transductive setting and hard to process unseen entities. The recently proposed subgraph-based models provide alternatives to predict links from the subgraph structure surrounding a candidate triplet.

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2023)

Article Biochemical Research Methods

Identifying B-cell epitopes using AlphaFold2 predicted structures and pretrained language model

Yuansong Zeng, Zhuoyi Wei, Qianmu Yuan, Sheng Chen, Weijiang Yu, Yutong Lu, Jianzhao Gao, Yuedong Yang

Summary: Drawing on the breakthrough of AlphaFold2 in protein structure prediction, we propose a novel graph-based model, GraphBepi, for accurate B-cell epitope prediction. By utilizing the predicted structure from AlphaFold2, GraphBepi constructs the protein graph and captures both sequence and spatial information through edge-enhanced deep graph neural networks (EGNN) and bidirectional long short-term memory neural networks (BiLSTM). The combined representations are input into a multilayer perceptron to predict B-cell epitopes. Comprehensive tests demonstrate that GraphBepi outperforms state-of-the-art methods in terms of AUC and AUPR.

BIOINFORMATICS (2023)

Article Biochemistry & Molecular Biology

Biological informed graph neural network for tumor mutation burden prediction and immunotherapy-related pathway analysis in gastric cancer

Chuwei Liu, Arabella H. Wan, Heng Liang, Lei Sun, Jiarui Li, Ranran Yang, Qinghai Li, Ruibo Wu, Kunhua Hu, Yuedong Yang, Shirong Cai, Guohui Wan, Weiling He

Summary: Tumor mutation burden (TMB) is an important biomarker for assessing the efficacy of cancer immunotherapy, but its correlation with immune checkpoint inhibitors (ICIs) responsiveness varies among different cancer types. This study explores the relationship between TMB and multi-omics data in various cancer types and develops the PGLCN model to improve the interpretability and prediction accuracy of TMB. By integrating multi-omics data, the PGLCN model outperforms traditional machine learning methods in predicting TMB status and identifies potential combined biomarkers for TMB in gastric cancer.

COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL (2023)

Article Radiology, Nuclear Medicine & Medical Imaging

Machine learning on MRI radiomic features: identification of molecular subtype alteration in breast cancer after neoadjuvant therapy

Hai-Qing Liu, Si-Ying Lin, Yi-Dong Song, Si-Yao Mai, Yue-Dong Yang, Kai Chen, Zhuo Wu, Hui-Ying Zhao

Summary: This study developed a machine learning model based on MRI to predict molecular subtype alterations in breast cancer after neoadjuvant therapy. The model showed favorable predictive efficacy in identifying molecular subtype alteration and could be a useful tool in clinical practice.

EUROPEAN RADIOLOGY (2023)
