4.8 Article

Novel approach for selecting the best predictor for identifying the binding sites in DNA binding proteins

Journal

NUCLEIC ACIDS RESEARCH
Volume 41, Issue 16, Pages 7606-7614

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkt544

Funding

  1. DST grant, Government of India [SR/SO/BB-0036/2011]
  2. ICGEB
  3. Department of Science and Technology, Government of India
  4. Oxford University Press

Protein-DNA complexes play vital roles in many cellular processes through the interactions of amino acids with DNA. Several computational methods have been developed for predicting the interacting residues in DNA-binding proteins using sequence and/or structural information. These methods show different levels of accuracy, which may depend on the choice of data sets used in training, the feature sets selected for developing a predictive model, the ability of the models to capture information useful for prediction or a combination of these factors. In many cases, different methods are likely to produce similar results, whereas in others, the predictors may return contradictory predictions. In this situation, a priori estimates of prediction performance applicable to the system being investigated would help biologists choose the best method for designing their experiments. In this work, we have constructed unbiased, stringent and diverse data sets for DNA-binding proteins based on various biologically relevant considerations: (i) seven structural classes, (ii) 86 folds, (iii) 106 superfamilies, (iv) 194 families, (v) 15 binding motifs, (vi) single/double-stranded DNA, (vii) DNA conformation (A, B, Z, etc.), (viii) three functions and (ix) disordered regions. These data sets were culled to be non-redundant at sequence identity cutoffs of 25% and 40% and used to evaluate the performance of 11 different methods for which online services or standalone programs are available. We observed that the best-performing methods for each of the data sets showed significant biases toward the data sets selected for their benchmark. Our analysis revealed important data set features, which could be used to estimate these context-specific biases and hence suggest the best method to be used for a given problem. We have developed a web server, which considers these features on demand and displays the best method that the investigator should use.
The web server is freely available at http://www.biotech.iitm.ac.in/DNA-protein/. Further, we have grouped the methods based on their complexity and analyzed the performance. The information gained in this work could be effectively used to select the best method for designing experiments.
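
As a rough illustration of the selection idea described in the abstract, the sketch below picks, for each data-set category, the method with the highest benchmark score. Everything in it is an assumption made for illustration: the category labels, method names, scores and the `best_method_per_category` helper are hypothetical placeholders, not the authors' web server logic or their reported results.

```python
from typing import Dict

# Hypothetical benchmark table: category -> {method name -> score}.
# All names and numbers used with this type are illustrative placeholders.
Scores = Dict[str, Dict[str, float]]


def best_method_per_category(scores: Scores) -> Dict[str, str]:
    """Return the top-scoring method for each data-set category."""
    best: Dict[str, str] = {}
    for category, by_method in scores.items():
        if not by_method:
            continue  # skip categories for which no method was evaluated
        # max over method names, ranked by their score in this category
        best[category] = max(by_method, key=by_method.get)
    return best


if __name__ == "__main__":
    # Placeholder values only; they do not come from the paper.
    example: Scores = {
        "category_a": {"method_1": 0.61, "method_2": 0.70},
        "category_b": {"method_1": 0.82, "method_2": 0.55},
    }
    for category, method in best_method_per_category(example).items():
        print(f"{category}: {method}")
```

A lookup of this kind only makes sense when the scores were obtained on data sets independent of each method's own training set, which is the bias the abstract highlights.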

Recommended

Article Pharmacology & Pharmacy

Mupirocin-Loaded Chitosan Microspheres Embedded in Piper betle Extract Containing Collagen Scaffold Accelerate Wound Healing Activity

Mansi Budhiraja, Sobiya Zafar, Sohail Akhter, Majed Alrobaian, Md Abdur Rashid, Md. Abul Barkat, Sarwar Beg, Farhan J. Ahmad

Summary: This study developed a combinational drug delivery system by embedding mupirocin-loaded chitosan microspheres in a collagen scaffold containing Piper betle extract, aiming to improve wound healing. The results showed that this method effectively promoted wound healing and exhibited good antibacterial effects.

AAPS PHARMSCITECH (2022)

Article Food Science & Technology

Gold Nanoparticle-based Sensors in Food Safety Applications

Sarushi Rastogi, Vinita Kumari, Vasudha Sharma, F. J. Ahmad

Summary: The use of gold nanoparticles (AuNP) in food safety is crucial due to their unique properties. Nanosensors are emerging as key technologies to detect various contaminants in food industries. Research is ongoing in the food industry to explore the applications of AuNP-based nanosensors and nanobiosensors.

FOOD ANALYTICAL METHODS (2022)

Article Immunology

High-Throughput B Cell Epitope Determination by Next-Generation Sequencing

Lauren M. Walker, Andrea R. Shiakolas, Rohit Venkat, Zhaojing Ariel Liu, Steven Wall, Nagarajan Raju, Kelsey A. Pilewski, Ian Setliff, Amyn A. Murji, Rebecca Gillespie, Nigel A. Makoah, Masaru Kanekiyo, Mark Connors, Lynn Morris, Ivelin S. Georgiev

Summary: The development of novel technologies for discovering human monoclonal antibodies has been extremely valuable in combating infectious diseases. LIBRA-seq with epitope mapping is a next-generation sequencing technology that can determine residue-level epitopes for thousands of single B cells simultaneously, making it an efficient tool for high-throughput identification of antibodies against specific antigen epitopes.

FRONTIERS IN IMMUNOLOGY (2022)

Article Biochemical Research Methods

Ab-CoV: a curated database for binding affinity and neutralization profiles of coronavirus-related antibodies

Puneet Rawat, Divya Sharma, R. Prabakaran, Fathima Ridha, Mugdha Mohkhedkar, Vani Janakiraman, M. Michael Gromiha

Summary: Ab-CoV is a database containing manually curated experimental interaction profiles of 1780 coronavirus-related neutralizing antibodies. It provides comprehensive data including IC50, EC50, and K_D, as well as predicted changes in stability and affinity of point mutations of interface residues.

BIOINFORMATICS (2022)

Article Cell Biology

Functional HIV-1/HCV cross-reactive antibodies isolated from a chronically co-infected donor

Kelsey A. Pilewski, Steven Wall, Simone I. Richardson, Nelia P. Manamela, Kaitlyn Clark, Tandile Hermanus, Elad Binshtein, Rohit Venkat, Giuseppe A. Sautto, Kevin J. Kramer, Andrea R. Shiakolas, Ian Setliff, Jordan Salas, Rutendo E. Mapengo, Naveen Suryadevara, John R. Brannon, Connor J. Beebout, Rob Parks, Nagarajan Raju, Nicole Frumento, Lauren M. Walker, Emilee Friedman Fechter, Juliana S. Qin, Amyn A. Murji, Katarzyna Janowska, Bhishem Thakur, Jared Lindenberger, Aaron J. May, Xiao Huang, Salam Sammour, Priyamvada Acharya, Robert H. Carnahan, Ted M. Ross, Barton F. Haynes, Maria Hadjifrangiskou, James E. Crowe Jr, Justin R. Bailey, Spyros Kalams, Lynn Morris, Ivelin S. Georgiev

Summary: In a study of a chronically HIV-1/HCV co-infected individual, researchers identified five cross-reactive antibodies that show exceptional neutralization breadth and effector functions against both HIV-1 and HCV. One antibody also cross-reacts with influenza and coronaviruses, including SARS-CoV-2. The development of these antibodies is closely related to somatic hypermutation, providing potential directions for therapeutic and vaccine development against current and emerging infectious diseases. Chronic co-infection represents a complex immunological challenge that can provide insights into the fundamental rules of antibody-antigen specificity.

CELL REPORTS (2023)

Article Instruments & Instrumentation

Precision engineering designed phospholipid-tagged pamidronate complex functionalized SNEDDS for the treatment of postmenopausal osteoporosis

Pavitra Solanki, Mohd Danish Ansari, Mohd Iqbal Alam, Mohd Aqil, Farhan J. Ahmad, Yasmin Sultana

Summary: This study developed an orally effective nanoformulation of disodium pamidronate for the treatment of osteoporosis. Through rational design and optimization, the authors produced a self-nano-emulsifying drug delivery system (SNEDDS) with commercial potential, which showed improved oral bioavailability and enhanced anti-osteoporotic activity. The results are relevant to the treatment of postmenopausal osteoporosis and may encourage the use of nanotherapeutic drug delivery based on emerging biodegradable carriers.

DRUG DELIVERY AND TRANSLATIONAL RESEARCH (2023)

Letter Biotechnology & Applied Microbiology

Comment on 'Thermodynamic database supports deciphering protein-nucleic acid interactions'

M. Michael Gromiha, Kannan Harini

Summary: Mei and colleagues introduced PNATDB, a thermodynamic database for protein-nucleic acid interactions with 12,635 experimentally determined parameters. They claimed that extracting data from existing databases is challenging. However, they did not discuss ProNAB, which contains over 20,000 experimental data points for binding affinities of protein-nucleic acid complexes and other information.

TRENDS IN BIOTECHNOLOGY (2023)

Article Genetics & Heredity

Understanding Drug Resistance of Wild-Type and L38HL Insertion Mutant of HIV-1 C Protease to Saquinavir

Sankaran Venkatachalam, Nisha Murlidharan, Sowmya R. Krishnan, C. Ramakrishnan, Mpho Setshedi, Ramesh Pandian, Debmalya Barh, Sandeep Tiwari, Vasco Azevedo, Yasien Sayed, M. Michael Gromiha

Summary: AIDS is a challenging infectious disease with a need for understanding drug resistance mechanisms. A new double-insertion mutation (L38HL) in HIV subtype C protease was investigated for its potential in inducing drug resistance towards the protease inhibitor Saquinavir (SQV). Computational techniques revealed that the L38HL mutation increased flexibility in certain regions and decreased binding affinity of SQV compared to wild-type. The mutation also resulted in a wide opening at the binding site and altered flap dynamics, leading to decreased interactions with the binding site and a potential drug resistance phenotype.

Article Chemistry, Medicinal

AutoPLP: A Padlock Probe Design Pipeline for Zoonotic Pathogens

Sowmya Ramaswamy Krishnan, Ruben R. G. Soares, Narayanan Madaboosi, M. Michael Gromiha

Summary: The emergence of new zoonotic infections among humans has increased the burden on global healthcare systems to control their spread. To address this, a novel and integrated PLP design pipeline called AutoPLP has been developed, which can automate the probe design process for a diverse pathogen panel of interest.

ACS INFECTIOUS DISEASES (2023)

Article Biochemical Research Methods

TMKit: a Python interface for computational analysis of transmembrane proteins

Jianfeng Sun, Arulsamy Kulandaisamy, Jinlong Ru, M. Michael Gromiha, Adam P. Cribbs

Summary: TMKit is an open-source Python programming interface specifically designed for processing transmembrane protein data. It includes tools for database wrangling, feature engineering, and protein-protein interaction visualization. Additionally, it offers the high-performance computing library seqNetRR for fast construction of residue connections and allocation of correlation matrix-based features. TMKit serves as a useful tool for researchers studying transmembrane protein sequences and structures.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biochemistry & Molecular Biology

Analysis and Prediction of Pathogen Nucleic Acid Specificity for Toll-like Receptors in Vertebrates

Anuja Jain, Tina Begum, Shandar Ahmad

Summary: Identifying the molecular features of host Toll-like receptors (TLRs), which are responsible for sensing pathogen nucleic acids, is important for understanding host defense mechanisms. We found that these features directly correlate with the strand specificity of the pathogen nucleic acids, but cannot fully explain the selectivity of pathogenic molecular patterns. Using machine learning, we developed a model that accurately predicts the strand specificity of TLRs based on protein-derived features.

JOURNAL OF MOLECULAR BIOLOGY (2023)

Article Genetics & Heredity

Investigating Neuron Degeneration in Huntington's Disease Using RNA-Seq Based Transcriptome Study

Nela Pragathi Sneha, S. Akila Parvathy Dharshini, Y.-H. Taguchi, M. Michael Gromiha

Summary: In this study, the relationship between genetic variants and differentially expressed genes/transcripts in the BA4 region of Huntington's disease patients was investigated. The study identified variants that regulate gene expression and highlighted variants affecting miRNA and its targets. Co-expression network analysis revealed the role of novel genes, while function interaction network analysis showed the importance of genes involved in vesicle-mediated transport. The study also emphasized the crucial role of genes expressed in immune cells in reducing neuron death in Huntington's disease.

Proceedings Paper Computer Science, Artificial Intelligence

Sample Size Estimation for Effective Modelling of Classification Problems in Machine Learning

Neha Vinayak, Shandar Ahmad

Summary: Machine learning models need large amounts of high-quality data. This study examines how much data is required, using low-dimensional data sets and random forest as the classification model. The authors provide an initial estimate of the optimal data size and observe that ML models can still perform well even with some class label errors.

ADVANCED NETWORK TECHNOLOGIES AND INTELLIGENT COMPUTING, ANTIC 2022, PT II (2023)

Article Computer Science, Information Systems

On Performance and Calibration of Natural Gradient Langevin Dynamics

Hanif Amal Robbani, Alhadi Bustamam, Risman Adnan, Shandar Ahmad

Summary: This paper proposes an EKFAC preconditioned SGLD algorithm (EKSGLD), which improves the optimization process and combines the advantages of second-order optimization and the approximate Bayesian method. Experimental results show that EKSGLD outperforms existing preconditioning methods in terms of predictive accuracy and calibration.

IEEE ACCESS (2023)

Article Biochemistry & Molecular Biology

MPA-Pred: A machine learning approach for predicting the binding affinity of membrane protein-protein complexes

Fathima Ridha, M. Michael Gromiha

Summary: Membrane protein-protein interactions are crucial for cellular functions. This study collected experimental data of membrane protein-protein complexes and derived features to understand the factors influencing binding affinity. A machine learning method, MPA-Pred, was developed to predict the binding affinity and showed high accuracy in the prediction.

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS (2023)
