4.4 Article

Enzymes/non-enzymes classification model complexity based on composition, sequence, 3D and topological indices

Journal

JOURNAL OF THEORETICAL BIOLOGY
Volume 254, Issue 2, Pages 476-482

Publisher

ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD
DOI: 10.1016/j.jtbi.2008.06.003

Keywords

protein models; protein secondary structures; star graph; Python application

Funding

  1. the FCT (Portugal) [SFRH/BPD/24997/2005]
  2. University of Santiago de Compostela (Spain)
  3. FSE (Fondo Social Europeo)
  4. Fundação para a Ciência e a Tecnologia [SFRH/BPD/24997/2005] Funding Source: FCT


The large number of new proteins that need fast characterization of their enzymatic activity creates a demand for theoretical protein QSAR models. The protein parameters that can be used for enzyme/non-enzyme classification include the simpler indices such as composition, sequence and connectivity indices (the last also called topological indices, TIs) and the computationally expensive 3D descriptors. A comparison of 3D versus lower-dimension indices has not been reported with respect to their power to discriminate proteins according to enzyme action. A set of 966 proteins (enzymes and non-enzymes) whose structural characteristics are provided by PDB/DSSP files was analyzed with Python/Biopython scripts, STATISTICA and Weka. The list of indices includes, but is not restricted to, pure composition indices (residue fractions), DSSP secondary structure protein composition and 3D indices (surface and access). We also used mixed indices such as composition-sequence indices (Chou's pseudo-amino acid compositions or coupling numbers), 3D-composition indices (surface fractions) and DSSP secondary structure amino acid composition/propensities (obtained with our Prot-2S Web tool). In addition, we extend and test for the first time several classic TIs for Randic's protein sequence Star graphs using our Sequence to Star Graph (S2SG) Python application. All the indices were processed with general discriminant analysis (GDA) models, neural networks (NN) and machine learning (ML) methods, and the results are presented versus complexity, average Shannon's information entropy (Sh) and data/method type. This study compares for the first time all these classes of indices to assess the ratios between model accuracy and indices/model complexity in enzyme/non-enzyme discrimination. The use of different methods and data of different complexity shows that one cannot establish a direct relation between the complexity and the accuracy of the model. (C) 2008 Elsevier Ltd. All rights reserved.
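The simplest index class named in the abstract, pure composition (residue fractions), can be illustrated with a minimal Python sketch. This is not the authors' script or their S2SG/Prot-2S code: the function names `residue_fractions` and `shannon_entropy` are hypothetical, and applying Shannon entropy to the composition vector is an assumption made only for illustration, since the abstract does not say how Sh is computed.

```python
import math
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_fractions(sequence):
    """Return the fraction of each standard amino acid in a sequence.

    Hypothetical helper: non-standard symbols are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def shannon_entropy(fractions):
    """Shannon entropy in bits of a distribution given as {symbol: probability}.

    Assumption: entropy of the composition vector, used here only as an
    example of an information index; zero-probability terms contribute 0.
    """
    return -sum(p * math.log2(p) for p in fractions.values() if p > 0)


if __name__ == "__main__":
    fractions = residue_fractions("ACDA")  # toy sequence, not real data
    print(fractions["A"], shannon_entropy(fractions))
```

A 20-element vector of this kind is cheap to compute from the sequence alone, which is the contrast the abstract draws with the 3D descriptors that require PDB/DSSP structural data.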


Recommended

Article Biochemistry & Molecular Biology

Synthesis, Pharmacological, and Biological Evaluation of 2-Furoyl-Based MIF-1 Peptidomimetics and the Development of a General-Purpose Model for Allosteric Modulators (ALLOPTML)

Ivo E. Sampaio-Dias, Jose E. Rodriguez-Borges, Victor Yanez-Perez, Sonia Arrasate, Javier Llorente, Jose M. Brea, Harbil Bediaga, Dolores Vina, Maria Isabel Loza, Olga Caamano, Xerardo Garcia-Mera, Humberto Gonzalez-Diaz

Summary: The synthesis and pharmacological evaluation of 2-furoyl-based Melanostatin (MIF-1) peptidomimetics as dopamine D-2 modulating agents were described in this work. Peptidomimetic 6a showed promising results without neurotoxicity at high concentrations, making it a potential lead compound for further development. Additionally, the ALLOPTML model, based on perturbation theory and machine learning, demonstrated high specificity, sensitivity, and accuracy in predicting the allosteric modulatory potential of molecular candidates.

ACS CHEMICAL NEUROSCIENCE (2021)

Article Chemistry, Medicinal

Predicting Metabolic Reaction Networks with Perturbation-Theory Machine Learning (PTML) Models

Karel Dieguez-Santana, Gerardo M. Casanola-Martin, James R. Green, Bakhtiyor Rasulev, Humberto Gonzalez-Diaz

Summary: This study developed a CPTML model for MRNs of multiple organisms using Combinatorial Perturbation Theory and Machine Learning techniques, and identified PTML models based on Bayesian network, Decision Tree, and Random Forest algorithms as the three best non-linear models with high accuracy.

CURRENT TOPICS IN MEDICINAL CHEMISTRY (2021)

Editorial Material Chemistry, Medicinal

New Experimental and Computational Tools for Drug Discovery: Part - XI

Humberto Gonzalez-Diaz

CURRENT TOPICS IN MEDICINAL CHEMISTRY (2021)

Article Chemistry, Multidisciplinary

Improving computational modeling coupled with ion mobility-mass spectrometry data for efficient drug metabolite structural determination

Dmytro A. Ivashchenko, Nuno M. F. S. A. Cerqueira, Alexandre L. Magalhaes

Summary: Ion mobility-mass spectrometry, combined with a computational approach, has shown reliable identification of various compounds; while experimental advancements have been made, theoretical improvements have been slow; the work contributes to improving consistency between different theoretical results and between theoretical and experimental values.

STRUCTURAL CHEMISTRY (2021)

Article Chemistry, Physical

Reaction Mechanism of MHETase, a PET Degrading Enzyme

Alexandre V. Pinto, Pedro Ferreira, Rui P. P. Neves, Pedro A. Fernandes, Maria J. Ramos, Alexandre L. Magalhaes

Summary: This study analyzed the reaction mechanism of MHETase and found that it catalyzes the conversion of MHET in two steps, with a rate-limiting step activation barrier of 19.35 kcal/mol. The results supported the hypothesis that a transient tetrahedral intermediate mediates the reaction mechanism.

ACS CATALYSIS (2021)

Article Energy & Fuels

Multi-output chemometrics model for gasoline compounding

Harbil Bediaga, Maria Isabel Moreno, Sonia Arrasate, Jose Luis Vilas, Lucia Orbe, Elias Unzueta, Juan Perez Mercader, Humberto Gonzalez-Diaz

Summary: This study developed an IFPTML model for classifying gasoline samples using Information Fusion, Perturbation Theory, and Machine Learning algorithms, with over 230,000 outcomes from a petroleum refinery plant. The model showed high sensitivity and specificity on training and validation sets, as well as robustness to changes in experimental techniques.

Article Chemistry, Multidisciplinary

Towards rational nanomaterial design by predicting drug-nanoparticle system interaction vs. bacterial metabolic networks

Karel Dieguez-Santana, Bakhtiyor Rasulev, Humberto Gonzalez-Diaz

Summary: This paper introduces an application of information fusion perturbation-theory machine learning method in antibacterial drug-nanoparticle systems. The method accelerates the testing of bacterial sensitivity to different strains and shows good predictive performance. Additionally, the concept of MDR computational surveillance for detecting multidrug-resistant strains is introduced.

ENVIRONMENTAL SCIENCE-NANO (2022)

Article Environmental Sciences

Prediction of acute toxicity of pesticides for Americamysis bahia using linear and nonlinear QSTR modelling approaches

Karel Dieguez-Santana, Manuel Mesias Nachimba-Mayanchi, Amilkar Puris, Roldan Torres Gutierrez, Humberto Gonzalez-Diaz

Summary: This study developed Quantitative Structure-Toxicity Relationship (QSTR) models using multiple statistical models and machine learning algorithms, and found that the Random Forest regression model performed best. The results suggest that the developed QSTR models can reliably predict pesticide toxicity in Americamysis bahia, and can be applied in pesticide screening and prioritization.

ENVIRONMENTAL RESEARCH (2022)

Article Chemistry, Medicinal

Prediction of Antileishmanial Compounds: General Model, Preparation, and Evaluation of 2-Acylpyrrole Derivatives

Carlos Santiago, Bernabe Ortega-Tenezaca, Iratxe Barbolla, Brenda Fundora-Ortiz, Sonia Arrasate, Maria Auxiliadora Dea-Ayuela, Humberto Gonzalez-Diaz, Nuria Sotomayor, Esther Lete

Summary: In this study, the authors used the SOFT.PTML tool to pre-process a ChEMBL dataset of pre-clinical assays of anti-leishmanial compound candidates. They compared different ML algorithms and found that the IFPTML-LOGR model had excellent specificity and sensitivity values. They illustrated the use of the software with a practical case study and identified compounds with potential activity. They also performed a computational high-throughput screening and validated the accuracy of the model.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2022)

Article Medicine, Research & Experimental

Machine Learning Study of Metabolic Networks vs ChEMBL Data of Antibacterial Compounds

Karel Dieguez-Santana, Gerardo M. Casanola-Martin, Roldan Torres, Bakhtiyor Rasulev, James R. Green, Humberto Gonzalez-Diaz

Summary: This study utilized the IFPTML algorithm to analyze a large dataset from the ChEMBL database, investigating the interaction between antibacterial drugs and bacterial metabolic networks. The results showed that both linear and nonlinear models had good statistical parameters and were able to predict antibacterial compounds, potentially leading to the discovery of new metabolic mutations in antibiotic resistance.

MOLECULAR PHARMACEUTICS (2022)

Article Chemistry, Medicinal

A Fuzzy System Classification Approach for QSAR Modeling of α-Amylase and α-Glucosidase Inhibitors

Karel Dieguez-Santana, Amilkar Puris, Oscar M. Rivera-Borroto, Gerardo M. Casanola-Martin, Bakhtiyor Rasulev, Humberto Gonzalez-Diaz

Summary: This study proposes a new machine learning algorithm called FURIA-C for classifying drug-like compounds with antidiabetic inhibitory ability. The algorithm achieved satisfactory accuracy scores and derived fuzzy rules with high Certainty Factor values. Comparison tests showed that FURIA-C outperforms other methods, making it a cutting-edge technique for predicting the inhibitory activity of new compounds and speeding up the discovery of multi-target antidiabetic agents.

CURRENT COMPUTER-AIDED DRUG DESIGN (2022)

Article Chemistry, Physical

Development of Nanoscale Graphene Oxide Models for the Adsorption of Biological Molecules

Alexandre V. Pinto, Pedro Ferreira, Pedro A. Fernandes, Alexandre L. Magalhaes, Maria J. Ramos

Summary: In this paper, a new molecular dynamics (MD) model is proposed to accurately describe the structure of graphene oxide (GO) and its interaction with a solvent and other adsorbate molecules. The new force field parameters are derived through linear-scaling density functional theory calculations, which better reproduce the solvent structure observed in ab initio MD simulations. The effect of ionic strength and the carbon-to-oxygen ratio on the distribution of charges surrounding GO sheets is also analyzed, and the force field is validated by simulating the adsorption of natural amino acid molecules to GO sheets and estimating their free energy of binding.

JOURNAL OF PHYSICAL CHEMISTRY B (2022)

Article Chemistry, Applied

Selective adsorption and separation of light hydrocarbon gases in VI/IV dipeptide crystals

K. Biernacki, J. Lopes, R. Afonso, A. Mendes, L. Gales, A. L. Magalhaes

Summary: The study demonstrates that microporous crystals of L-Isoleucyl-L-Valine and L-Valyl-L-Isoleucine can effectively distinguish between propane and propylene mixtures. Despite having similar pore diameters, L-Valyl-L-Isoleucine shows higher separation selectivity.

MICROPOROUS AND MESOPOROUS MATERIALS (2022)

Article Biology

Machine learning in antibacterial discovery and development: A bibliometric and network analysis of research hotspots and trends

Karel Dieguez-Santana, Humberto Gonzalez-Diaz

Summary: This article utilizes machine learning methods to predict the activity of unknown drugs and discover potential antibacterial drugs. Through a bibliometric study of 1596 Scopus documents from 2006 to 2022, the contributions of leading authors, universities/organizations, and countries are analyzed in terms of productivity, citations, and bibliographic linkage. Essential topics related to the application of machine learning in antibacterial development are identified, and emerging topics are proposed. The applied methodology contributes to a broader and more specific understanding of machine learning research in antibacterial studies for future projects.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Article Chemistry, Multidisciplinary

Towards machine learning discovery of dual antibacterial drug-nanoparticle systems

Karel Dieguez-Santana, Humberto Gonzalez-Diaz

Summary: The study uses Artificial Intelligence and Machine Learning algorithms to accelerate the design of systems composed of antibacterial drugs and nanoparticles, analyzing a large dataset. It trains alternative models with different algorithms and studies the simulated behavior of DADNPs in biological assays.

NANOSCALE (2021)

Article Biology

A framework for relating natural movement to length and quality of life in human and non-human animals

Iain Hunter, Raz Leib

Summary: Natural movement is related to health, but it is difficult to measure. Existing methods cannot capture the full range of natural movement. Comparing movement across different species helps identify common biomechanical and computational principles. Developing a system to quantify movement in freely moving animals in natural environments and relating it to life quality is crucial. This study proposes a theoretical framework based on movement ability and validates it in Drosophila.

JOURNAL OF THEORETICAL BIOLOGY (2024)

Article Biology

A geometric approach to the evolution of altruism

Andy Gardner

Summary: Fisher's geometric model is a useful tool for predicting key properties of Darwinian adaptation, and here it is applied to predict differences between the evolution of altruistic versus nonsocial phenotypes. The results suggest that the effect size maximizing probability of fixation is smaller in the context of altruism and larger in the context of nonsocial phenotypes, leading to lower overall probability of fixation for altruism and higher overall probability of fixation for nonsocial phenotypes.

JOURNAL OF THEORETICAL BIOLOGY (2024)

Article Biology

A mathematical framework for the emergence of winners and losers in cell competition

Thomas F. Pak, Joe Pitt-Francis, Ruth E. Baker

Summary: Cell competition is a process where cells interact in multicellular organisms to determine a winner or loser status, with loser cells being eliminated through programmed cell death. The winner cells then populate the tissue. The outcome of cell competition is context-dependent, as the same cell type can win or lose depending on the competing cell type. This paper proposes a mathematical framework to study the emergence of winner or loser status, highlighting the role of active cell death and identifying the factors that drive cell competition in a cell-based modeling context.

JOURNAL OF THEORETICAL BIOLOGY (2024)

Article Biology

The eco-evolutionary dynamics of Batesian mimicry

Haruto Tomizuka, Yuuya Tachiki

Summary: Batesian mimicry is a strategy in which palatable prey species resemble unpalatable prey species to avoid predation. The evolution of this mimicry plays a crucial role in protecting the unpalatable species from extinction.

JOURNAL OF THEORETICAL BIOLOGY (2024)

Article Biology

Gene drives for the extinction of wild metapopulations

Jason W. Olejarz, Martin A. Nowak

Summary: Gene drive technology shows potential for population control, but its release may have unpredictable consequences. The study suggests that the failure of suppression is a natural outcome, and there are complex dynamics among wild populations.

JOURNAL OF THEORETICAL BIOLOGY (2024)

Article Biology

Intelligent phenotype-detection and gene expression profile generation with generative adversarial networks

Hamid Ravaee, Mohammad Hossein Manshaei, Mehran Safayani, Javad Salimi Sartakhti

Summary: Gene expression analysis is valuable for cancer classification and phenotype identification. IP3G, based on Generative Adversarial Networks, enhances gene expression data and discovers phenotypes in an unsupervised manner. By converting gene expression profiles into images and utilizing IP3G, new phenotype profiles can be generated, improving classification accuracy.

JOURNAL OF THEORETICAL BIOLOGY (2024)

Article Biology

Network-based uncertainty quantification for mathematical models in epidemiology

Beatrix Rahnsch, Leila Taghizadeh

Summary: This study forecasts the evolution of the COVID-19 pandemic in Germany using a network-based inference method and compares it with other approaches. The results show that the network-inference-based approach outperforms other methods in short- to mid-term predictions, even with limited information about the new disease. Furthermore, predictions based on the estimation of the reproduction number in Germany can yield more reliable results with increasing data availability, but still cannot surpass the network-inference-based algorithm.

JOURNAL OF THEORETICAL BIOLOGY (2024)

Article Biology

Dynamics of cell-type transition mediated by epigenetic modifications

Rongsheng Huang, Qiaojun Situ, Jinzhi Lei

Summary: Maintaining tissue homeostasis requires appropriate regulation of stem cell differentiation. Random inheritance of epigenetic states plays a pivotal role in stem cell differentiation. This computational model provides valuable insights into the intricate mechanism governing stem cell differentiation and cell reprogramming, offering a promising path for enhancing the field of regenerative medicine.

JOURNAL OF THEORETICAL BIOLOGY (2024)

Article Biology

Comparative analysis of kinetic realizations of insulin signaling

Patrick Vincent N. Lubenia, Eduardo R. Mendoza, Angelyn R. Lao

Summary: This study compares insulin signaling in healthy and type 2 diabetes states using reaction network analysis. The results show similarities and differences between the two conditions, providing insights into the mechanisms of insulin resistance, including the involvement of other complexes, less restrictive interplay between species, and loss of concentration robustness in GLUT4.

JOURNAL OF THEORETICAL BIOLOGY (2024)

Article Biology

Simulating tumor volume dynamics in response to radiotherapy: Implications of model selection

Nuverah Mohsin, Heiko Enderling, Renee Brady-Nicholls, Mohammad U. Zahid

Summary: Mathematical modeling is crucial in understanding radiobiology and designing treatment approaches in radiotherapy for cancer. This study compares three tumor volume dynamics models and analyzes the implications of model selection. A new metric, the point of maximum reduction of tumor volume (MRV), is introduced to quantify the impact of radiotherapy. The results emphasize the importance of caution in selecting models of response to radiotherapy due to the artifacts imposed by each model.

JOURNAL OF THEORETICAL BIOLOGY (2024)

Article Biology

Pillars of theoretical biology: Biochemical systems analysis, I, II and III

Armindo Salvador

Summary: Michael Savageau's Biochemical Systems Analysis papers have had a significant impact on Systems Biology, generating core concepts and tools. This article provides a brief summary of these papers and discusses the most relevant developments in Biochemical Systems Theory since their publication.

JOURNAL OF THEORETICAL BIOLOGY (2024)