Article

A QSTR-Based Expert System to Predict Sweetness of Molecules

Journal

FRONTIERS IN CHEMISTRY
Volume 5

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fchem.2017.00053

Keywords

sweetness; QSAR; molecular descriptors; classification; expert system

Funding

  1. National Secretary of Higher Education, Science, Technology and Innovation (SENESCYT) from the Republic of Ecuador

This work describes a novel approach based on advanced molecular similarity to predict the sweetness of chemicals. The proposed Quantitative Structure-Taste Relationship (QSTR) model is an expert system developed in accordance with the five principles defined by the Organization for Economic Co-operation and Development (OECD) for the validation of (Q)SARs. The 649 sweet and non-sweet molecules were described by both conformation-independent extended-connectivity fingerprints (ECFPs) and molecular descriptors. In particular, molecular similarity in the ECFP space showed a clear association with molecular taste and was exploited for model development. Molecules lying in the subspaces where taste assignment was more difficult were modeled through a consensus between linear and local approaches (Partial Least Squares-Discriminant Analysis and an N-nearest-neighbor classifier). The expert system, which was thoroughly validated through a Monte Carlo procedure and an external set, gave satisfactory results in comparison with state-of-the-art models. Moreover, the QSTR model can be used to gain a greater understanding of the relationship between molecular structure and sweetness, and to support the design of novel sweeteners.
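The similarity-plus-consensus idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' expert system: the fingerprints are hypothetical sets of "on" bit indices rather than real ECFPs, a plain k-nearest-neighbor majority vote stands in for the N-nearest-neighbor classifier, the linear (PLS-DA) prediction is a hard-coded placeholder, and the "agree or leave unassigned" rule is an assumption made only for illustration.

```python
from collections import Counter


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Jaccard-Tanimoto similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def knn_predict(query: frozenset, training: list, k: int = 3) -> str:
    """Majority vote among the k training fingerprints most similar to the query."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def consensus(pred_a: str, pred_b: str) -> str:
    """Return the shared label when both classifiers agree, else 'not assigned'."""
    return pred_a if pred_a == pred_b else "not assigned"


# Hypothetical toy training set: (on-bit indices, taste label).
TRAINING = [
    (frozenset({1, 2, 3, 4}), "sweet"),
    (frozenset({1, 2, 3, 5}), "sweet"),
    (frozenset({2, 3, 4, 6}), "sweet"),
    (frozenset({7, 8, 9}), "non-sweet"),
    (frozenset({7, 8, 10}), "non-sweet"),
    (frozenset({8, 9, 11}), "non-sweet"),
]

query = frozenset({1, 2, 3, 6})
local_label = knn_predict(query, TRAINING, k=3)  # local, similarity-based approach
linear_label = "sweet"  # placeholder for the output of a linear model such as PLS-DA
print(consensus(local_label, linear_label))
```

In this sketch a molecule receives a label only when the local and linear predictions coincide; how the published system combines the two classifiers, and which similarity thresholds it uses, is not stated in the abstract.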
