Article
Chemistry, Multidisciplinary
Hiromasa Kaneko
Summary: This paper proposes a method based on SELFIES for molecular descriptors, structure generation, and inverse QSAR/QSPR. By converting SELFIES into SELFIES descriptors x, an inverse analysis of the QSAR/QSPR model y = f(x) is conducted, obtaining x values that achieve the target y value and successfully generating SELFIES strings or molecules.
Article
Chemistry, Multidisciplinary
Natalia Piekus-Slomka, Mariusz Zapadka, Bogumila Kupcewicz
Summary: A quantitative structure-activity relationship (QSAR) model using multiple linear regression (MLR) was developed to predict the inhibitory ability of methyl and/or methylthio trans-stilbene derivatives on CYP1B1. The model showed good validation parameters and a reliable prediction range. Specific molecular descriptors were identified to predict the inhibitory activity, and the impact of molecule partitioning on descriptors was explored.
ARABIAN JOURNAL OF CHEMISTRY
(2022)
Article
Biochemistry & Molecular Biology
Mariusz Zapadka, Przemyslaw Dekowski, Bogumila Kupcewicz
Summary: Among drug design methods, using molecular descriptors for quantitative structure-activity relationships (QSAR) shows potential for predicting molecular structures with specific pharmacological activity. However, interpreting QSAR models can be challenging due to the complexity of molecular descriptors. This study focuses on interpreting the H-GETAWAY descriptor for 4-thiazolidinone derivatives with antitrypanosomal activity, offering insights into the impact of molecular features on H-GETAWAY values.
INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES
(2022)
Article
Agriculture, Multidisciplinary
Alondra M. Idrovo-Encalada, Ana M. Rojas, Eliana N. Fissore, Piercosimo Tripaldi, Reinaldo Pis Diez, Cristian Rojas
Summary: This article introduces a quantitative structure-activity relationship (QSAR) method for investigating antioxidant activity and establishes a database of 165 compounds for predicting new antioxidant compounds. A multiple linear regression model built with five descriptors achieves satisfactory prediction performance, and a mechanistic interpretation and applicability domain are provided.
JOURNAL OF THE SCIENCE OF FOOD AND AGRICULTURE
(2023)
Article
Biochemistry & Molecular Biology
Wissal Liman, Mehdi Oubahmane, Ismail Hdoufane, Imane Bjij, Didier Villemin, Rachid Daoud, Driss Cherqaoui, Achraf El Allali
Summary: This study develops QSAR models and uses them to identify new potential compounds for possible medical use against HCV. These results are of great importance for accelerating the discovery of new drugs against HCV.
Article
Chemistry, Multidisciplinary
Kevin Bonnot, Pierre Benoit, Sophie Hoyau, Laure Mamy, Dominique Patureau, Remi Servien, Mathias Rapacioli, Fabienne Bessac
Summary: The quantitative structure-activity relationship (QSAR) methodology predicts unknown environmental data from molecular descriptors. Quantum-based 3D descriptors are considered promising tools for predicting macroscopic environmental properties. A comparison of computational chemistry strategies on a set of pharmaceuticals and personal care products shows that a comprehensive conformational search and an accurate potential for local quenches are necessary. Accurate and tractable quantum-based 3D descriptors can be calculated using a specific method.
Review
Biochemistry & Molecular Biology
Paula Carracedo-Reboredo, Jose Linares-Blanco, Nereida Rodriguez-Fernandez, Francisco Cedron, Francisco J. Novoa, Adrian Carballal, Victor Maojo, Alejandro Pazos, Carlos Fernandez-Lozano
Summary: In recent years, machine learning techniques have been widely used in drug discovery to improve efficiency and reduce costs. To achieve the goals set by the Precision Medicine initiative, higher requirements have been proposed for the robustness, standardization, and reproducibility of computational methods.
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL
(2021)
Article
Computer Science, Artificial Intelligence
Andrea Mastropietro, Giuseppe Pasculli, Juergen Bajorath
Summary: In drug design, graph neural networks (GNNs) are used to predict compound potency by analyzing graph representations of protein-ligand interactions. The study reveals that GNNs are influenced by ligand memorization during learning and certain GNN architectures prioritize interaction information for predicting high affinities. While GNNs do not comprehensively account for protein-ligand interactions and physical reality, they provide a helpful balance between ligand memorization and learning of interaction patterns.
NATURE MACHINE INTELLIGENCE
(2023)
Article
Computer Science, Interdisciplinary Applications
Dingchao Fan, Ke Xue, Yangyang Liu, Wenguang Zhu, Yusen Chen, Peizhe Cui, Shiqin Sun, Jianguang Qi, Zhaoyou Zhu, Yinglong Wang
Summary: This study proposed a method that combines hybrid molecular descriptors with a deep convolutional neural network (DCNN) to model the toxicity of ionic liquids (ILs) and evaluate their rational application. A toxicity dataset of ILs against the leukemia rat cell line IPC-81 was collected, and the DCNN model was optimized using Bayesian optimization and a local search algorithm. The results showed that the proposed model had satisfactory prediction accuracy, providing guidance for IL screening and rational industrial application.
COMPUTERS & CHEMICAL ENGINEERING
(2023)
Article
Chemistry, Multidisciplinary
Min Wei, Xudong Zhang, Xiaolin Pan, Bo Wang, Changge Ji, Yifei Qi, John Z. H. Zhang
Summary: This study developed a classification model for evaluating the human oral bioavailability (HOB) of drug molecules based on the consensus predictions of five random forest models. The model showed excellent prediction accuracy on two independent test sets and identified the main molecular descriptors affecting the HOB class. The model is available as a web server for quick assessment of oral bioavailability for small molecules.
JOURNAL OF CHEMINFORMATICS
(2022)
Article
Environmental Sciences
Katha S. Hirpara, Upendra D. Patel
Summary: Electrochemical oxidation is an efficient method for dye removal in wastewater. The experimental conditions and dye structure greatly affect the degradation extent. Quantitative structure-activity relationship (QSAR) models can be used to predict the extent of degradation conveniently. In this study, published experimental data on electrochemical oxidation of aqueous dyes were normalized and used to develop QSAR models for percent color and COD removal. The developed models are simple, interpretable, and transparent.
ENVIRONMENTAL TECHNOLOGY
(2023)
Article
Chemistry, Physical
Nilima Rani Das, Sneha Prabha Mishra, P. Ganga Raju Achary
Summary: This study constructed QSAR models for predicting pEC50(M) for the A(2A) adenosine receptor using different descriptors, showing satisfactory performance. The models were evaluated using statistical parameters and validation techniques, demonstrating stable performance.
JOURNAL OF MOLECULAR STRUCTURE
(2021)
Article
Biochemistry & Molecular Biology
Shola Elijah Adeniji, David Ebuka Arthur, Mustapha Abdullahi, Ayuba Abdullahi, Fabian Audu Ugbe
Summary: This study uses modeling techniques to predict the inhibitory activities of drugs against multidrug-resistant tuberculosis. Multiple regression and genetic function approximation were used to create a model with topological descriptors. Molecular docking was used to evaluate the interactions between compounds and the target protein. Compound 20 and compound 20j were found to have prominent activity and can be used as reference structures for designing anti-tuberculosis drugs.
JOURNAL OF BIOMOLECULAR STRUCTURE & DYNAMICS
(2022)
Article
Mathematics, Applied
F. Revuelta, F. J. Arranz, R. M. Benito, F. Borondo
Summary: Using Lagrangian descriptors, we identify the phase-space structures responsible for the chaotic dynamics observed in the KCN molecular system. The vibrational dynamics of this molecule are strongly influenced by the invariant manifolds associated with a specific stretching periodic orbit. Additionally, we analyze the representation of these invariant manifolds on a Poincaré surface of section and find that its intricate depiction is a result of the complex behavior of the manifolds caused by strong anharmonicities in the potential energy surface.
COMMUNICATIONS IN NONLINEAR SCIENCE AND NUMERICAL SIMULATION
(2023)
Article
Chemistry, Medicinal
Hamid Safizadeh, Scott W. Simpkins, Justin Nelson, Sheena C. Li, Jeff S. Piotrowski, Mami Yoshimura, Yoko Yashiroda, Hiroyuki Hirano, Hiroyuki Osada, Minoru Yoshida, Charles Boone, Chad L. Myers
Summary: The study systematically benchmarked 11 different molecular fingerprint encodings combined with 13 different similarity coefficients using chemical-genetic interaction data from yeast as a proxy for biological activity. It found that the all-shortest path fingerprints paired with the Braun-Blanquet similarity coefficient provided superior performance across different compound collections. Additionally, a machine learning pipeline based on support vector machines offered a fivefold improvement relative to the best unsupervised approach, indicating the potential of using high-dimensional chemical-genetic data for improving prediction of biological functions from chemical structures.
JOURNAL OF CHEMICAL INFORMATION AND MODELING
(2021)
Article
Environmental Sciences
Veronica Pinos-Velez, Giuliana S. Araujo, Gabriel M. Moulatlet, Andres Perez-Gonzalez, Isabel Cipriani-Avila, Piercosimo Tripaldi, Mariana Capparelli
Summary: The toxic effects of emerging contaminants, particularly in mixtures, were assessed by testing triclosan (T), 17 beta-estradiol (E2), sulfamethoxazole (SMX), and nicotine (N), individually and in composite mixtures, on neonates of Daphnia magna. When tested individually, T and N showed high toxicity (100% immobility each), while SMX and E2 exhibited lower toxicity (2.5% and 10% immobility, respectively). In mixture exposures, T made the highest contribution to overall toxicity. However, the toxicity of N decreased at four times the concentration (85% immobility). These findings serve as a warning about the use and release of T and N in the aquatic ecosystem due to their high toxicity, both individually and in mixtures.
BULLETIN OF ENVIRONMENTAL CONTAMINATION AND TOXICOLOGY
(2023)
Article
Biochemistry & Molecular Biology
Fabio Gosetti, Viviana Consonni, Davide Ballabio, Marco Emilio Orlandi, Angelo Amodio, Maria Valeria Picci, Marco Visentin, Veronica Termopoli
Summary: According to the 2021 World Drug Report, there are approximately 275 million drug users and 36 million people suffering from addiction worldwide, leading to a flourishing illicit drug market. In Italy, a significant number of people (30,083) were reported for violating the Italian Law D.P.R. 309/1990. However, the forensic laboratories in Italy face challenges in completing drug analyses quickly, causing delays in the reporting process for sentencing. To address this issue, a UHPLC-MS/MS-based platform was developed at the University of Milano-Bicocca, which enables law enforcement authorities to manage street seizures and generate accurate reports efficiently.
Article
Multidisciplinary Sciences
Michael Moret, Irene Pachon Angona, Leandro Cotos, Shen Yan, Kenneth Atz, Cyrill Brunner, Martin Baumgartner, Francesca Grisoni, Gisbert Schneider
Summary: Generative chemical language models (CLMs) can be used to generate new molecular structures from a textual representation. Hybrid CLMs can also leverage the bioactivity information available for the training compounds. In this study, a virtual compound library was created using a generative CLM and refined using a CLM-based classifier for bioactivity prediction. A new PI3K gamma ligand with sub-micromolar activity was identified, highlighting the potential of hybrid CLMs for molecular design.
NATURE COMMUNICATIONS
(2023)
Review
Biochemistry & Molecular Biology
R. Ozcelik, D. van Tilborg, J. Jimenez-Luna, F. Grisoni
Summary: Artificial intelligence (AI) in the form of deep learning is promising for drug discovery and chemical biology, especially in protein structure prediction, organic synthesis planning, and molecule design. While most efforts have focused on ligand-based approaches, structure-based drug discovery has the potential to address unsolved challenges such as affinity prediction for new protein targets and understanding chemical kinetic properties. Advances in deep learning methodologies and accurate protein structure predictions support a resurgence in structure-based approaches guided by AI. This review summarizes key algorithmic concepts in structure-based deep learning for drug discovery and discusses future opportunities, applications, and challenges.
Article
Biochemistry & Molecular Biology
Francesca Grisoni
Summary: Generative deep learning is revolutionizing de novo drug design by enabling the generation of molecules with specific properties. Chemical language models, which use deep learning to generate new molecules as strings, have been remarkably successful in this endeavor. With advances in natural language processing and interdisciplinary collaborations, chemical language models are expected to play a key role in the future of drug discovery.
CURRENT OPINION IN STRUCTURAL BIOLOGY
(2023)
Correction
Chemistry, Medicinal
Derek van Tilborg, Alisa Alenicheva, Francesca Grisoni
JOURNAL OF CHEMICAL INFORMATION AND MODELING
(2023)
Article
Chemistry, Medicinal
Marco Ballarotto, Sabine Willems, Tanja Stiller, Felix Nawa, Julian A. A. Marschner, Francesca Grisoni, Daniel Merk
Summary: Generative neural networks trained on SMILES can design innovative bioactive molecules de novo. These models have usually been fine-tuned on template molecules but it is challenging to apply them to orphan targets with few known ligands.
JOURNAL OF MEDICINAL CHEMISTRY
(2023)
Article
Chemistry, Multidisciplinary
See Yoong Wong, Andrew L. Hook, Wil Gardner, Chien-Yi Chang, Ying Mei, Martyn C. Davies, Paul Williams, Morgan R. Alexander, Davide Ballabio, Benjamin W. Muir, David A. Winkler, Paul J. Pigram
Summary: Biofilm formation is a major problem in hospitals, and researching biofilm-resistant materials is critical. Polymer microarrays can efficiently discover new biofilm-resistant polymers. This study investigates bacterial attachment and surface chemistry on a polymer microarray to better understand Pseudomonas aeruginosa biofilm formation. Analyzing the data using linear multivariate analysis and a nonlinear self-organizing map reveals fragment ions associated with bacterial biofilm formation. Considering these insights, a second analysis is conducted that explicitly considers interactions between key fragments. This improved model provides chemical insights for designing materials that prevent pathogen attachment.
ADVANCED MATERIALS INTERFACES
(2023)
Review
Chemistry, Analytical
Veronica Termopoli, Maurizio Piergiovanni, Davide Ballabio, Viviana Consonni, Enmanuel Cruz Munoz, Fabio Gosetti
Summary: Membrane introduction mass spectrometry (MIMS) is a direct mass spectrometry technique that allows online monitoring and rapid quantification of trace levels of compounds in complex matrices without extensive sample preparation steps and chromatographic separation. It uses a thin, semi-permeable, and selective membrane to pre-concentrate analytes based on their physicochemical properties, and transfers them directly to the mass spectrometer using different acceptor phases.
Review
Food Science & Technology
Cristian Rojas, Davide Ballabio, Viviana Consonni, Diego Suarez-Estrella, Roberto Todeschini
Summary: The ability to distinguish safe and dangerous compounds is crucial for the evolution of species, including humans. Taste receptors, such as taste buds, provide valuable information about the substances we consume orally. Classification-based machine learning methods can predict the taste of new molecules based on their chemical structure. This review summarizes the history of multicriteria quantitative structure-taste relationship modeling, from the first ligand-based classifier proposed in 1980 to the latest studies in 2022.
FOOD RESEARCH INTERNATIONAL
(2023)
Article
Materials Science, Coatings & Films
Wil Gardner, David A. Winkler, David L. J. Alexander, Davide Ballabio, Benjamin W. Muir, Paul J. Pigram
Summary: In this study, two different ToF-SIMS imaging datasets were used to evaluate the impact of data preprocessing methods and hyperparameters on the performance of self-organizing maps (SOMs). Preprocessing was found to be generally more important than hyperparameter selection, and there are complex interactions between different parameters. The results of this study are important for understanding the effects of data processing on hyperspectral imaging data.
JOURNAL OF VACUUM SCIENCE & TECHNOLOGY A
(2023)
Article
Chemistry, Analytical
Enmanuel Cruz Munoz, Fabio Gosetti, Davide Ballabio, Sergio Ando, Olivia Gomez-Laserna, Jose Manuel Amigo, Eduardo Garzanti
Summary: In this study, a new method based on Raman hyperspectral imaging and chemometrics is proposed for the analysis of chemically heterogeneous surfaces of weathered minerals. The technique is tested using a pyrite sample with a heterogeneous surface consisting of different alteration products. Principal Component Analysis is used to evaluate data structure and identify weathering features, while Multivariate Curve Resolution-alternating least squares and K-means clustering are employed to identify specific chemical components of major and minor weathering phases, respectively. The method enables a semi-quantitative threshold-based characterization of chemical features and provides a visual representation of the phase distribution on the sample surface.
MICROCHEMICAL JOURNAL
(2023)
Review
Biotechnology & Applied Microbiology
Michael W. Mullowney, Katherine R. Duncan, Somayah S. Elsayed, Neha Garg, Justin J. J. van der Hooft, Nathaniel I. Martin, David Meijer, Barbara R. Terlouw, Friederike Biermann, Kai Blin, Janani Durairaj, Marina Gorostiola Gonzalez, Eric J. N. Helfrich, Florian Huber, Stefan Leopold-Messer, Kohulan Rajan, Tristan de Rond, Jeffrey A. van Santen, Maria Sorokina, Marcy J. Balunas, Mehdi A. Beniddir, Doris A. van Bergeijk, Laura M. Carroll, Chase M. Clark, Djork-Arne Clevert, Chris A. Dejong, Chao Du, Scarlet Ferrinho, Francesca Grisoni, Albert Hofstetter, Willem Jespers, Olga V. Kalinina, Satria A. Kautsar, Hyunwoo Kim, Tiago F. Leao, Joleen Masschelein, Evan R. Rees, Raphael Reher, Daniel Reker, Philippe Schwaller, Marwin Segler, Michael A. Skinnider, Allison S. Walker, Egon L. Willighagen, Barbara Zdrazil, Nadine Ziemert, Rebecca J. M. Goss, Pierre Guyomard, Andrea Volkamer, William H. Gerwick, Hyun Uk Kim, Rolf Mueller, Gilles P. van Wezel, Gerard J. P. van Westen, Anna K. H. Hirsch, Roger G. Linington, Serina L. Robinson, Marnix H. Medema
Summary: The developments in computational omics technologies in combination with artificial intelligence approaches have opened up new possibilities for drug discovery. However, addressing key challenges such as high-quality datasets and algorithm validation is essential to realize the potential of these synergies.
NATURE REVIEWS DRUG DISCOVERY
(2023)
Article
Chemistry, Multidisciplinary
Ana Ortiz-Perez, Cristina Izquierdo-Lozano, Rens Meijers, Francesca Grisoni, Lorenzo Albertazzi
Summary: Barcoding is a powerful tool to distinguish multiple targets within a complex mixture and increase assay throughput. While fluorescent barcoding of microparticles is widely used, it is more challenging for nanoparticles due to their small size and heterogeneity. In this study, a machine-learning-assisted workflow was developed to write, read, and classify barcoded PLGA-PEG nanoparticles at a single-particle level.
NANOSCALE ADVANCES
(2023)
Article
Automation & Control Systems
Giacomo Baccolo, Huiwen Yu, Cecile Valsecchi, Davide Ballabio, Rasmus Bro
Summary: Hyphenated chromatography is a popular analytical technique in omics-related research, but extracting relevant information from chromatographic data is challenging. In this study, three classification algorithms were used to automatically identify GC-MS elution profiles resolved by PARAFAC2, and the quality of the input data was found to be crucial for modeling performance. The results suggest that neural networks are the best solution for this specific classification task.
CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS
(2023)