4.6 Article

Classification of alkaloids according to the starting substances of their biosynthetic pathways using graph convolutional neural networks

Journal

BMC BIOINFORMATICS
Volume 20

Publisher

BMC
DOI: 10.1186/s12859-019-2963-6

Keywords

Molecular graph convolutional neural networks; Alkaloids; Metabolic pathways; Deep learning

Funding

  1. Ministry of Education, Culture, Sports, Science, and Technology of Japan [16K07223, 17K00406]
  2. Platform Project for Supporting Drug Discovery and Life Science Research - Japan Agency for Medical Research and Development [18am0101111]
  3. National Bioscience Database Center (NBDC)
  4. NAIST Bigdata Project
  5. JSPS [17H05297]
  6. Grants-in-Aid for Scientific Research [17H05297] Funding Source: KAKEN

Background: Alkaloids, a class of organic compounds that contain nitrogen bases, are mainly synthesized as secondary metabolites in plants and fungi, and they have a wide range of bioactivities. Although there are thousands of compounds in this class, few of their biosynthesis pathways have been fully identified. In this study, we constructed a model to predict their precursors based on a novel kind of neural network called the molecular graph convolutional neural network. Molecular similarity is a crucial metric in the analysis of qualitative structure-activity relationships. However, it is sometimes difficult for current fingerprint representations to efficiently emphasize the features specific to the target problem. It is advantageous to let the model select appropriate features in a data-driven way, so that it extracts more useful information, which substantially influences a classification or regression problem.

Results: In this study, we applied a neural network architecture to an undirected graph representation of molecules. By encoding a molecule as an abstract graph, applying convolution on the graph, and training the weights of the neural network framework, the neural network can optimize feature selection for the training problem. By incorporating the effects of adjacent atoms recursively, graph convolutional neural networks can efficiently extract latent atom features that represent the chemical features of a molecule. To investigate alkaloid biosynthesis, we trained the network to distinguish the precursors of 566 alkaloids, which are almost all of the alkaloids whose biosynthesis pathways are known, and showed that the model could predict starting substances with an averaged accuracy of 97.5%.

Conclusion: We have shown that our model predicts more accurately than the random forest and a general neural network when the variables and fingerprints are not selected, while the performance is comparable when we carefully select 507 variables from 18000 dimensions of descriptors. The prediction of pathways contributes to the understanding of alkaloid synthesis mechanisms, and the application of graph-based neural network models to similar problems in bioinformatics would therefore be beneficial. We applied our model to evaluate the biosynthetic precursors of 12000 alkaloids found in various organisms and found a power-law-like distribution.
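
The recursive aggregation of adjacent-atom effects described in the Results can be illustrated with a minimal sketch. This is not the authors' implementation: the toy C-C-N graph, the one-hot atom features, the fixed weight matrices W1 and W2, and the helper names graph_conv and readout are all assumptions chosen for illustration. A real model would learn the weights by training against the precursor labels.

```python
# Minimal, dependency-free sketch of graph convolution on a molecular graph.
# Everything here (graph, features, weights, names) is a hypothetical example.

def relu(x):
    return x if x > 0.0 else 0.0


def graph_conv(features, neighbors, weights):
    """One convolution step: each atom adds its neighbors' feature vectors
    to its own, then applies a shared linear map followed by ReLU."""
    updated = []
    for atom, own in enumerate(features):
        pooled = list(own)
        for nb in neighbors[atom]:
            pooled = [p + f for p, f in zip(pooled, features[nb])]
        updated.append(
            [relu(sum(w * p for w, p in zip(row, pooled))) for row in weights]
        )
    return updated


def readout(features):
    """Sum atom vectors into one molecule vector (independent of atom order)."""
    return [sum(column) for column in zip(*features)]


# Toy heavy-atom graph C-C-N with one-hot features [is_carbon, is_nitrogen].
atoms = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
bonds = {0: [1], 1: [0, 2], 2: [1]}  # undirected: each bond listed both ways

# Fixed example weights; in practice these are the trained parameters.
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]  # 2 inputs -> 3 channels
W2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]     # 3 inputs -> 2 channels

h1 = graph_conv(atoms, bonds, W1)  # each atom sees its direct neighbors
h2 = graph_conv(h1, bonds, W2)     # each atom now sees atoms two bonds away
molecule_vector = readout(h2)
print(molecule_vector)
```

Stacking two layers is what makes the aggregation recursive: after the second step, the feature vector of the terminal carbon already depends on the nitrogen two bonds away. The summed readout would then feed a classifier over the candidate starting substances.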

Recommended

Article Computer Science, Interdisciplinary Applications

The feasibility of predicting impending malignant ventricular arrhythmias by using nonlinear features of short heartbeat intervals

Zheng Chen, Naoaki Ono, Wei Chen, Toshiyo Tamura, Md Altaf-Ul-Amin, Shigehiko Kanaya, Ming Huang

Summary: The study introduced a method to predict malignant ventricular arrhythmias using signal complexity, which was validated through machine learning model experiments. This research provides important theoretical and practical implications for cardiac arrest prevention.

COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE (2021)

Editorial Material Biology

Recent Trends in Computational Biomedical Research

Md. Altaf-Ul-Amin, Shigehiko Kanaya, Naoaki Ono, Ming Huang

LIFE-BASEL (2022)

Article Biology

Discussion of Cuffless Blood Pressure Prediction Using Plethysmograph Based on a Longitudinal Experiment: Is the Individual Model Necessary?

Koshiro Kido, Zheng Chen, Ming Huang, Toshiyo Tamura, Wei Chen, Naoaki Ono, Masachika Takeuchi, Md. Altaf-Ul-Amin, Shigehiko Kanaya

Summary: This study proposes a method for estimating blood pressure using PPG signal and evaluates its accuracy and robustness through the comparison of different regression models. The results show that an individual Gaussian Process model achieves the best performance, outperforming the generalized model built with all subjects' data.

LIFE-BASEL (2022)

Article Biology

Exploring and Identifying Prognostic Phenotypes of Patients with Heart Failure Guided by Explainable Machine Learning

Xue Zhou, Keijiro Nakamura, Naohiko Sahara, Masako Asami, Yasutake Toyoda, Yoshinari Enomoto, Hidehiko Hara, Mahito Noro, Kaoru Sugi, Masao Moroi, Masato Nakamura, Ming Huang, Xin Zhu

Summary: This study utilized machine learning to identify three phenotypes of heart failure patients, stratifying them based on survival curves and mortality risk effectively. By training on the derivation dataset, these phenotypes were successfully applied to new patients in the validation dataset, with age and creatinine clearance rate identified as the top two most important predictors.

LIFE-BASEL (2022)

Article Engineering, Biomedical

Sleep postures monitoring based on capacitively coupled electrodes and deep recurrent neural networks

Shun Peng, Yang Li, Rui Cui, Ke Xu, Yonglin Wu, Ming Huang, Chenyun Dai, Toshiyo Tamura, Subhas Mukhopadhyay, Chen Chen, Wei Chen

Summary: This study investigates another potential application of cECG in sleep monitoring, specifically sleep posture recognition. By using a classifier model based on a deep recurrent neural network, accurate recognition of different sleep postures was achieved.

BIOMEDICAL ENGINEERING ONLINE (2022)

Article Infectious Diseases

Prediction of Potential Natural Antibiotics Plants Based on Jamu Formula Using Random Forest Classifier

Ahmad Kamal Nasution, Sony Hartono Wijaya, Pei Gao, Rumman Mahfujul Islam, Ming Huang, Naoaki Ono, Shigehiko Kanaya, Md Altaf-Ul-Amin

Summary: Jamu is a traditional Indonesian herbal medicine system that is considered to have many benefits. This study uses a machine learning approach to discover the potential of 14 plants as natural antibiotic candidates.

ANTIBIOTICS-BASEL (2022)

Article Biochemical Research Methods

Sleep Staging Framework with Physiologically Harmonized Sub-Networks

Zheng Chen, Ziwei Yang, Dong Wang, Xin Zhu, Naoaki Ono, M. D. Altaf-Ul-Amin, Shigehiko Kanaya, Ming Huang

Summary: Sleep screening is an important tool in healthcare and neuroscience research. Automatic sleep scoring using deep neural networks shows promising results, but lacks the medical criterion for consistent performance. This paper proposes a framework for sleep stage scoring that captures stage-specific features satisfying sleep medicine criteria. The framework includes feature extraction networks and an attention-based scoring decision network. The proposed method achieves competitive stage scoring performance, especially for Wake, N2, and N3 stages.

METHODS (2023)

Article Chemistry, Analytical

An Advanced Internet of Things System for Heatstroke Prevention with a Noninvasive Dual-Heat-Flux Thermometer

Toshiyo Tamura, Ming Huang, Takumi Yoshimura, Shinjiro Umezu, Toru Ogata

Summary: The study presents the design and prototype of an Internet of Things system for heatstroke prevention. It integrates physiological information, particularly deep body temperature (DBT), using the dual-heat-flux method. A dual-heat-flux thermometer was developed and evaluated for real-time DBT monitoring. Real-time readings are stored on a cloud platform and processed by a decision rule to alert users of heatstroke incidents.

SENSORS (2022)

Article Physiology

A training pipeline of an arrhythmia classifier for atrial fibrillation detection using Photoplethysmography signal

Sota Kudo, Zheng Chen, Xue Zhou, Leighton T. Izu, Ye Chen-Izu, Xin Zhu, Toshiyo Tamura, Shigehiko Kanaya, Ming Huang

Summary: Photoplethysmography (PPG) signal shows potential in atrial fibrillation (AF) detection due to its convenience and physiological similarity to electrocardiogram (ECG). This study proposes a multiple-class classification model for AF detection, taking into consideration individual differences and sub-types in PPG manifestation. The best combination of configurable components in the pipeline includes first-order difference of heartbeat sequence as input format, a 2-layer CNN-1-layer Transformer hybrid model as the learning model, and the whole model fine-tuning as the transfer learning scheme (F1 value: 0.80, overall accuracy: 0.87).

FRONTIERS IN PHYSIOLOGY (2023)

Article Medicine, General & Internal

Risk of Mortality Prediction Involving Time-Varying Covariates for Patients with Heart Failure Using Deep Learning

Keijiro Nakamura, Xue Zhou, Naohiko Sahara, Yasutake Toyoda, Yoshinari Enomoto, Hidehiko Hara, Mahito Noro, Kaoru Sugi, Ming Huang, Masao Moroi, Masato Nakamura, Xin Zhu

Summary: This study developed and validated a deep learning-based prognostic model to predict the risk of all-cause mortality for patients with HF. The proposed model showed better prediction performance in terms of discrimination, calibration, and risk stratification compared to other deep learning and traditional statistical models, especially in identifying high-risk patients.

DIAGNOSTICS (2022)

Article Engineering, Biomedical

Automated Sleep Staging via Parallel Frequency-Cut Attention

Zheng Chen, Ziwei Yang, Lingwei Zhu, Wei Chen, Toshiyo Tamura, Naoaki Ono, Md Altaf-Ul-Amin, Shigehiko Kanaya, Ming Huang

Summary: In this paper, a novel framework is proposed for automated sleep staging based on sleep medicine guidance. The framework captures time-frequency characteristics of sleep EEG signals and utilizes a Transformer model with an attention-based module for staging decisions. The method achieves state-of-the-art results and demonstrates high inter-rater reliability, with important implications for healthcare and neuroscience research.

IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING (2023)

Proceedings Paper Engineering, Biomedical

Prediction of Potential Natural Antibiotics based on Jamu Formula Using Machine Learning Approach

Ahmad Kamal Nasution, Sony Hartono Wijaya, Ming Huang, Naoaki Ono, Shigehiko Kanaya, Md. Altaf-Ul-Amin

Summary: This research used machine learning methods to classify Jamu formulas and predict their effectiveness against bacterial diseases. It identified 111 potential antibiotic compounds for various systems.

2022 IEEE 22ND INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE 2022) (2022)

Proceedings Paper Computer Science, Hardware & Architecture

A Stochastic Coding Method of EEG Signals for Sleep Stage Classification

Guangxian Zhu, Huijia Wang, Yirong Kan, Zheng Chen, Ming Huang, Md. Amin, Naoaki Ono, Shigehiko Kanaya, Renyuan Zhang, Yasuhiko Nakashima

Summary: This paper presents an innovative non-deterministic coding method for EEG signals and achieves competitive results in sleep stage classification tasks.

2022 IEEE 35TH INTERNATIONAL SYSTEM-ON-CHIP CONFERENCE (IEEE SOCC 2022) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

An Approach to Construct and Validate TCM Dataset Effective against Bacterial Pneumonia

Pei Gao, Zheng Chen, Ming Huang, Naoaki Ono, Shigehiko Kanaya, Md Altaf-Ul-Amin

Summary: The study utilizes empirical data from Traditional Chinese Medicine to develop new antibiotics, screening out 2258 potential TCM formulae for treating bacterial pneumonia. Evaluated by the random forest algorithm, the matching labeling performs significantly better than clustering labeling by K-means.

2021 IEEE 3RD GLOBAL CONFERENCE ON LIFE SCIENCES AND TECHNOLOGIES (IEEE LIFETECH 2021) (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Prediction of Body Constitutions through Life-Style for Health Guidance

Guang Shi, Zhen Chen, Shigehiko Kanaya, Md Altaf-Ul-Amin, Naoaki Ono, Ming Huang

Summary: In this study, machine learning algorithms are used to predict the body constitutions (BCs) of traditional Chinese medical theory. By identifying the principal features (PFs) of life-style, biased BCs are transformed into gentle constitutions to provide health guidance. The prediction accuracy is improved by 29% and the number of identified PFs is reduced to 66.7% compared to previous works.

2021 IEEE 3RD GLOBAL CONFERENCE ON LIFE SCIENCES AND TECHNOLOGIES (IEEE LIFETECH 2021) (2021)