4.7 Article

iRice-MS: An integrated XGBoost model for detecting multitype post-translational modification sites in rice

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 23, Issue 1, Pages -

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbab486

Keywords

PTM; XGBoost; feature integration; rice; web-server

Funding

  1. National Natural Science Foundation of China [61772119, 81872957]
  2. Sichuan Provincial Science Fund for Distinguished Young Scholars [20JCQN0262]


In this study, a comprehensive method called iRice-MS, based on eXtreme Gradient Boosting (XGBoost), was developed to identify multiple post-translational modifications (PTMs) in rice. The method displayed excellent performance in cross-validation and independent dataset tests, and outperformed existing tools in terms of AUC.
Post-translational modification (PTM) refers to the covalent and enzymatic modification of proteins after protein biosynthesis, which orchestrates a variety of biological processes. Detecting PTM sites at the proteome scale is a key step toward an in-depth understanding of their regulatory mechanisms. In this study, we present an integrated method based on eXtreme Gradient Boosting (XGBoost), called iRice-MS, to identify 2-hydroxyisobutyrylation, crotonylation, malonylation, ubiquitination, succinylation and acetylation sites in rice. For each PTM-specific model, we adopted eight feature encoding schemes, including sequence-based features, physicochemical property-based features and spatial mapping information-based features. The optimal feature set was identified from each encoding, and the corresponding models were established. Extensive experimental results show that iRice-MS consistently displays excellent performance in 5-fold cross-validation and independent dataset tests. In addition, our approach outperforms existing tools in terms of AUC. Based on the proposed model, a web server named iRice-MS was established and is freely accessible at http://lingroup.cn/server/iRice-MS.
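
To make the encode-then-evaluate idea concrete, here is a minimal sketch under stated assumptions. The abstract does not name the eight encodings, so amino acid composition (AAC) stands in as one plausible sequence-based scheme. The function names `aac_encode` and `auc`, the example windows, and the scores are all hypothetical and are not taken from the paper. In a fuller pipeline, the encoded vectors would be passed to a gradient-boosting classifier such as XGBoost and scored by AUC on held-out folds.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac_encode(window):
    """Amino acid composition of a peptide window (illustrative encoding).

    Returns 20 fractions, one per standard residue. Characters outside the
    20-letter alphabet (for example a padding symbol) are ignored.
    """
    residues = [r for r in window.upper() if r in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(residues)
    return [counts[a] / len(residues) for a in AMINO_ACIDS]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical usage: a made-up window centred on a lysine (K), and made-up scores.
features = aac_encode("AAKKX")
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

The pairwise AUC is quadratic in the number of samples, which is fine for a small illustration; a rank-based formula or a library routine would be the usual choice for large datasets.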


Recommended

Article Computer Science, Information Systems

Towards a better prediction of subcellular location of long non-coding RNA

Zhao-Yue Zhang, Zi-Jie Sun, Yu-He Yang, Hao Lin

Summary: This study presents a support vector machine-based approach that incorporates mutual information algorithm and incremental feature selection strategy to improve the prediction performance of lncRNA subcellular localization.

FRONTIERS OF COMPUTER SCIENCE (2022)

Article Biochemical Research Methods

PSnoD: identifying potential snoRNA-disease associations based on bounded nuclear norm regularization

Zijie Sun, Qinlai Huang, Yuhe Yang, Shihao Li, Hao Lv, Yang Zhang, Hao Lin, Lin Ning

Summary: Many studies have shown the important roles of small nucleolar RNAs (snoRNAs) in the development of complex human diseases. However, traditional experimental approaches for uncovering associations between snoRNAs and diseases are costly and time-consuming. This study proposed a method called PSnoD, which achieved superior performance and computational efficiency in predicting snoRNA-disease associations.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Biochemistry & Molecular Biology

Transcriptomics and Genomics Analysis Uncover the Differentially Expressed Chlorophyll and Carotenoid-Related Genes in Celery

Xiaoming Song, Nan Li, Yingchao Zhang, Yi Liang, Rong Zhou, Tong Yu, Shaoqin Shen, Shuyan Feng, Yu Zhang, Xiuqing Li, Hao Lin, Xiyin Wang

Summary: Through transcriptomics and genomics analysis, this study identified and analyzed the genes related to chlorophyll and carotenoid in celery and other species. The study found that transcription factors play a role in regulating the expression of these genes. Expansion of carotenoid-related genes was observed in celery, while no notable expansion was found in chlorophyll biosynthesis genes.

INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES (2022)

Article Biochemical Research Methods

iLoc-miRNA: extracellular/intracellular miRNA prediction using deep BiLSTM with attention mechanism

Zhao-Yue Zhang, Lin Ning, Xiucai Ye, Yu-He Yang, Yasunori Futamura, Tetsuya Sakurai, Hao Lin

Summary: The location of miRNAs in cells plays a crucial role in their regulatory function. Current prediction algorithms for miRNA subcellular localization have limitations. In this study, a new data partitioning strategy and deep learning algorithm were proposed to accurately predict miRNA subcellular localization and explore the underlying mechanisms through motif analysis. Additionally, a user-friendly web server was established for convenient use.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Biology

Analysis and modeling of myopia-related factors based on questionnaire survey

Jianqiang Xiao, Mujiexin Liu, Qinlai Huang, Zijie Sun, Lin Ning, Junguo Duan, Siquan Zhu, Jian Huang, Hao Lin, Hui Yang

Summary: This study investigated the relationship between adolescent myopia and environmental factors, habits, parental vision and demographic factors by analyzing questionnaire data. Machine learning algorithms were used to classify the samples. The age variable and parental myopia status were found to be important risk factors, while measures taken by children and the distance between books and eyes during reading were identified as protective factors.

COMPUTERS IN BIOLOGY AND MEDICINE (2022)

Article Biochemistry & Molecular Biology

A Statistical Analysis of the Sequence and Structure of Thermophilic and Non-Thermophilic Proteins

Zahoor Ahmed, Hasan Zulfiqar, Lixia Tang, Hao Lin

Summary: Through statistical analysis of pairs of thermophilic proteins and their non-thermophilic orthologs, the study found that polar amino acids, short bond length, wide DHA angle, and aromatic amino acids play important roles in the thermostability of proteins.

INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES (2022)

Review Biochemical Research Methods

A comprehensive review of bioinformatics tools for chromatin loop calling

Li Liu, Kaiyuan Han, Huimin Sun, Lu Han, Dong Gao, Qilemuge Xi, Lirong Zhang, Hao Lin

Summary: This review provides an overview of loop-calling tools for various 3C-based techniques. It categorizes and summarizes these tools, discusses background biases and denoising algorithms, and helps researchers select the most appropriate method for loop calling and downstream analysis. It is also useful for bioinformatics scientists aiming to develop new loop-calling algorithms.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biochemical Research Methods

Single-cell RNA-seq data analysis based on directed graph neural network

Xiang Feng, Hongqi Zhang, Hao Lin, Haixia Long

Summary: In this study, a directed graph neural network called scDGAE was developed for scRNA-seq analysis, using graph autoencoders and a graph attention network. The experimental results showed that the scDGAE model achieved promising performance in gene imputation and cell clustering prediction, and that it can be applied to general scRNA-seq analyses.

METHODS (2023)

Article Chemistry, Medicinal

iPADD: A Computational Tool for Predicting Potential Antidiabetic Drugs Using Machine Learning Algorithms

Xiao-Wei Liu, Tian-Yu Shi, Dong Gao, Cai-Yi Ma, Hao Lin, Dan Yan, Ke-Jun Deng

Summary: Diabetes mellitus is a chronic metabolic disease that disrupts blood glucose homeostasis and leads to severe complications. The development of artificial intelligence has provided a powerful tool, iPADD, for accelerating the discovery of potential antidiabetic drugs. iPADD achieved high accuracy in drug prediction by using molecular fingerprints and machine learning algorithms.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2023)

Article Medicine, General & Internal

A First Computational Frame for Recognizing Heparin-Binding Protein

Wen Zhu, Shi-Shi Yuan, Jian Li, Cheng-Bing Huang, Hao Lin, Bo Liao

Summary: This study provides the first recognition framework for accurately identifying HBP based on machine learning. By using four sequence descriptors, HBP and non-HBP samples were represented by discrete numbers and input into SVM and RF algorithms for comparison. The SVM-based classifier was found to have the greatest potential for identifying HBP.

DIAGNOSTICS (2023)

Review Biochemistry & Molecular Biology

Empirical comparison and recent advances of computational prediction of hormone binding proteins using machine learning methods

Hasan Zulfiqar, Zhiling Guo, Bakanina Kissanga Grace-Mercure, Zhao-Yue Zhang, Hui Gao, Hao Lin, Yun Wu

Summary: Hormone binding proteins (HBPs) belong to soluble carrier proteins that interact selectively and non-covalently with hormones, promoting growth hormone signaling in humans and other animals. The identification of HBPs is crucial for understanding these proteins and their applications in medical and commercial fields. Computational prediction methods, using sequence information and machine learning algorithms, have played a significant role in recognizing HBPs, offering a time-saving and cost-effective alternative to experimental methods.

COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL (2023)

Article Biodiversity Conservation

Urbanisation drives inter- and intraspecific variation in flight-related morphological traits of aquatic insects at different landscape scales

Wenfei Liao, Hao Lin

Summary: Urbanisation has complex effects on the morphological traits of aquatic insect species, with different species exhibiting different strategies and abilities to cope with movement barriers caused by urbanisation.

INSECT CONSERVATION AND DIVERSITY (2023)

Article Biology

MetaboliteCOVID: A manually curated database of metabolite markers for COVID-19

Liping Ren, Lin Ning, Yu Yang, Ting Yang, Xinyu Li, Shanshan Tan, Peixin Ge, Shun Li, Nanchao Luo, Pei Tao, Yang Zhang

Summary: Researchers have developed a manually curated database of metabolite markers related to COVID-19, which includes significantly altered metabolites associated with early diagnosis, disease severity, prognosis, and drug response. This database facilitates both basic and clinical research on COVID-19.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Article Biology

Identification of Key DNA methylation sites related to differentially expressed genes in Lung squamous cell carcinoma

Jie Gao, Yongxian Feng, Yan Yang, Yuetong Shi, Junjie Liu, Hao Lin, Lirong Zhang

Summary: This study systematically identified and analyzed key CpG sites closely related to differential expression of genes in LUSC through a two-step correlation analysis method, and found that these sites and genes can serve as effective biomarkers for LUSC.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)
