Article
Biochemical Research Methods
Jeffrey Molendijk, Rui Yip, Benjamin L. Parker
Summary: We have developed a database for underrepresented post-translational modifications (PTMs) to accelerate the discovery of enriched protein modifications in experimental data. The database provides curated lists of proteins reported to be substrates of underrepresented modifications. We demonstrated the utility of the database through the analysis of previously published data. Additionally, we developed an online tool that integrates upstream transcription factor enrichment analysis with downstream pathway analysis through an easy-to-use interactive interface.
JOURNAL OF PROTEOME RESEARCH
(2022)
Article
Biochemical Research Methods
Jolene Ramsey, Brenley McIntosh, Daniel Renfro, Suzanne A. Aleksander, Sandra LaBonte, Curtis Ross, Adrienne E. Zweifel, Nathan Liles, Shabnam Farrar, Jason J. Gill, Ivan Erill, Sarah Ades, Tanya Z. Berardini, Jennifer A. Bennett, Siobhan Brady, Robert Britton, Seth Carbon, Steven M. Caruso, Dave Clements, Ritu Dalia, Meredith Defelice, Erin L. Doyle, Iddo Friedberg, Susan M. R. Gurney, Lee Hughes, Allison Johnson, Jason M. Kowalski, Donghui Li, Ruth C. Lovering, Tamara L. Mans, Fiona McCarthy, Sean D. Moore, Rebecca Murphy, Timothy D. Paustian, Sarah Perdue, Celeste N. Peterson, Birgit M. Pruss, Margaret S. Saha, Robert R. Sheehy, John T. Tansey, Louise Temple, Alexander William Thorman, Saul Trevino, Amy Cheng Vollmer, Virginia Walbot, Joanne Willey, Deborah A. Siegele, James C. Hu
Summary: The primary scientific literature provides information on gene function in a human-readable format, while Gene Ontology annotations capture this information in a machine-readable format. Manual annotations based on evidence directly from scientific literature improve data accessibility and provide novel insights into evolution across different organisms. The Community Assessment of Community Annotation with Ontologies (CACAO) project involved undergraduates in annotating scientific literature, contributing unique entries to public resources and expanding data value for research scientists worldwide.
PLOS COMPUTATIONAL BIOLOGY
(2021)
Article
Biochemical Research Methods
Pieter Verschaffelt, Tim Van den Bossche, Wassim Gabriel, Michal Burdukiewicz, Alessio Soggiu, Lennart Martens, Bernhard Y. Renard, Henning Schiebenhoefer, Bart Mesuere
Summary: The study of microbiomes has become increasingly important, leading to the emergence of tools like MegaGO, which calculates functional similarity between data sets using semantic similarity between Gene Ontology terms. MegaGO is user-friendly, high-performing, and available as a web application or standalone command line tool.
JOURNAL OF PROTEOME RESEARCH
(2021)
Article
Genetics & Heredity
Saadullah H. Ahmed, Alexander T. Deng, Rachael P. Huntley, Nancy H. Campbell, Ruth C. Lovering
Summary: The project focuses on improving the understanding of heart valve development by providing GO annotations for key proteins involved. These annotations will benefit the interpretation of a wide range of cardiovascular datasets.
FRONTIERS IN GENETICS
(2023)
Article
Biology
Sarah Wooller, Aikaterini Anagnostopoulou, Benno Kuropka, Michael Crossley, Paul R. Benjamin, Frances Pearl, Ildiko Kemenes, Gyoergy Kemenes, Murat Eravci
Summary: Key technologies in biomedical research, such as qRT-PCR and LC-MS-based proteomics, have led to the generation of large biological datasets that can be used for biomarker identification and quantification. However, there is a lack of protein sequence information for certain invertebrates. In this study, a comprehensive central nervous system (CNS) proteomics database was designed and benchmarked for the identification of proteins from the CNS of Lymnaea stagnalis. This database provides a valuable tool for quantitative proteomics analysis of protein interactomes involved in various CNS functions.
JOURNAL OF EXPERIMENTAL BIOLOGY
(2022)
Article
Biochemical Research Methods
Weiliang Huang, Maureen A. Kane
Summary: Metaproteomics by mass spectrometry is a powerful method for analyzing proteins in complex samples, providing insights into the functional composition of microbiota. Human gastrointestinal microbiota plays important roles in human health, and metaproteomics reveals novel associations between microbiota and diseases. The MAPLE microbiome analysis pipeline offers a user-friendly solution for optimal proteome inference and comprehensive comparison of microbiota composition.
JOURNAL OF PROTEOME RESEARCH
(2021)
Article
Immunology
Philip C. Huang, Rohit Goru, Anthony Huffman, Asiyah Yu Lin, Michael Cooke, Yongqun He
Summary: The development of SARS-CoV-2 vaccines during the COVID-19 pandemic has generated a growing body of COVID-19 vaccine data. Cov19VaxKB is a knowledge-focused COVID-19 vaccine database aimed at supporting comprehensive annotation, integration, and analysis of COVID-19 vaccine information. It provides extensive lists of vaccines, vaccine formulations, clinical trials, publications, news articles, and vaccine adverse event case reports. Additionally, it includes vaccine design and statistical analysis tools for predicting vaccine targets and identifying enriched adverse events. The data integration and analytical features of Cov19VaxKB can facilitate vaccine research and development while serving as a useful reference for the public.
Article
Biotechnology & Applied Microbiology
Lei Chen, ZhanDong Li, ShiQi Zhang, Yu-Hang Zhang, Tao Huang, Yu-Dong Cai
Summary: Methylation is a common and important modification in biological systems, and recent studies have found that it is widely present in different RNA molecules. Computational prediction methods may serve as an alternative approach for detecting all methylation sites.
BIOMED RESEARCH INTERNATIONAL
(2022)
Article
Biology
Shijian Ding, Deling Wang, Xianchao Zhou, Lei Chen, Kaiyan Feng, Xianling Xu, Tao Huang, Zhandong Li, Yudong Cai
Summary: This study used multiple machine learning methods to analyze single-cell profiles of the heart and identify the best features and classifiers for different heart cell types. The results showed that the decision tree and random forest classification models achieved the highest weighted F1 scores. The selected features and classification rules played a crucial role in cardiac structure and function; in particular, certain long non-coding RNAs were found to be important for recognizing different cardiac cell types. These findings provide a solid academic foundation for the development of molecular diagnostics and biomarker discovery for cardiac diseases.
Article
Multidisciplinary Sciences
Zhan Dong Li, Xiangtian Yu, Zi Mei, Tao Zeng, Lei Chen, Xian Ling Xu, Hao Li, Tao Huang, Yu-Dong Cai
Summary: The mammary gland is an essential organ in mammals that produces milk for offspring. This study investigates the mechanisms underlying the differentiation of mammary progenitors into different cell subtypes using single-cell sequencing data. The findings identify specific gene features and rules that can classify epithelial cells into different subtypes and stages.
Article
Biotechnology & Applied Microbiology
FeiMing Huang, Lei Chen, Wei Guo, Tao Huang, Yu-dong Cai
Summary: The study constructs efficient classifiers based on single-cell RNA sequencing data, identifies essential gene biomarkers, and mines a series of classification rules that can distinguish different cell cycle phases. It provides a novel method for determining the cell cycle phase and identifying new potential cell cycle-related genes.
BIOMED RESEARCH INTERNATIONAL
(2022)
Article
Biotechnology & Applied Microbiology
Yu-Hang Zhang, ShiJian Ding, Lei Chen, Tao Huang, Yu-Dong Cai
Summary: This study developed a predictive model for subcellular localization by using protein-protein interaction networks, functional enrichment analysis, and proteins with confirmed localization. Various machine learning algorithms and feature selection methods were utilized to identify key features and understand their biological functions.
BIOMED RESEARCH INTERNATIONAL
(2022)
Article
Biotechnology & Applied Microbiology
FeiMing Huang, QingLan Ma, JingXin Ren, JiaRui Li, Fen Wang, Tao Huang, Yu-Dong Cai
Summary: Long-term cigarette smoking is associated with various human diseases. This study used advanced machine learning methods to identify specific isoforms and pathways that play important roles in distinguishing smokers from former smokers. The study evaluated multiple feature selection algorithms and used a decision tree approach to establish high-performance classification models. The identified isoforms and classification rules were validated against previous research. The results highlight the relevance of isoforms such as ENST00000464835, ENST00000622663, and ENST00000284311, as well as pathways related to smoking response.
BIOMED RESEARCH INTERNATIONAL
(2023)
Article
Biotechnology & Applied Microbiology
Jingxin Ren, XianChao Zhou, Wei Guo, KaiYan Feng, Tao Huang, Yu-Dong Cai
Summary: Sarcoma, a common type of solid tumor in children and adolescents, has multiple subtypes that are often difficult to diagnose early, resulting in severe consequences. This study aimed to find potential biomarkers at the DNA methylation level to distinguish different sarcoma subtypes. Machine learning and feature ranking methods were used to analyze sarcoma samples and construct classification models. The specific expression of genes related to highly correlated methylation sites was proven to be associated with sarcoma, and a decision tree algorithm helped to explain the differences between sarcoma types and to classify subtypes.
BIOMED RESEARCH INTERNATIONAL
(2022)
Article
Biochemistry & Molecular Biology
Xiaohong Li, Xianchao Zhou, Shijian Ding, Lei Chen, Kaiyan Feng, Hao Li, Tao Huang, Yu-Dong Cai
Summary: In this study, machine learning methods were used to identify biomarkers that can accurately classify COVID-19 in different disease states and severities. The findings provide a new point of reference for understanding the disease's etiology and facilitating precise therapy.
Article
Biology
Jingxin Ren, Wei Guo, Kaiyan Feng, Tao Huang, Yudong Cai
Summary: In this study, the blood expression profiles of miRNA were analyzed to identify potential markers for differentiating the severity of COVID-19. The researchers constructed a high-precision random forest (RF) model and extracted classification rules to quantify the role of miRNA expression in distinguishing COVID-19 patients with different severities.
Article
Biology
Jingxin Ren, Yuhang Zhang, Wei Guo, Kaiyan Feng, Ye Yuan, Tao Huang, Yu-Dong Cai
Summary: COVID-19 can cause impairment of smell and taste. This study used machine learning to analyze gene expression levels in COVID-19 patient samples and identify important biomarkers associated with this loss of sensory ability. The study suggests potential mechanisms for COVID-19 complications and provides biomarkers for predicting olfactory and gustatory impairment.
Article
Biology
Yaochen Xu, Qinglan Ma, Jingxin Ren, Lei Chen, Wei Guo, Kaiyan Feng, Zhenbing Zeng, Tao Huang, Yudong Cai
Summary: COVID-19 not only damages the respiratory system, but also puts strain on the cardiovascular system. This study analyzed the gene expression levels of vascular endothelial cells and cardiomyocytes in COVID-19 patients and healthy controls using a machine learning-based workflow. The findings suggest that COVID-19 affects the gene expression levels in cardiac cells, providing insights into the pathogenesis of COVID-19 and potential therapeutic targets.
Article
Biology
Qinglan Ma, FeiMing Huang, Wei Guo, KaiYan Feng, Tao Huang, Yudong Cai
Summary: Phase-separation proteins (PSPs) play a role in liquid-liquid phase separation and have implications for cellular biology and disease development. Identifying PSPs and their functions can provide valuable insights.
Article
Biology
Qing-Lan Ma, Fei-Ming Huang, Wei Guo, Kai-Yan Feng, Tao Huang, Yu-Dong Cai
Summary: Vaccines elicit an immune response involving B and T cells, with B cells producing antibodies. The immunity to SARS-CoV-2 diminishes over time after vaccination. This study aimed to identify important changes in antigen-reactive antibodies post-vaccination to enhance vaccine efficacy.
Article
Biology
Jing-Xin Ren, Qian Gao, Xiao-Chao Zhou, Lei Chen, Wei Guo, Kai-Yan Feng, Lin Lu, Tao Huang, Yu-Dong Cai
Summary: A machine-learning-based method was used to analyze the scRNA-seq data of B cells, T cells, and myeloid cells from patients with COVID-19. Key genes related to SARS-CoV-2 infection were identified. The study revealed the dynamic changes in the immune system of COVID-19 patients at different stages, providing valuable insights into the ongoing effect of COVID-19 development on the immune system.
Article
Biology
Yong Yang, Yuhang Zhang, Jingxin Ren, Kaiyan Feng, Zhandong Li, Tao Huang, Yudong Cai
Summary: This study analyzed single-cell RNA sequencing data from a normal colon to identify genetic markers of 25 immune cell types and reveal quantitative differences between them. Machine learning-based methods were used to analyze the importance of gene features and classify the most important genetic markers. The results provide a reference for exploring the cell composition of the colon cancer microenvironment and clinical immunotherapy.