Article

Imputing single-cell RNA-seq data by combining graph convolution and autoencoder neural networks

ISCIENCE (2021)

Journal

ISCIENCE

Volume 24, Issue 5

Publisher

CELL PRESS

DOI: 10.1016/j.isci.2021.102393

Category

Multidisciplinary Sciences

Funding

National Key R&D Program of China [2020YFB0204803]
National Natural Science Foundation of China [61772566]
Guangdong Key Field R&D Plan [2019B020228001, 2018B010109006]
Introducing Innovative and Entrepreneurial Teams [2016ZT06D211]
Guangzhou S&T Research Plan [202007030010]

Summary

Single-cell RNA sequencing technology enables analysis of single-cell transcriptomes with unprecedented throughput and resolution, but it suffers from the dropout problem. GraphSCI, a method based on graph convolution networks, outperforms other state-of-the-art imputation methods. It also accurately infers gene-to-gene relationships, which in turn assist imputation during training.

Abstract

Single-cell RNA sequencing (scRNA-seq) technology profiles single-cell transcriptomes at unprecedented throughput and resolution. However, because only a small amount of mRNA is sequenced in each cell, a portion of the mRNA molecules goes undetected. This is the dropout problem, which hinders various downstream analyses. Robust and effective imputation methods are therefore needed for the growing volume of scRNA-seq data. In this study, we developed an imputation method, GraphSCI, that imputes dropout events in scRNA-seq data using graph convolution networks. Extensive experiments demonstrated that GraphSCI outperforms other state-of-the-art imputation methods on both simulated and real scRNA-seq data. GraphSCI also accurately infers gene-to-gene relationships, and the inferred relationships in turn assist imputation dynamically during training, which is a key advantage of GraphSCI over other imputation algorithms.
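To make the idea of graph-based imputation concrete, here is a minimal sketch of one generic graph-convolution propagation step over a gene-to-gene graph. It is an illustration under stated assumptions, not GraphSCI's actual architecture: the function names (normalized_adjacency, impute_zeros), the toy matrices, and the choice to treat every zero as a dropout are all hypothetical, and the learned weights and autoencoder that the paper title mentions are omitted.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetrically normalize A + I, the propagation matrix of a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def impute_zeros(expr, gene_adj):
    """Fill zero entries of a genes x cells matrix with graph-smoothed values.

    Assumption for illustration: every zero is a dropout. Real methods
    distinguish dropouts from true zeros and learn the propagation weights.
    """
    smoothed = normalized_adjacency(gene_adj) @ expr
    return np.where(expr == 0, smoothed, expr)

# Toy data (hypothetical): 3 genes x 3 cells; genes 0 and 1 are linked, gene 2 is isolated.
expr = np.array([[4.0, 0.0, 2.0],
                 [3.0, 5.0, 0.0],
                 [0.0, 1.0, 1.0]])
gene_adj = np.array([[0.0, 1.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0]])

imputed = impute_zeros(expr, gene_adj)
print(imputed)
```

In this sketch, a zero in one gene is filled from the expression of its graph neighbors in the same cell, while observed values are left unchanged. An isolated gene has no neighbors to borrow from, so its zeros stay zero.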

Recommended

Article Biochemistry & Molecular Biology

Exploring the optimization of autoencoder design for imputing single-cell RNA sequencing data

Nan Miles Xi, Jingyi Jessica Li

Summary: In this study, we empirically examined the impact of neural network architecture, activation function, and regularization strategy on imputation accuracy in scRNA-seq data. Our results show that deeper and narrower autoencoders perform better, sigmoid and tanh activation functions outperform ReLU, and regularization improves imputation accuracy and downstream analyses.

COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL (2023)

Article Biochemical Research Methods

Deep structural clustering for single-cell RNA-seq data jointly through autoencoder and graph neural network

Yanglan Gan, Xingyu Huang, Guobing Zou, Shuigeng Zhou, Jihong Guan

Summary: Single-cell RNA sequencing is a critical technique for studying cell heterogeneity and diversity. However, clustering analysis of scRNA-seq data is challenging due to noise, high dimensionality, and dropout events. In this study, a new deep structural clustering method called scDSC is proposed, which incorporates structural information to improve clustering accuracy and scalability.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Multidisciplinary Sciences

scGNN is a novel graph neural network framework for single-cell RNA-Seq analyses

Juexin Wang, Anjun Ma, Yuzhou Chang, Jianting Gong, Yuexu Jiang, Ren Qi, Cankun Wang, Hongjun Fu, Qin Ma, Dong Xu

Summary: Single-cell RNA-Seq faces challenges such as sparsity in sequencing and complex patterns in gene expression. The introduction of a graph neural network based on a hypothesis-free deep learning framework provides an effective representation of gene expression and cell-cell relationships.

NATURE COMMUNICATIONS (2021)

Article Genetics & Heredity

Single-cell RNA-seq data analysis using graph autoencoders and graph attention networks

Xiang Feng, Fang Fang, Haixia Long, Rao Zeng, Yuhua Yao

Summary: The importance of gene imputation and cell clustering analysis of single-cell RNA sequencing (scRNA-seq) data has increased with the development of high-throughput sequencing technology. The scGAEGAT model, based on graph neural networks, demonstrated promising performance in gene imputation and cell clustering prediction on four scRNA-seq data sets.

FRONTIERS IN GENETICS (2022)

Article Biochemical Research Methods

Bubble: a fast single-cell RNA-seq imputation using an autoencoder constrained by bulk RNA-seq data

Siqi Chen, Xuhua Yan, Ruiqing Zheng, Min Li

Summary: Single-cell RNA sequencing technology (scRNA-seq) has the drawback of large sparsity, which leads to dropout events and affects downstream analyses. To address this, we propose Bubble, which identifies and imputes dropout events using expression rate and coefficient of variation, and leverages bulk RNA-seq data as a constraint. Bubble improves recovery of missing values, correlations, and reduces false positive signals. It enhances differential expression analysis, clustering, visualization, and aids cellular trajectory inference. Moreover, Bubble provides fast and scalable imputation with minimal memory usage.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biochemical Research Methods

Bubble: a fast single-cell RNA-seq imputation using an autoencoder constrained by bulk RNA-seq data

Siqi Chen, Xuhua Yan, Ruiqing Zheng, Min Li

Summary: Bubble is a method for identifying and imputing 'dropout events' in scRNA-seq data, using gene expression rate and coefficient of variation to identify zeros, and then utilizing an autoencoder for imputation. Bubble enhances the recovery of missing values, reduces the introduction of false positive signals, and improves the identification of differentially expressed genes and cell clustering and visualization.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Computer Science, Artificial Intelligence

Temporal network embedding using graph attention network

Anuraj Mohan, K. Pramod

Summary: The Temporal Graph Attention Network (TempGAN) aims to learn representations from continuous-time temporal networks by preserving the temporal proximity between nodes. Generating a Positive Pointwise Mutual Information matrix (PPMI) through temporal walks on the network, TempGAN architecture uses both adjacency and PPMI information to generate node embeddings. Link prediction experiments using TempGAN autoencoder are conducted to evaluate the quality of the embeddings generated and compare them with other state-of-the-art methods.

COMPLEX & INTELLIGENT SYSTEMS (2022)

Article Computer Science, Artificial Intelligence

TraverseNet: Unifying Space and Time in Message Passing for Traffic Forecasting

Zonghan Wu, Da Zheng, Shirui Pan, Quan Gan, Guodong Long, George Karypis

Summary: This article introduces a novel spatial-temporal graph neural network called TraverseNet for capturing the spatial-temporal dependencies in traffic data. Compared to other spatial-temporal neural networks, TraverseNet views space and time as an inseparable whole and utilizes message traverse mechanisms to explore the dependencies in the spatial-temporal graph.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2022)

Article Biochemical Research Methods

DSAE-Impute: Learning Discriminative Stacked Autoencoders for Imputing Single-cell RNA-seq Data

Shengfeng Gan, Huan Deng, Yang Qiu, Mohammed Alshahrani, Shichao Liu

Summary: This research proposes an accurate deep learning method called DSAE-Impute to impute the missing values in scRNA-seq data. The method employs stacked autoencoders and discriminative cell similarity to capture global expression features and achieve accurate imputation. Experimental results demonstrate its superiority in downstream analysis.

CURRENT BIOINFORMATICS (2022)

Article Biochemical Research Methods

Single-cell RNA-seq data analysis based on directed graph neural network

Xiang Feng, Hongqi Zhang, Hao Lin, Haixia Long

Summary: In this study, a directed graph neural network called scDGAE was developed for scRNA-seq analysis, using graph autoencoders and graph attention network. The experiment results showed that the scDGAE model achieved promising performance in gene imputation and cell clustering prediction, and it can be applied to general scRNA-Seq analyses.

METHODS (2023)

Article Biochemical Research Methods

SCDD: a novel single-cell RNA-seq imputation method with diffusion and denoising

Jian Liu, Yichen Pan, Zhihan Ruan, Jun Guo

Summary: In this paper, we propose a novel two-stage diffusion-denoising method called SCDD for large-scale single-cell RNA-seq imputation. The method effectively suppresses the over-smooth problem and remarkably improves the downstream analysis of single-cell RNA-seq, including clustering and trajectory analysis.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Computer Science, Artificial Intelligence

A2AE: Towards adaptive multi-view graph representation learning via all-to-all graph autoencoder architecture

Dengdi Sun, Dashuang Li, Zhuanlian Ding, Xingyi Zhang, Jin Tang

Summary: This paper proposes a novel all-to-all graph autoencoder model, named A2AE, for multi-view graph representation learning. It utilizes the rich relational information in multiple views and recognizes the importance of different views.

APPLIED SOFT COMPUTING (2022)

Article Biochemical Research Methods

ScCAEs: deep clustering of single-cell RNA-seq via convolutional autoencoder embedding and soft K-means

Hang Hu, Zhong Li, Xiangjie Li, Minzhe Yu, Xiutao Pan

Summary: This study proposes a novel deep embedding clustering method for single-cell RNA-seq data, which integrates deep learning and convolutional autoencoder for feature representation and utilizes a regularized soft K-means algorithm for clustering. Experimental results demonstrate that this method outperforms other approaches in various datasets and exhibits good compatibility and robustness.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Computer Science, Artificial Intelligence

Dual-decoder graph autoencoder for unsupervised graph representation learning

Dengdi Sun, Dashuang Li, Zhuanlian Ding, Xingyi Zhang, Jin Tang

Summary: The study introduces a dual-decoder graph autoencoder model that effectively embeds the topological structure and node attributes of a graph into a compact representation, showcasing superior performance in experiments.

KNOWLEDGE-BASED SYSTEMS (2021)

Article Computer Science, Artificial Intelligence

GLASS: A Graph Laplacian Autoencoder with Subspace Clustering Regularization for Graph Clustering

Dengdi Sun, Liang Liu, Bin Luo, Zhuanlian Ding

Summary: This paper proposes a novel graph Laplacian autoencoder with subspace clustering regularization for graph clustering (GLASS). The method overcomes the entanglement between convolutional filters and weight matrices in GCN encoders by using Laplacian smoothing filters and MLPs. The GLASS approach improves the feature propagation capability and clustering performance through residual connections and subspace clustering regularization. Experimental results demonstrate the effectiveness of GLASS and its advantages over GCN encoders in graph clustering and image clustering.

COGNITIVE COMPUTATION (2023)

Article Biochemical Research Methods

AlphaFold2-aware protein-DNA binding site prediction using graph transformer

Qianmu Yuan, Sheng Chen, Jiahua Rao, Shuangjia Zheng, Huiying Zhao, Yuedong Yang

Summary: In this study, a precise predictor called GraphSite based on AlphaFold2 is proposed for identifying DNA-binding residues from protein structural models. By employing a graph transformer and leveraging predicted protein structures, GraphSite significantly improves the accuracy of DNA binding site prediction.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Virology

Mendelian randomization suggests a potential causal effect of eosinophil count on influenza vaccination responsiveness

Hongwei Chen, Haoyang Zhang, Simin Wen, Xuehao Xiu, Danming You, Huiying Zhao, Dayan Wang, Yuedong Yang, Yuelong Shu

Summary: Currently, there is a lack of systematic exploration on the clinical factors influencing immune responses to influenza vaccines. The mechanism of low responsiveness to influenza vaccination (LRIV) is complex and not well understood. In this study, we combined our in-house genome-wide association studies (GWAS) analysis of LRIV with the GWAS summary of 10 blood-based biomarkers to investigate the genetics shared between LRIV and blood-based biomarkers using Mendelian randomization (MR). The results suggest a potential causal relationship between genetically instrumented LRIV and decreased eosinophil count.

JOURNAL OF MEDICAL VIROLOGY (2023)

Editorial Material Computer Science, Artificial Intelligence

Integrating supercomputing and artificial intelligence for life science

Jiahua Rao, Shuangjia Zheng, Yuedong Yang

PATTERNS (2022)

Article Biochemical Research Methods

Identifying spatial domain by adapting transcriptomics with histology through contrastive learning

Yuansong Zeng, Rui Yin, Mai Luo, Jianing Chen, Zixiang Pan, Yutong Lu, Weijiang Yu, Yuedong Yang

Summary: Recent advances in spatial transcriptomics have allowed for gene expression measurement at cell/spot resolution, while retaining spatial information and histology images of the tissues. Accurately identifying the spatial domains of spots is crucial for downstream tasks in spatial transcriptomics analysis. In this study, a novel method called ConGI is proposed, which utilizes contrastive learning to accurately exploit spatial domains by combining gene expression with histopathological images. The method outperforms existing methods and the learned representations are useful for various downstream tasks.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biochemical Research Methods

Fast and accurate protein intrinsic disorder prediction by using a pretrained language model

Yidong Song, Qianmu Yuan, Sheng Chen, Ken Chen, Yaoqi Zhou, Yuedong Yang

Summary: Determining intrinsically disordered regions of proteins is crucial for understanding protein biological functions and associated diseases. This study proposes a fast and accurate protein disorder predictor, LMDisorder, which utilizes embedding generated by unsupervised pretrained language models as features. LMDisorder outperforms other single-sequence-based methods and compares favorably to another language-model-based technique in independent test sets. Additionally, LMDisorder shows equivalent or better performance than the state-of-the-art profile-based technique SPOT-Disorder2. The high computation efficiency of LMDisorder allows for proteome-scale analysis, revealing associations between proteins with high predicted disorder content and specific biological functions. The datasets, source codes, and trained model are available at https://github.com/biomed-AI/LMDisorder.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biochemical Research Methods

Fast and accurate protein function prediction from sequence through pretrained language model and homology-based label diffusion

Qianmu Yuan, Junjie Xie, Jiancong Xie, Huiying Zhao, Yuedong Yang

Summary: Protein function prediction is crucial in bioinformatics and has implications for disease mechanism elucidation and drug target discovery. However, accurately predicting protein functions solely from sequences remains challenging. This study introduces SPROF-GO, a sequence-based alignment-free predictor that utilizes a pretrained language model to extract informative sequence embeddings and implements self-attention pooling to focus on important residues. SPROF-GO outperforms state-of-the-art approaches in precision-recall curves and demonstrates generalization capabilities.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biochemical Research Methods

A Drug Combination Prediction Framework Based on Graph Convolutional Network and Heterogeneous Information

Hegang Chen, Yuyin Lu, Yuedong Yang, Yanghui Rao

Summary: Combination therapy plays an important role in treating complex diseases, but the large number of possible combinations limits our ability to identify effective ones. This study introduces a new computational pipeline, DCMGCN, which integrates diverse drug-related information to predict novel drug combinations. The tests show that DCMGCN outperforms existing methods and may help to clarify the understanding of drug mechanisms.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2023)

Article Engineering, Biomedical

ShockSurv: A machine learning model to accurately predict 28-day mortality for septic shock patients in the intensive care unit

Fudan Zheng, Luhao Wang, Yuxian Pang, Zhiguang Chen, Yutong Lu, Yuedong Yang, Jianfeng Wu

Summary: Septic shock has become the leading cause of morbidity and mortality in the ICU. However, currently there is no model to predict the mortality of septic shock patients. We aim to develop such a model.

BIOMEDICAL SIGNAL PROCESSING AND CONTROL (2023)

Article Neurosciences

Inferring the genetic relationship between brain imaging-derived phenotypes and risk of complex diseases by Mendelian randomization and genome-wide colocalization

Siying Lin, Haoyang Zhang, Mengling Qi, David N. Cooper, Yuedong Yang, Yuanhao Yang, Huiying Zhao

Summary: Observational studies consistently show that brain imaging-derived phenotypes (IDPs) are critical markers for the early diagnosis of brain disorders and cardiovascular diseases. However, the shared genetic landscape between brain IDPs and the risk of these diseases remains unclear, limiting the application of potential diagnostic techniques using brain IDPs.

NEUROIMAGE (2023)

Article Biochemistry & Molecular Biology

Characterizing RNA-binding ligands on structures, chemical information, binding affinity and drug-likeness

Cong Fan, Xin Wang, Tianze Ling, Yuedong Yang, Huiying Zhao

Summary: Recent studies suggest that RNAs have potential as drug targets, but progress in detecting RNA-ligand interactions is limited. To guide the discovery of RNA-binding ligands, it is necessary to comprehensively characterize them in terms of binding specificity, binding affinity, and drug-like properties. We established the RNALID database, which contains 358 validated RNA-ligand interactions. Comparisons with other databases show that the majority of ligands in RNALID are novel, and the analysis of ligand structure, binding affinity, and cheminformatic parameters reveals insights into the characteristics of different ligand types. Additionally, comparing RNALID ligands to FDA-approved drugs and ligands without bioactivity sheds light on their differences in chemical properties and drug-likeness.

RNA BIOLOGY (2023)

Article Biochemical Research Methods

EVlncRNA-Dpred: improved prediction of experimentally validated lncRNAs by deep learning

Bailing Zhou, Maolin Ding, Jing Feng, Baohua Ji, Pingping Huang, Junye Zhang, Xue Yu, Zanxia Cao, Yuedong Yang, Yaoqi Zhou, Jihua Wang

Summary: Long non-coding RNAs (lncRNAs) are important in biological processes and disease. Algorithms have been developed to distinguish lncRNAs from mRNAs, resulting in the discovery of over 600,000 lncRNAs. However, only a small fraction of these have been validated through low-throughput experiments. To prioritize potentially functional lncRNAs and overcome the challenge of small datasets, deep learning algorithms were employed in this study.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Computer Science, Artificial Intelligence

Subgraph-Aware Few-Shot Inductive Link Prediction Via Meta-Learning

Shuangjia Zheng, Sijie Mai, Ya Sun, Haifeng Hu, Yuedong Yang

Summary: Link prediction for knowledge graphs aims to predict missing connections between entities. Prevailing methods are limited to a transductive setting and hard to process unseen entities. The recently proposed subgraph-based models provide alternatives to predict links from the subgraph structure surrounding a candidate triplet.

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2023)

Article Biochemical Research Methods

Identifying B-cell epitopes using AlphaFold2 predicted structures and pretrained language model

Yuansong Zeng, Zhuoyi Wei, Qianmu Yuan, Sheng Chen, Weijiang Yu, Yutong Lu, Jianzhao Gao, Yuedong Yang

Summary: Drawing on the breakthrough of AlphaFold2 in protein structure prediction, we propose a novel graph-based model, GraphBepi, for accurate B-cell epitope prediction. By utilizing the predicted structure from AlphaFold2, GraphBepi constructs the protein graph and captures both sequence and spatial information through edge-enhanced deep graph neural networks (EGNN) and bidirectional long short-term memory neural networks (BiLSTM). The combined representations are input into a multilayer perceptron to predict B-cell epitopes. Comprehensive tests demonstrate that GraphBepi outperforms state-of-the-art methods in terms of AUC and AUPR.

BIOINFORMATICS (2023)

Article Biochemistry & Molecular Biology

Biological informed graph neural network for tumor mutation burden prediction and immunotherapy-related pathway analysis in gastric cancer

Chuwei Liu, Arabella H. Wan, Heng Liang, Lei Sun, Jiarui Li, Ranran Yang, Qinghai Li, Ruibo Wu, Kunhua Hu, Yuedong Yang, Shirong Cai, Guohui Wan, Weiling He

Summary: Tumor mutation burden (TMB) is an important biomarker for assessing the efficacy of cancer immunotherapy, but its correlation with immune checkpoint inhibitors (ICIs) responsiveness varies among different cancer types. This study explores the relationship between TMB and multi-omics data in various cancer types and develops the PGLCN model to improve the interpretability and prediction accuracy of TMB. By integrating multi-omics data, the PGLCN model outperforms traditional machine learning methods in predicting TMB status and identifies potential combined biomarkers for TMB in gastric cancer.

COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL (2023)

Article Radiology, Nuclear Medicine & Medical Imaging

Machine learning on MRI radiomic features: identification of molecular subtype alteration in breast cancer after neoadjuvant therapy

Hai-Qing Liu, Si-Ying Lin, Yi-Dong Song, Si-Yao Mai, Yue-Dong Yang, Kai Chen, Zhuo Wu, Hui-Ying Zhao

Summary: This study developed a machine learning model based on MRI to predict molecular subtype alterations in breast cancer after neoadjuvant therapy. The model showed favorable predictive efficacy in identifying molecular subtype alteration and could be a useful tool in clinical practice.

EUROPEAN RADIOLOGY (2023)
