Single-cell RNA-seq interpretations using evolutionary multiobjective ensemble pruning

BIOINFORMATICS (2019)

Journal

BIOINFORMATICS

Volume 35, Issue 16, Pages 2809-2817

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/bioinformatics/bty1056

Funding

Research Grants Council of the Hong Kong Special Administrative Region [CityU 21200816, CityU 11203217, CityU 11200218]
National Natural Science Foundation of China [61603087]
Natural Science Foundation of Jilin Province [20190103006JH]
Fundamental Research Funds for the Central Universities [2412017FZ026]

Abstract

Motivation: In recent years, single-cell RNA sequencing has enabled the discovery of cell types and even sub-types, and its increasing availability provides opportunities to identify cell populations from single-cell RNA-seq data. Computational methods have been employed to reveal gene expression variations among multiple cell populations. Unfortunately, the existing methods can suffer from practical limitations such as experimental noise, numerical instability, high dimensionality and poor computational scalability.

Results: We propose an evolutionary multiobjective ensemble pruning algorithm (EMEP) that addresses those limitations. EMEP first applies unsupervised dimensionality reduction to project the data from the original high-dimensional space into low-dimensional subspaces; basic clustering algorithms are then applied in those subspaces to generate different clustering results, which form cluster ensembles. However, most of those cluster ensembles are unnecessarily bulky, at the expense of extra time and memory. To overcome that problem, EMEP is designed to dynamically select suitable clustering results from the ensembles. To guide the multiobjective ensemble evolution, three cluster validity indices (the overall cluster deviation, the within-cluster compactness and the number of basic partition clusters) are formulated as the objective functions, so that evolutionary multiobjective optimization improves cell type discovery performance. We applied EMEP to 55 simulated datasets and seven real single-cell RNA-seq datasets, including six single-cell RNA-seq datasets and one large-scale dataset with 3005 cells and 4412 genes. Two case studies were also conducted to reveal mechanistic insights into the biological relevance of EMEP. We found that EMEP can achieve superior performance over the other clustering algorithms, demonstrating that EMEP can identify cell populations clearly.
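The pruning idea in the abstract can be illustrated with a small Pareto filter. This is only a sketch under stated assumptions, not the authors' implementation: the abstract names the three objectives but does not define them, so the formulas in `objectives` are stand-ins (summed distance to the cluster centroid for deviation, worst distance to the centroid for compactness). The evolutionary search over subsets of the ensemble is also replaced here by a plain non-dominated filter over individual base partitions. All function names are hypothetical.

```python
from math import dist
from statistics import mean


def centroids(points, labels):
    """Return {cluster label: centroid} for one base partition."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return {lab: tuple(mean(axis) for axis in zip(*members))
            for lab, members in groups.items()}


def objectives(points, labels):
    """Three objectives to minimise for one base partition (assumed forms)."""
    cents = centroids(points, labels)
    d = [dist(p, cents[lab]) for p, lab in zip(points, labels)]
    deviation = sum(d)    # assumed: summed distance of cells to their centroid
    compactness = max(d)  # assumed: worst distance of any cell to its centroid
    return (deviation, compactness, len(cents))


def dominates(a, b):
    """True if a is no worse than b on every objective and better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def prune(points, ensemble):
    """Return indices of base partitions that no other partition dominates."""
    scores = [objectives(points, labels) for labels in ensemble]
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s)
                       for j, t in enumerate(scores) if j != i)]


# Toy data: four "cells" in a 2-D subspace, forming two obvious groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
ensemble = [
    [0, 0, 1, 1],  # matches the two groups
    [0, 1, 0, 1],  # splits across the groups
    [0, 1, 2, 3],  # every cell in its own cluster
    [0, 0, 0, 0],  # one cluster for everything
]
print(prune(points, ensemble))  # prints [0, 2, 3]
```

The second partition is dropped because the first is at least as good on all three objectives. The singleton and single-cluster partitions survive because each is best on one objective, which shows why a Pareto front usually still needs a final selection step.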

scESI: evolutionary sparse imputation for single-cell transcriptomes from nearest neighbor cells

Qiaoming Liu, Ximei Luo, Jie Li, Guohua Wang

Summary: In this study, the authors propose an evolutionary sparse imputation (ESI) algorithm for single-cell transcriptomes to address the dropout problem and reduce data noise in gene expression profiles. The ESI algorithm constructs a sparse representation model based on gene regulation relationships between cells and uses an optimization framework based on nondominated sorting genetics to iteratively search for the global optimal solution. The results show that scESI outperforms benchmark methods in simulated datasets and real scRNA-seq datasets, improving cell classification, trajectory reconstruction, and identification of differentially expressed genes.

BRIEFINGS IN BIOINFORMATICS (2022)