☆ 4.6 Article

Detection of Patient Subgroups with Differential Expression in Omics Data: A Comprehensive Comparison of Univariate Measures

PLOS ONE (2013)

Journal

PLOS ONE

Volume 8, Issue 11, Pages -

Publisher

PUBLIC LIBRARY SCIENCE

DOI: 10.1371/journal.pone.0079380

Keywords

Funding

European Regional Development Fond (ERDF) of the European Union
Ministerium fur Innovation, Wissenschaft und Forschung des Landes Nordrhein-Westfalen (ParkCHIP) [FZ 280381102]
P.U.R.E. (Protein Unit for Research in Europe), a project of Nordrhein-Westfalen, a federal state of Germany
Collaborative Research Center [SFB 876]

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Abstract

Detection of yet unknown subgroups showing differential gene or protein expression is a frequent goal in the analysis of modern molecular data. Applications range from cancer biology over developmental biology to toxicology. Often a control and an experimental group are compared, and subgroups can be characterized by differential expression for only a subgroup-specific set of genes or proteins. Finding such genes and corresponding patient subgroups can help in understanding pathological pathways, diagnosis and defining drug targets. The size of the subgroup and the type of differential expression determine the optimal strategy for subgroup identification. To date, commonly used software packages hardly provide statistical tests and methods for the detection of such subgroups. Different univariate methods for subgroup detection are characterized and compared, both on simulated and on real data. We present an advanced design for simulation studies: Data is simulated under different distributional assumptions for the expression of the subgroup, and performance results are compared against theoretical upper bounds. For each distribution, different degrees of deviation from the majority of observations are considered for the subgroup. We evaluate classical approaches as well as various new suggestions in the context of omics data, including outlier sum, PADGE, and kurtosis. We also propose the new FisherSum score. ROC curve analysis and AUC values are used to quantify the ability of the methods to distinguish between genes or proteins with and without certain subgroup patterns. In general, FisherSum for small subgroups and t-test for large subgroups achieve best results. We apply each method to a case-control study on Parkinson's disease and underline the biological benefit of the new method.

Detection of Patient Subgroups with Differential Expression in Omics Data: A Comprehensive Comparison of Univariate Measures

Journal

PLOS ONE

Publisher

PUBLIC LIBRARY SCIENCE

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Detection of Patient Subgroups with Differential Expression in Omics Data: A Comprehensive Comparison of Univariate Measures

Journal

PLOS ONE

Publisher

PUBLIC LIBRARY SCIENCE

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper