☆ 4.7 Article

Hierarchical Clustering of High-Throughput Expression Data Based on General Dependences

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2013)

Journal

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS

Volume 10, Issue 4, Pages 1080-1085

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TCBB.2013.99

Keywords

Algorithms; clustering; similarity measures

Funding

NIH [P20 HL113451, P01-ES016731, P30-AI50409, UL1-RR025008]

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Abstract

High-throughput expression technologies, including gene expression array and liquid chromatography-mass spectrometry (LC-MS) and so on, measure thousands of features, i.e., genes or metabolites, on a continuous scale. In such data, both linear and nonlinear relations exist between features. Nonlinear relations can reflect critical regulation patterns in the biological system. However, they are not identified and utilized by traditional clustering methods based on linear associations. Clustering based on general dependences, i.e., both linear and nonlinear relations, is hampered by the high dimensionality and high noise level of the data. We developed a sensitive nonparametric measure of general dependence between (groups of) random variables in high dimensions. Based on this dependence measure, we developed a hierarchical clustering method. In simulation studies, the method outperformed correlation-and mutual information (MI)-based hierarchical clustering methods in clustering features with nonlinear dependences. We applied the method to a microarray data set measuring the gene expression in cell-cycle time series to show it generates biologically relevant results. The R code is available at http://userwww.service.emory.edu/-tyu8/GDHC.

Hierarchical Clustering of High-Throughput Expression Data Based on General Dependences

Journal

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS

Publisher

IEEE COMPUTER SOC

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Hierarchical Clustering of High-Throughput Expression Data Based on General Dependences

Journal

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS

Publisher

IEEE COMPUTER SOC

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper