Journal
BIOINFORMATICS
Volume 35, Issue 5, Pages 778-786
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/bty696
Funding
- National Institute of Arthritis and Musculoskeletal and Skin Diseases [RO1AR05974]
- National Center for Advancing Translational Sciences (BU-CTSI) [1UL1TR001430]
Motivation: Clustering algorithms such as K-Means and standard Gaussian mixture models (GMM) fail to account for the structure of variability in replicated data or repeated measures over time. In addition, having to assume the number of clusters a priori adds complexity to the process. Current methods for optimizing cluster labels and cluster number can be inaccurate or computationally intensive for temporal gene expression data with this additional variability.
Results: An extension to a model-based clustering algorithm is proposed, using mixtures of mixed effects polynomial regression models and the EM algorithm with an entropy penalized log-likelihood function (EPEM). The EPEM is used to cluster temporal gene expression data with this additional variability. Adding random effects to the model decreased the misclassification error compared with mixtures of fixed effects models and with other methods such as K-Means and GMM. Applying the method to microarray data from a fracture healing study revealed distinct temporal patterns of gene expression.
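The idea of an entropy penalized EM can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the update equations, so the sketch assumes one common form of the penalty, in which the term beta * n * sum_k pi_k log(pi_k) is added to the log-likelihood and the mixing proportions are updated by a fixed-point step. It also uses fixed effects polynomial regressions only and omits the random effects that the paper adds. The function names, the parameter `beta`, the pruning threshold 1/n, and all default values are hypothetical choices made for illustration.

```python
import numpy as np


def penalized_pi_update(resp, pi_old, beta):
    """One fixed-point update of the mixing proportions (assumed penalty form).

    Maximizes sum_ik resp_ik*log(pi_k) + beta*n*sum_k pi_k*log(pi_k)
    subject to sum_k pi_k = 1, with pi_old plugged into the penalty term.
    """
    pi_em = resp.mean(axis=0)                      # ordinary EM update
    neg_entropy = np.sum(pi_old * np.log(pi_old))  # sum_s pi_s log(pi_s)
    return pi_em + beta * pi_old * (np.log(pi_old) - neg_entropy)


def fit_epem_sketch(Y, t, degree=2, k_init=6, beta=0.1, n_iter=100, seed=0):
    """Cluster rows of Y (genes x time points) with a penalized mixture of
    fixed-effects polynomial regressions. Requires k_init < number of rows
    and n_iter >= 1. Returns (pi, coefs, labels)."""
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    X = np.vander(t, degree + 1, increasing=True)  # columns 1, t, ..., t^degree

    # Initialize each component from the curve of a randomly chosen gene.
    start = rng.choice(n, size=k_init, replace=False)
    coefs = np.linalg.lstsq(X, Y[start].T, rcond=None)[0].T  # (K, degree + 1)
    sigma2 = np.full(k_init, max(Y.var(), 1e-8))
    pi = np.full(k_init, 1.0 / k_init)

    for _ in range(n_iter):
        # E-step: responsibilities from isotropic Gaussian residuals.
        resid = Y[:, None, :] - (coefs @ X.T)[None, :, :]   # (n, K, m)
        sq = (resid ** 2).sum(axis=2)                       # (n, K)
        log_w = np.log(pi) - 0.5 * m * np.log(2 * np.pi * sigma2) - 0.5 * sq / sigma2
        log_w -= log_w.max(axis=1, keepdims=True)           # numerical stability
        resp = np.exp(log_w)
        resp /= resp.sum(axis=1, keepdims=True)

        # Penalized update of pi, then prune components that became negligible.
        pi = penalized_pi_update(resp, pi, beta)
        keep = pi > 1.0 / n
        pi = pi[keep] / pi[keep].sum()
        resp = resp[:, keep] + 1e-12                        # avoid all-zero rows
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: weighted least squares. All genes share the design X, so the
        # solution is an ordinary fit to the responsibility-weighted mean curve.
        nk = resp.sum(axis=0)
        mean_curves = (resp.T @ Y) / nk[:, None]            # (K, m)
        coefs = np.linalg.lstsq(X, mean_curves.T, rcond=None)[0].T
        resid = Y[:, None, :] - (coefs @ X.T)[None, :, :]
        sq = (resid ** 2).sum(axis=2)
        sigma2 = np.maximum((resp * sq).sum(axis=0) / (m * nk), 1e-8)

    labels = resp.argmax(axis=1)
    return pi, coefs, labels
```

Starting from more components than expected (`k_init`) and letting the penalty remove the negligible ones is what avoids fixing the cluster number in advance. In this sketch a constant `beta` shrinks smaller clusters, so the value would need tuning or a decreasing schedule in practice.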