Article

A Topology-Preserving Selection and Clustering Approach to Multidimensional Biological Data

OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY (2011)

Journal

OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY

Volume 15, Issue 7-8, Pages 483-494

Publisher

MARY ANN LIEBERT, INC

DOI: 10.1089/omi.2010.0066

Categories

Biotechnology & Applied Microbiology; Genetics & Heredity

Funding

Chinese Academy of Sciences [KSCX1-YW-22-01]
Ministry of Science and Technology of China [2009CB825607, 2011CB910202]
National Natural Science Foundation [30730033, 90919059]
Shanghai Postdoctoral Scientific Program [09R21414900]
China Postdoctoral Science Foundation [20090450573]
European Community (TB-VIR network) [200973]
Biotechnology and Biological Sciences Research Council (BBSRC) [BB/G022771/1] (funding sources: UKRI, researchfish)

Abstract

Multidimensional genome-wide data (e.g., gene expression microarray data) provide rich information and widespread applications in integrative biology. However, little attention has been paid to the inherent relationships within these natural data. By simply viewing multidimensional microarray data scattered over hyperspace, the spatial properties (topological structure) of the data clouds may reveal the underlying relationships. Based on this idea, we herein make analytical improvements by introducing a topology-preserving selection and clustering (TPSC) approach to complex large-scale microarray data. Specifically, the integration of self-organizing map (SOM) and singular value decomposition allows genome-wide selection on sound foundations of statistical inference. Moreover, this approach is complemented with an SOM-based two-phase gene clustering procedure, allowing the topology-preserving identification of gene clusters. These gene clusters with highly similar expression patterns can facilitate many aspects of biological interpretations in terms of functional and regulatory relevance. As demonstrated by processing large and complex datasets of the human cell cycle, stress responses, and host cell responses to pathogen infection, our proposed method can yield better characteristic features from the whole datasets compared to conventional routines. We hence conclude that the topology-preserving selection and clustering without a priori assumption on data structure allow the in-depth mining of biological information in a more accurate and unbiased manner. A Web server (http://www.cs.bris.ac.uk/~hfang/TPSC) hosting a MATLAB package that implements the methodology is freely available to both academic and nonacademic users. These advances will expand the scope of omics applications.
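
The abstract does not spell out how the SOM and the singular value decomposition are combined, so the following is only a rough illustration of the general idea, not the authors' MATLAB package. It assumes that a small SOM first compresses the gene-by-sample matrix into a codebook, that an SVD of that codebook supplies the dominant expression patterns, and that each gene is then scored by its projection onto those patterns. The function names (`train_som`, `svd_gene_scores`), the map size, the learning schedule, and the synthetic data are all hypothetical, and the statistical inference step mentioned in the abstract is omitted.

```python
import numpy as np


def train_som(data, rows=3, cols=3, epochs=30, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns a (rows*cols, n_features) codebook."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of the map nodes, used by the neighbourhood kernel.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise the codebook from randomly chosen data rows (fancy indexing copies).
    codebook = data[rng.choice(len(data), size=rows * cols, replace=True)].astype(float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(data)):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = sigma0 + (0.5 - sigma0) * frac   # neighbourhood width shrinks to 0.5
            # Best-matching unit: the node closest to the current gene profile.
            bmu = np.argmin(((codebook - data[i]) ** 2).sum(axis=1))
            # Gaussian neighbourhood on the map grid keeps nearby nodes similar.
            grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            codebook += lr * h[:, None] * (data[i] - codebook)
            step += 1
    return codebook


def svd_gene_scores(data, codebook, n_components=2):
    """Score genes by the norm of their projection onto the leading
    right singular vectors of the SOM codebook (an assumed scoring rule)."""
    _, _, vt = np.linalg.svd(codebook, full_matrices=False)
    projection = data @ vt[:n_components].T
    return np.sqrt((projection ** 2).sum(axis=1))


# Synthetic example: 200 genes x 6 samples, the first 20 genes share a pattern.
rng = np.random.default_rng(1)
pattern = np.sin(np.linspace(0.0, 2.0 * np.pi, 6, endpoint=False))
X = 0.3 * rng.standard_normal((200, 6))
X[:20] += 3.0 * pattern
X = X - X.mean(axis=0)  # centre each sample column before training

codebook = train_som(X)
scores = svd_gene_scores(X, codebook)
selected = np.argsort(scores)[::-1][:20]  # indices of the 20 highest-scoring genes
```

In this sketch the SOM acts as a topology-preserving summary of the data cloud, so the SVD operates on a handful of prototype profiles rather than on every gene. A real analysis would replace the fixed top-20 cut-off with a proper significance test.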
