4.3 Article

A Topology-Preserving Selection and Clustering Approach to Multidimensional Biological Data

Journal

OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY
Volume 15, Issue 7-8, Pages 483-494

Publisher

MARY ANN LIEBERT, INC
DOI: 10.1089/omi.2010.0066

Funding

  1. Chinese Academy of Sciences [KSCX1-YW-22-01]
  2. Ministry of Science and Technology of China [2009CB825607, 2011CB910202]
  3. National Natural Science Foundation [30730033, 90919059]
  4. Shanghai Postdoctoral Scientific Program [09R21414900]
  5. China Postdoctoral Science Foundation [20090450573]
  6. European Community (TB-VIR network) [200973]
  7. Biotechnology and Biological Sciences Research Council (BBSRC) [BB/G022771/1]

Multidimensional genome-wide data (e.g., gene expression microarray data) provide rich information and widespread applications in integrative biology. However, little attention has been paid to the inherent relationships within these natural data. By simply viewing multidimensional microarray data scattered over hyperspace, the spatial properties (topological structure) of the data clouds may reveal the underlying relationships. Based on this idea, we herein make analytical improvements by introducing a topology-preserving selection and clustering (TPSC) approach to complex large-scale microarray data. Specifically, the integration of self-organizing map (SOM) and singular value decomposition allows genome-wide selection on sound foundations of statistical inference. Moreover, this approach is complemented with an SOM-based two-phase gene clustering procedure, allowing the topology-preserving identification of gene clusters. These gene clusters with highly similar expression patterns can facilitate many aspects of biological interpretations in terms of functional and regulatory relevance. As demonstrated by processing large and complex datasets of the human cell cycle, stress responses, and host cell responses to pathogen infection, our proposed method can yield better characteristic features from the whole datasets compared to conventional routines. We hence conclude that the topology-preserving selection and clustering without a priori assumption on data structure allow the in-depth mining of biological information in a more accurate and unbiased manner. A Web server (http://www.cs.bris.ac.uk/~hfang/TPSC) hosting a MATLAB package that implements the methodology is freely available to both academic and nonacademic users. These advances will expand the scope of omics applications.
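The combination of an SOM with singular value decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' TPSC implementation (which is distributed as a MATLAB package) and does not reproduce their statistical inference step. It only assumes a generic rectangular SOM trained on a genes-by-samples matrix, followed by an SVD of the resulting codebook. The function names, grid size, learning schedule, synthetic data, and the median-based cut-off are all hypothetical choices made for illustration.

```python
import numpy as np

def train_som(data, grid=(4, 4), n_iter=500, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns a codebook of shape (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid coordinates of every node, used by the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise codebook vectors from randomly chosen data rows.
    codebook = data[rng.choice(len(data), rows * cols, replace=True)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                      # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # neighbourhood radius shrinks to 0.5
        x = data[rng.integers(len(data))]            # one randomly drawn gene profile
        bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))  # best-matching unit
        d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)      # squared grid distance to BMU
        h = np.exp(-d2 / (2.0 * sigma ** 2))                # Gaussian neighbourhood weights
        codebook += lr * h[:, None] * (x - codebook)
    return codebook

def best_matching_units(data, codebook):
    """Index of the nearest codebook vector for every row of data."""
    d = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def svd_scores(codebook, n_components=2):
    """Norm of each node's projection onto the leading singular directions."""
    centred = codebook - codebook.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    proj = u[:, :n_components] * s[:n_components]
    return np.linalg.norm(proj, axis=1), s

# Synthetic stand-in for an expression matrix: 200 genes x 6 samples (assumed data).
rng = np.random.default_rng(1)
expr = rng.normal(size=(200, 6))
expr[:40] += np.linspace(-2, 2, 6)   # 40 hypothetical genes sharing a trend

codebook = train_som(expr)
bmus = best_matching_units(expr, codebook)
node_scores, singular_values = svd_scores(codebook)
# Illustrative cut-off only: keep genes mapped to nodes scoring above the median.
selected = np.flatnonzero(node_scores[bmus] > np.median(node_scores))
```

Because neighbouring nodes on the grid are updated together, genes with similar profiles land on nearby nodes, which is the topology-preserving property the abstract relies on. A real analysis would replace the median cut-off with a proper significance test.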
