Article

Finding disease similarity based on implicit semantic similarity

Journal

JOURNAL OF BIOMEDICAL INFORMATICS
Volume 45, Issue 2, Pages 363-371

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2011.11.017

Keywords

Ontology terms; Similarity measure; Disease similarity; Semantic similarity; Gene Ontology; Ontology perturbation; Ontology based disease similarity

Genomics has contributed to a growing collection of gene-function and gene-disease annotations that can be exploited by informatics to study similarity between diseases. This can yield insight into disease etiology, reveal common pathophysiology and/or suggest treatment that can be appropriated from one disease to another. Estimating disease similarity solely on the basis of shared genes can be misleading, as variable combinations of genes may be associated with similar diseases, especially for complex diseases. This deficiency can potentially be overcome by looking for common biological processes rather than only explicit gene matches between diseases. The use of semantic similarity between biological processes to estimate disease similarity could enhance the identification and characterization of disease similarity. We present functions to measure similarity between terms in an ontology, and between entities annotated with terms drawn from the ontology, based on both co-occurrence and information content. The similarity measure is shown to outperform other measures used to detect similarity. A manually curated dataset with known disease similarities was used as a benchmark to compare the estimation of disease similarity based on gene-based and Gene Ontology (GO) Process-based comparisons. The detection of disease similarity based on semantic similarity between GO Processes (Recall = 55%, Precision = 60%) performed better than using exact matches between GO Processes (Recall = 29%, Precision = 58%) or gene overlap (Recall = 88%, Precision = 16%). The GO Process-based disease similarity scores on an external test set show a statistically significant Pearson correlation (0.73) with numeric scores provided by medical residents. GO Processes associated with similar diseases were found to be significantly regulated in gene expression microarray datasets of related diseases. © 2011 Elsevier Inc. All rights reserved.
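The abstract does not give the formulas for its similarity functions, so the following is only a minimal sketch of the general idea, not the authors' measure. It assumes a Resnik-style information-content score for term similarity and a best-match average for entity similarity, and it covers only the information-content component, not co-occurrence. The ontology, term names, and disease annotations are hypothetical toy data rather than real GO terms.

```python
import math

# Hypothetical toy ontology: term -> set of direct parents (not real GO terms).
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "signaling": {"root"},
    "lipid_metabolism": {"metabolism"},
    "glucose_metabolism": {"metabolism"},
    "insulin_signaling": {"signaling"},
}

# Hypothetical disease -> annotated process terms.
ANNOTATIONS = {
    "disease_a": {"lipid_metabolism"},
    "disease_b": {"glucose_metabolism"},
    "disease_c": {"insulin_signaling"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log(fraction of entities annotated with t or a descendant)."""
    counts = {t: 0 for t in PARENTS}
    for terms in ANNOTATIONS.values():
        covered = set().union(*(ancestors(t) for t in terms))
        for t in covered:
            counts[t] += 1
    n = len(ANNOTATIONS)
    return {t: -math.log(c / n) for t, c in counts.items() if c > 0}


IC = information_content()


def term_similarity(t1, t2):
    """Resnik-style score: IC of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max((IC.get(t, 0.0) for t in common), default=0.0)


def entity_similarity(terms_a, terms_b):
    """Best-match average over both directions (an assumed aggregation)."""
    if not terms_a or not terms_b:
        return 0.0
    best_a = [max(term_similarity(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(term_similarity(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


sim_ab = entity_similarity(ANNOTATIONS["disease_a"], ANNOTATIONS["disease_b"])
sim_ac = entity_similarity(ANNOTATIONS["disease_a"], ANNOTATIONS["disease_c"])
```

In this toy data, `disease_a` and `disease_b` share no annotation term, so an exact-match comparison would find no similarity between them. The semantic score is still higher for that pair than for `disease_a` and `disease_c`, because their terms meet at the more specific ancestor `metabolism`. This is the kind of difference between exact matching and semantic matching that the abstract's recall figures describe.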
