Journal
BMC BIOINFORMATICS
Volume 12, Issue -, Pages -
Publisher
BIOMED CENTRAL LTD
DOI: 10.1186/1471-2105-12-S1-S1
Keywords
-
Category
Funding
- National Science Foundation [NSF/DBI-0542119, NSF/DBI-0542119004, NSF/DEB-0830024, NSF/DBI-0821263, DOE/4000063512]
- National Institutes of Health [1R01GM075331, 1R01GM081682]
- Georgia Cancer Coalition
- National Science Foundation of China [60673059, 60373025, 10926027]
- Taishan Scholar Fund from Shandong Province
- State Scholarship Fund of China [20073020]
- China Postdoctoral Science Foundation [20090450396]
- Scientist Research Fund of Shandong Province [BS2009SW044]
- University of Jinan [XBS0914]
- Office of Biological and Environmental Research in the DOE Office of Science
- University of Georgia Research Computing Center
Background: Biological pathways are typically reconstructed by mapping well-characterized pathways of model organisms onto a target genome through orthologous gene mapping. A limitation of such pathway-mapping approaches is that the mapped pathway models are constrained by the composition of the template pathways; e.g., some genes in a target pathway may have no corresponding genes in the template pathways, the so-called missing gene problem.
Methods: We present a novel pathway-expansion method for identifying additional genes that are possibly involved in a target pathway after pathway mapping, both to fill holes caused by missing genes and to expand the mapped pathway model. The basic idea of the algorithm is to identify genes in the target genome whose homologs share operons with homologs of any mapped pathway gene in some reference genome, and to add such genes to the target pathway if their functions are consistent with the cellular function of the target pathway.
Results: We have implemented this idea using a graph-theoretic approach and demonstrated the effectiveness of the algorithm on known pathways of E. coli in the KEGG database. On all KEGG pathways containing at least 5 genes, our method achieves an average positive predictive value (PPV) of 60%, and performance improves as more seed genes are added. Analysis shows that our method is highly robust.
Conclusions: An effective method is presented for finding missing genes in biological pathways of prokaryotes; it achieves high prediction reliability on E. coli at the genome level. Numerous missing genes are found to be related to known E. coli pathways, and these predictions can be further validated through biological experiments. Overall, this method is robust and can be used for functional inference.
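The core idea in the Methods paragraph (shared operon membership between homologs of seed genes and homologs of candidate genes) can be illustrated with a minimal Python sketch. This is not the authors' graph-theoretic implementation: the function names `expand_pathway` and `ppv`, the dictionary-based input layout, and the gene identifiers are all assumptions made for illustration, and the functional-consistency filter described in the abstract is deliberately left out.

```python
from collections import defaultdict


def expand_pathway(seed_genes, homologs, operons):
    """Propose candidate pathway genes by shared-operon evidence.

    Hypothetical input layout (an assumption, not taken from the paper):
      seed_genes: set of target-genome gene IDs already mapped to the pathway.
      homologs:   dict, target gene ID -> set of (genome, gene) reference homologs.
      operons:    dict, reference genome -> list of sets of gene IDs (one set per operon).

    Returns a dict: candidate gene -> set of seed genes that support it.
    """
    # Index every reference gene by the operon it belongs to.
    operon_of = {}
    for genome, operon_list in operons.items():
        for idx, members in enumerate(operon_list):
            for gene in members:
                operon_of[(genome, gene)] = (genome, idx)

    # Record which reference operons contain a homolog of some seed gene.
    seed_operons = defaultdict(set)
    for seed in seed_genes:
        for ref in homologs.get(seed, ()):
            key = operon_of.get(ref)
            if key is not None:
                seed_operons[key].add(seed)

    # A non-seed target gene is a candidate if any of its homologs
    # falls in one of those operons.
    candidates = defaultdict(set)
    for gene, refs in homologs.items():
        if gene in seed_genes:
            continue
        for ref in refs:
            key = operon_of.get(ref)
            if key in seed_operons:
                candidates[gene] |= seed_operons[key]
    return dict(candidates)


def ppv(predicted, known):
    """Positive predictive value: fraction of predicted genes that are known members."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(known)) / len(predicted)
```

In an evaluation like the one reported in the Results, one would hide part of a known pathway, run the expansion from the remaining seed genes, and score the candidates with `ppv` against the hidden members. A real pipeline would also need to filter candidates by functional consistency before adding them to the pathway.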