Article

Knowledge-based data analysis comes of age

Journal

BRIEFINGS IN BIOINFORMATICS
Volume 11, Issue 1, Pages 30-39

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bib/bbp044

Keywords

Bayesian analysis; computational molecular biology; signal pathways; metabolic pathways; databases

Funding

  1. NCI NIH HHS [P30 CA006973] Funding Source: Medline
  2. NATIONAL CANCER INSTITUTE [P30CA006973] Funding Source: NIH RePORTER

The emergence of high-throughput technologies for measuring biological systems has introduced problems for data interpretation that must be addressed for proper inference. First, analysis techniques need to be matched to the biological system, reflecting in their mathematical structure the underlying behavior being studied. When this is not done, mathematical techniques will generate answers, but the values and reliability estimates may not accurately reflect the biology. Second, analysis approaches must address the vast excess in variables measured (e.g. transcript levels of genes) over the number of samples (e.g. tumors, time points), known as the 'large-p, small-n' problem. In large-p, small-n paradigms, standard statistical techniques generally fail, and computational learning algorithms are prone to overfit the data. Here we review the emergence of techniques that match mathematical structure to the biology, the use of integrated data and prior knowledge to guide statistical analysis, and the recent emergence of analysis approaches utilizing simple biological models. We show that novel biological insights have been gained using these techniques.
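As a rough illustration of the 'large-p, small-n' problem described above, the following sketch simulates pure-noise features and measures how strongly the best one correlates with an unrelated outcome. It is not taken from the paper: the function names (`pearson`, `max_spurious_correlation`), the sample and feature counts, and the Gaussian noise model are all assumptions chosen for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_spurious_correlation(n_samples, n_features, seed=0):
    """Largest |correlation| between a random outcome and pure-noise features.

    Every feature is independent Gaussian noise, so any correlation found
    here is spurious by construction (hypothetical simulated data).
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, outcome)))
    return best


# Same 10 samples in both cases; only the number of measured variables changes.
small_p = max_spurious_correlation(n_samples=10, n_features=5)
large_p = max_spurious_correlation(n_samples=10, n_features=5000)
print(f"best |r| among 5 noise features:    {small_p:.2f}")
print(f"best |r| among 5000 noise features: {large_p:.2f}")
```

With thousands of variables and only ten samples, at least one noise feature will typically appear strongly correlated with the outcome. This is one reason unconstrained feature selection and learning algorithms tend to overfit in this regime, and why the abstract argues for guiding the analysis with prior knowledge.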
