Journal
BIOINFORMATICS
Volume 34, Issue 14, Pages 2356-2363
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/bty137
Funding
- Japan Agency for Medical Research and Development [15K18465, 17H06331, 15H02369, 15H05970]
- Platform for Drug Discovery, Informatics, and Structural Life Science
- Grants-in-Aid for Scientific Research [15H02369, 15K18465, 15H05970, 17H06331] Funding Source: KAKEN
Motivation: Chromatin immunoprecipitation followed by sequencing (ChIP-seq) can detect read-enriched DNA loci for point-source (e.g. transcription factor binding) and broad-source factors (e.g. various histone modifications). Although numerous quality metrics for ChIP-seq data have been developed, the 'peaks' thus obtained are still difficult to assess with respect to signal-to-noise ratio (S/N) and the percentage of false positives.

Results: We developed a quality-assessment tool for ChIP-seq data, strand-shift profile (SSP), which quantifies S/N and peak reliability without peak calling. We validated SSP in-depth using ≥1000 publicly available ChIP-seq datasets along with virtual data to demonstrate that SSP provides a quantifiable and sensitive score to different S/Ns for both point- and broad-source factors, which can be standardized across diverse cell types and read depths. SSP also provides an effective criterion to judge whether a specific normalization or a rejection is required for each sample, which cannot be estimated by quality metrics currently available. Finally, we show that 'hidden-duplicate reads' cause aberrantly high S/Ns, and SSP provides an additional metric to avoid them, which can also contribute to estimation of peak mode (point- or broad-source) of samples.
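
The abstract does not say how the strand-shift profile is computed, so the following is only a generic illustration of the idea and not the SSP implementation. It assumes reads are reduced to 5' end positions per strand, and that the profile counts, for each shift d, how many forward-strand positions have a reverse-strand position exactly d bases downstream. The function names, the toy coordinates, and the peak-to-background ratio are all hypothetical choices made for this sketch.

```python
def strand_shift_profile(forward, reverse, max_shift):
    """For each shift d in 0..max_shift, count forward positions p
    such that p + d is also a reverse-strand position."""
    fwd = set(forward)
    rev = set(reverse)
    return [sum(1 for p in fwd if p + d in rev) for d in range(max_shift + 1)]


def peak_to_background(profile, background_start):
    """Ratio of the profile maximum to the mean of the profile tail,
    where the tail (from background_start on) is taken as background."""
    tail = profile[background_start:]
    background = sum(tail) / len(tail)
    peak = max(profile)
    return peak / background if background > 0 else float("inf")


# Toy data (assumed, not from the paper): four forward reads, each paired
# with a reverse read 50 bp downstream, plus one unrelated reverse read.
forward_positions = [100, 230, 410, 777]
reverse_positions = [150, 280, 460, 827, 160]

profile = strand_shift_profile(forward_positions, reverse_positions, 60)
best_shift = profile.index(max(profile))  # shift with the most strand overlap
ratio = peak_to_background(profile, 55)   # crude signal-to-noise proxy
```

In this toy setting the profile peaks at the shift that matches the simulated fragment length, and the ratio of that peak to the profile tail acts as a rough stand-in for a signal-to-noise score obtained without peak calling.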