Question

A General Evaluation Of Chipseq Data

7

13.2 years ago

Hamilton ▴ 290

Is there any measure to validate whether a ChIP-seq experiment is good or not? That is, are the data good enough to analyze, or should the experiment be redone because the first one was not good? How is such quality usually measured? Is there some quality measure that accounts for non-redundant reads in peak regions at a given sequencing depth (say, 30M reads in the experiment), namely, how many non-redundant reads map to peak regions? Otherwise, are there any other criteria to consider?

Basically, how do we evaluate the quality of a ChIP-seq experiment? Are the quality scores Galaxy reports from the FASTQ files good enough? Are there any other statistical measures?

Any comments are welcome.

chip-seq quality • 5.6k views

updated 10.1 years ago by Biostar 20 • written 13.2 years ago by Hamilton ▴ 290

score 10 · Answer 1 · 2012-03-07

The ENCODE and Roadmap projects have adopted a number of QC criteria for ChIP-seq data:

Proportions of aligned and duplicate reads
Peak enrichment
Peak 'Lag' between strands
Reproducibility between replicates

See the documents below for more detail.

http://genome.ucsc.edu/ENCODE/protocols/dataStandards/ChIP_DNase_FAIRE_DNAme_v2_2011.pdf

http://www.roadmapepigenomics.org/files/protocols/data/histone-modification/REMC_ChIP-seqStandardsFINAL.pdf
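As a rough illustration of two of the criteria listed above (duplicate reads and the lag between strands), here is a minimal Python sketch. It is not the ENCODE or Roadmap implementation. It assumes the aligned reads have already been reduced to (chrom, 5' position, strand) tuples; the function names `non_redundant_fraction` and `strand_lag` are made up for this example, and the lag estimate is a plain unnormalized overlap count rather than a full strand cross-correlation.

```python
from collections import Counter


def non_redundant_fraction(reads):
    """Fraction of aligned reads with a distinct (chrom, pos, strand).

    `reads` is an iterable of hashable tuples, one per aligned read.
    Values near 1.0 mean few duplicates; low values suggest a
    low-complexity library.
    """
    reads = list(reads)
    if not reads:
        return 0.0
    return len(set(reads)) / len(reads)


def strand_lag(plus_starts, minus_starts, max_shift=300):
    """Estimate the shift (in bp) between + and - strand read pile-ups.

    Both inputs are 5' positions on a single chromosome. For each shift s
    the score is the number of (+ read, - read) pairs whose positions
    differ by exactly s; the shift with the highest score is returned
    (the smallest such shift on ties).
    """
    plus = Counter(plus_starts)
    minus = Counter(minus_starts)
    best_shift, best_score = 0, -1
    for shift in range(max_shift + 1):
        score = sum(n * minus.get(pos + shift, 0) for pos, n in plus.items())
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


if __name__ == "__main__":
    toy_reads = [("chr1", 100, "+"), ("chr1", 100, "+"),
                 ("chr1", 200, "-"), ("chr1", 300, "+")]
    print(non_redundant_fraction(toy_reads))
    print(strand_lag([100, 200, 300], [250, 350, 450]))
```

In a real ChIP the best shift would be expected to sit near the fragment length; a best shift close to the read length instead is usually a warning sign. The standards documents above define the actual thresholds.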

score 4 · Answer 2 · 2012-03-06

Usually one would check whether there is significant enrichment of reads in the ChIP sample compared to the control (e.g. input DNA or a nonspecific antibody) in specific regions. Perhaps some regions of enrichment are known in advance; if not, some of the peaks you see in the data could be checked with ChIP-qPCR.
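A depth-normalized version of that comparison can be sketched in a few lines of Python. This is only an illustration under simple assumptions: reads are given as positions on a single chromosome, regions are non-overlapping half-open (start, end) intervals, and the helper names `count_in_regions` and `fold_enrichment` as well as the pseudocount are choices made for this example, not part of any particular peak caller.

```python
from bisect import bisect_right


def count_in_regions(positions, regions):
    """Count read positions that fall inside any of the given regions.

    `regions` must be non-overlapping half-open intervals (start, end).
    """
    regions = sorted(regions)
    starts = [start for start, _ in regions]
    hits = 0
    for pos in positions:
        i = bisect_right(starts, pos) - 1  # last region starting at or before pos
        if i >= 0 and pos < regions[i][1]:
            hits += 1
    return hits


def fold_enrichment(chip_positions, control_positions, regions, pseudocount=1.0):
    """Ratio of the in-region read fraction in ChIP versus control.

    Dividing by the total read count of each sample normalizes for
    sequencing depth; the pseudocount avoids division by zero when the
    control has no reads in the regions.
    """
    chip_in = count_in_regions(chip_positions, regions)
    ctrl_in = count_in_regions(control_positions, regions)
    chip_rate = (chip_in + pseudocount) / (len(chip_positions) + pseudocount)
    ctrl_rate = (ctrl_in + pseudocount) / (len(control_positions) + pseudocount)
    return chip_rate / ctrl_rate


if __name__ == "__main__":
    chip = [105, 110, 120, 500, 900]
    control = [105, 400, 600, 800, 950]
    print(fold_enrichment(chip, control, [(100, 130)]))
```

Calling `count_in_regions` on the ChIP reads with the called peaks as regions, and dividing by the total number of reads, gives the fraction of reads in peaks, which is close to what the question asks about.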

Computationally, if you are doing ChIP on a TF and know its motif, you can check whether the motif is significantly enriched in the peaks you get.
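One simple way to put a number on "significantly enriched" is a one-sided hypergeometric test comparing peak sequences with a set of background sequences. The sketch below assumes you have already counted how many sequences in each set contain the motif; the function name `motif_enrichment_pvalue` and the choice of background are assumptions for illustration, and dedicated motif tools use more refined models.

```python
from math import comb


def motif_enrichment_pvalue(peaks_with_motif, n_peaks, total_with_motif, n_total):
    """One-sided hypergeometric p-value for motif enrichment in peaks.

    n_total          -- peak sequences plus background sequences
    total_with_motif -- sequences (peak or background) containing the motif
    n_peaks          -- number of peak sequences
    peaks_with_motif -- peak sequences containing the motif

    Returns P(X >= peaks_with_motif) when n_peaks sequences are drawn at
    random, without replacement, from the n_total sequences.
    """
    upper = min(n_peaks, total_with_motif)
    favourable = sum(
        comb(total_with_motif, i) * comb(n_total - total_with_motif, n_peaks - i)
        for i in range(peaks_with_motif, upper + 1)
    )
    return favourable / comb(n_total, n_peaks)


if __name__ == "__main__":
    # 10 peaks and 10 background sequences; the motif occurs in all
    # 10 peaks and in none of the background sequences.
    print(motif_enrichment_pvalue(10, 10, 10, 20))
```

A small p-value says the motif occurs in the peaks more often than expected from the background; it says nothing about whether the background set was a fair comparison, so that choice matters.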

On a more basic level, I would find it worrying if I did peak calling for both ChIP over control and (reversed) control over ChIP, and got more peaks for control over ChIP.

score 2 · Answer 3 · 2012-03-14

13.1 years ago

Ian 6.1k

Just a couple of tools I have found useful to answer this question:

HOMER (Perl based)

htSeqTools (R based; I particularly like the PCA representation of coverage similarity between the samples)
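To give an idea of what "coverage similarity between the samples" means, here is a small Python sketch that bins read positions and computes a Pearson correlation between two samples. This is a simpler stand-in, not the PCA that htSeqTools produces, and it does not use the htSeqTools or HOMER APIs; the function names, bin size and single-chromosome input are assumptions for this example.

```python
from math import sqrt


def binned_counts(positions, genome_length, bin_size):
    """Count reads per fixed-width bin along one chromosome."""
    n_bins = (genome_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in positions:
        if 0 <= pos < genome_length:
            counts[pos // bin_size] += 1
    return counts


def pearson(x, y):
    """Pearson correlation of two equal-length count vectors.

    Returns NaN if either vector has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    rep1 = binned_counts([5, 15, 17, 95], genome_length=100, bin_size=10)
    rep2 = binned_counts([3, 12, 18, 19, 91], genome_length=100, bin_size=10)
    print(pearson(rep1, rep2))
```

Replicates of the same ChIP would be expected to correlate more strongly with each other than with the input control; a replicate that tracks the input instead is worth a closer look.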

It'd be more helpful if you could briefly say why one would want to use the tool/what to look for in the output.

EDIT: I now realise this post is three years old...

10.1 years ago by Saulius Lukauskas ▴ 540