11 months ago
SEJAL
This query is regarding ChIP-seq analysis:
I have 3 technical replicates for the input control and 3 biological replicates for the sample. The paper says to call peaks independently and keep the peaks that are common to at least 2 replicates. How should I call peaks independently, given that I have 3 technical replicates for the input?
I would merge the inputs to increase depth and then call every sample against the combined input. Input is just fragmented DNA; its variability is negligible imo.
To be fair, a good input would need to be sequenced quite deeply to get coverage across the entire genome, but nobody does that, and even if they did, peak callers such as macs2 would scale it down again to match the ChIP-seq depth, which is typically a few tens of millions of reads. So don't bother too much with the inputs, just combine them and use them as one.
Combine using samtools merge? I have also seen people run macs2 callpeak with all control/input files specified against one treatment file. Which one would be better?
Since your input replicates are technical, you can either combine them (by merging at the BAM level) or pass all input files to macs2 (-c input1 input2 input3, separated by spaces) and call peaks for each biological replicate one at a time (-t sample1). Then intersect the peaks from all 3 sample replicates to see how well they overlap. After that it is up to you how to proceed with further analysis, i.e. by taking either the intersection or the union.
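A minimal sketch of the merge-then-call route described above, written as Python that only builds and prints the command lines (it does not run them). The file names, the `rep1`..`rep3` output prefixes, and the `-g hs` genome size are assumptions for illustration; swap in your own.

```python
import shlex

def build_commands(input_bams, sample_bams,
                   merged_input="input_merged.bam", genome="hs"):
    """Return one `samtools merge` command followed by one
    `macs2 callpeak` command per biological replicate."""
    # Pool the technical input replicates into a single BAM.
    commands = [["samtools", "merge", merged_input, *input_bams]]
    # Call peaks for each sample replicate against the pooled input.
    for i, bam in enumerate(sample_bams, start=1):
        commands.append([
            "macs2", "callpeak",
            "-t", bam,            # one treatment replicate per call
            "-c", merged_input,   # pooled input control
            "-f", "BAM",
            "-g", genome,         # assumed: human; use mm, ce, dm or a number otherwise
            "-n", f"rep{i}",      # hypothetical output prefix
        ])
    return commands

cmds = build_commands(
    ["input1.bam", "input2.bam", "input3.bam"],     # hypothetical file names
    ["sample1.bam", "sample2.bam", "sample3.bam"],  # hypothetical file names
)
for cmd in cmds:
    print(shlex.join(cmd))
```

Skipping the merge and passing the three inputs directly after `-c` should give much the same result, since macs2 pools multiple control files anyway.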
For the intersection of peaks (bedtools intersect), what should the ideal fraction of overlap (-f value) be, and is reciprocal overlap (-r) required or not?
Makes little difference. Just try. In bioinformatics the answer is often maybe, not yes/no.
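To see what -f and -r actually change, here is a small pure-Python sketch that mimics the overlap test (fraction of peak A covered, optionally also of peak B) and keeps peaks supported by at least 2 replicates. It is a toy stand-in for bedtools intersect, not a replacement; the peak coordinates and function names are made up for illustration.

```python
def overlap_bp(a, b):
    """Number of overlapping bases between two (chrom, start, end) peaks."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))

def passes(a, b, min_frac=0.5, reciprocal=False):
    """True if the overlap covers at least min_frac of A (like -f),
    and of B as well when reciprocal is set (like -r)."""
    ov = overlap_bp(a, b)
    if ov == 0 or ov < min_frac * (a[2] - a[1]):
        return False
    if reciprocal and ov < min_frac * (b[2] - b[1]):
        return False
    return True

def supported_peaks(replicates, min_reps=2, min_frac=0.5, reciprocal=False):
    """Peaks that are matched in at least min_reps replicates,
    counting the replicate the peak came from."""
    kept = []
    for i, peaks in enumerate(replicates):
        for p in peaks:
            support = 1 + sum(
                any(passes(p, q, min_frac, reciprocal) for q in other)
                for j, other in enumerate(replicates) if j != i
            )
            if support >= min_reps:
                kept.append(p)
    return kept

# Made-up peaks: a narrow peak in rep1 sits inside the edge of a broad one in rep2.
rep1 = [("chr1", 100, 200), ("chr1", 500, 600)]
rep2 = [("chr1", 150, 400)]
rep3 = [("chr2", 100, 200)]

loose = supported_peaks([rep1, rep2, rep3], min_frac=0.5)
strict = supported_peaks([rep1, rep2, rep3], min_frac=0.5, reciprocal=True)
print(loose)
print(strict)
```

The reciprocal setting mainly matters when peak widths differ a lot between replicates. Peaks kept from different replicates can still overlap each other, so in practice they would need merging afterwards (e.g. with bedtools merge).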