Dear deeptools developer,
Thanks for developing such a great tool! I recently ran a CUT&Tag experiment (2 x 150 bp) and want to use deepTools plots to show the signal distribution of different histone marks around TSS regions. However, I get different results depending on whether or not I use "extendReads" in bamCoverage with BPM normalization. Please see the plot below:
The plot shows that the male samples ("m" in the sample name) have a higher read intensity than the female samples ("f" in the sample name) when I use extendReads, but the direction is reversed when I turn extendReads off. I thought this might be a quantification issue in which both mates of a properly paired read were counted, so that each pair was quantified twice. So I added "--samFlagInclude 66" to bamCoverage, with extendReads still activated, to count each pair only once. However, the result was similar to the one with extendReads and without the samFlagInclude parameter (see below):
Why does "extendReads" change the results so much? This matters because the two results lead to different biological interpretations.
deepTools version: 3.5.1; Python version: 3.9.13
Here are the commands:
###with extendReads activated
bamCoverage --bam <sample>.bam -o <sample>.extent.bw --ignoreDuplicates --normalizeUsing BPM --blackListFileName ../blacklist.merge.bed --binSize 10 --ignoreForNormalization chrX chrY chrM --extendReads
###with extendReads deactivated
bamCoverage --bam <sample>.bam -o <sample>.extent.bw --ignoreDuplicates --normalizeUsing BPM --blackListFileName ../blacklist.merge.bed --binSize 10 --ignoreForNormalization chrX chrY chrM
###with samFlagInclude 66
bamCoverage --bam <sample>.bam -o <sample>.extent_samflag66.bw --ignoreDuplicates --normalizeUsing BPM --blackListFileName ../blacklist.merge.bed --binSize 10 --ignoreForNormalization chrX chrY chrM --extendReads --samFlagInclude 66
###computeMatrix command
computeMatrix reference-point --referencePoint TSS -p 20 -b 2500 -a 2500 -R <region>.bed -S <sample>.bw --missingDataAsZero -o <output>.gz
I look forward to your reply! Many thanks in advance!!
Joe Wang
This is expected. CUT&RUN and CUT&Tag do not produce a uniform fragment size, so extending reads to the full fragments (that is what we are talking about, right?) makes sense. Also, please use informative titles. "help" is not one: this is a forum for bioinformatics questions, so of course you will get help.
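To illustrate why the fragment-size distribution matters here, below is a minimal toy sketch in Python. It is not how deepTools computes coverage. It assumes that without extension each mate contributes only its own aligned bases, that with extension each pair is counted once over the whole fragment, and that both samples have the same library size. The fragment lengths, the library size, and the names `covered_bases` and `tss_signal` are all invented for illustration.

```python
READ_LEN = 150  # 2 x 150 bp reads, as in the question


def covered_bases(fragment_len, extend, read_len=READ_LEN):
    """Bases of coverage contributed by one read pair (toy model)."""
    if extend:
        # Assumption: the pair is counted once, over the whole fragment.
        return fragment_len
    # Each mate covers only its own aligned bases. For fragments shorter
    # than 2 * read_len the mates overlap and the overlap counts twice.
    return 2 * min(read_len, fragment_len)


def tss_signal(fragment_lengths, extend, library_pairs=100):
    """Total coverage at the TSS divided by a hypothetical library size."""
    total = sum(covered_bases(n, extend) for n in fragment_lengths)
    return total / library_pairs


# Hypothetical fragment lengths (bp) overlapping a TSS region.
sample_m = [400, 450, 500]            # fewer, longer fragments
sample_f = [100, 120, 140, 120, 120]  # more, shorter fragments

for extend in (False, True):
    m = tss_signal(sample_m, extend)
    f = tss_signal(sample_f, extend)
    higher = "m" if m > f else "f"
    print(f"extend={extend}: m={m:.1f} f={f:.1f} higher={higher}")
```

With these made-up numbers the sample with more but shorter fragments looks stronger without extension, and the sample with fewer but longer fragments looks stronger with it. In this toy model, counting every pair twice would scale both samples by the same factor, so it would not change which one is higher. If the two samples really differ in fragment size, comparing their fragment-length distributions (for example with bamPEFragmentSize) should show it.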