I have used the Deeptools2 packages to produce a plot comparing profiles of histone binding relative to gene start and ends. However the first sample only reaches about 2 on the Y axis, and the second lays around 0. So I am wondering if I am using it incorrectly? In particular the use of normalisation between the merged ChIP reads that are at least 3x coverage compared to the single input sample. Does readCount normalisation normalise per million reads, for example? What is the scale of the Y-axis?
I have put my steps below in case anyone can spot the flaw. Thank you for any guidance!
1) Combined triplicate mapped reads (sorted BAM) of each sample using samtools merge.
2) Use bamCompare to subtract an input read sample (sorted BAM) from the merged reads in step 1.
bamCompare --bamfile1 merged_reps_sampleA.bam --bamfile2 input.bam --scaleFactorsMethod readCount --operation subtract --binSize 50 --blackListFileName blacklist_file.bed -p 8 --effectiveGenomeSize 2900000000 --ignoreDuplicates -o subtracted_input_sampleA.bw
3) Use computeMatrix to create a matrix of reads relative to canonical gene annotation.
computeMatrix scale-regions -S subtracted_input_sampleA.bw subtracted_input_sampleB.bw -R knownCanonical.bed --samplesLabel SampleA SampleB -p 8 -a 5000 -b 5000 -o SampleA_vs_SampleB_matrix.gz
4) Use plotProfile to produce the plot that compares the profile of both samples.
plotProfile --matrixFile SampleA_vs_SampleB_matrix.gz --plotType lines --numPlotsPerRow 1 --perGroup --SampleA_vs_SampleB_matrix_geneProfile_TSS_TES_5k_matrix.tab -o SampleA_vs_SampleB_matrix.gz_geneProfile_TSS_TES_5k.png --regionsLabel "HG19 UCSC canonical genes"
Resulting plot:
Thank you Devon and for pointing out that I should be referring to it as Deeptools2, old habits die hard. Sample B was expected to be lower. Just to clarify, the Y-axis is explained as "the depth normalized difference in coverage", which is the relative difference in coverage after normalisation? Is my use of bamCompare correct for what I am trying to achieve, i.e. showing the difference in histone binding profiles? Does "subtract" reduce the number of normalised ChIP reads (per bin) by the number of normalised input reads (per bin)?
Yes to all of the questions in your comment.