I've generated some ATAC-seq data and wanted to graph it. Its singly mapped and deduplicated already so I ran samtools mpileup on it. Then I divided each position by the number of reads in millions to get the CPM of each position. I've then tried to graph a larger region, using pyGenomeTracks and bigwig files generated from bamCoverage in deeptools v3.4.3. I ran this command
bamCoverage --normalizeUsing CPM -p 32 -bs 1 -b 01_aligned.dedup.single.sortedbycoord.bam -o 01_aligned.dedup.single.cox2.bw
and then I ran with exact scaling and extended reads
bamCoverage --normalizeUsing CPM -e --exactScaling -p 32 -bs 1 -b 01_aligned.dedup.single.sortedbycoord.bam -o coverage_bw_exact/sample_01_exactscaling_coverage.bw
Both bw, when plotted with pyGenomeTracks gave me peaks higher than my CPM normalized data from mpileup (which I'm inclined to believe due to the simplicity of the normalization of CPM).
Is there a more detailed manual on how deeptools works? I'm not sure what's not working here, while I have a suspicion it is bamCoverage not working as expected I'm not sure. I'd really appreciate some help. Thanks.
Understood, but the regions we're looking at look properly normalized with CPM. I've done normalization with 1x coverage (RPGC) and CPM and noticed very similar trends in these areas. I'm mostly trying to reconcile the different scaling between manual CPM through samtools and bamCoverage, despite CPM not being the most optimal way to normalize all data from ATAC-seq.