Hi everyone, first of all, I am a biologist, so sorry for the (potentially) very basic question.
We performed ATAC-seq (treatment vs. control, in duplicates) and processed the data through the ENCODE pipeline. The QC (especially the TSS enrichment) is excellent for both control samples, whereas both treatment samples are borderline (but not bad enough to discard them). Visual inspection of the peaks also looks okay (very similar peak pattern between control and treatment). However, we noticed that the peaks in the treatment samples are overall smaller compared to the control (bigwig, TPM normalized), so we concluded that we have a potential enrichment bias. Since we do not knock down any global factor, it is most likely a technical artefact. We went on to identify differentially accessible regions using the csaw package ("trended" normalization). The resulting regions are highly enriched for binding motifs of biologically meaningful transcription factors (identified with HOMER) and are also close to relevant genes.
So far so good... however, now I have problems visualising those regions. Since the TPM normalization of the bigwig files does not account for the global enrichment bias (correct?), the plotted heatmaps of the differentially accessible regions do not reflect the result from csaw (control always has a higher signal compared to treatment). Please correct me if I am wrong, but in this case we could perform a quantile normalization instead of the TPM normalization, right? Is there an easy way to do this with either bam or bigwig files? Otherwise I could get a count matrix from deepTools and feed it into the cqn package. However, the cqn function requires a covariate, and to be honest I am not sure what this would be in my data (sequencing depth?). After the normalization I could convert the resulting file back into a bigwig file and use it for plotting.
Thanks for any suggestions.
Quantile normalization was by far the best solution for me to deal with the visualization problem. I also tried other approaches, such as the suggested TMM normalization, but they did not remove the bias in my plots.
To apply the quantile normalization, briefly: I first split the genome into bins of 100 bp with bedtools makewindows. Then I counted the reads in each bin with featureCounts. Next, I quantile normalized the read counts with the normalize.quantiles() function from the R preprocessCore package. Finally, I created a bedgraph from the normalized counts per bin and converted it to bigwig.
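For anyone who wants to see what the quantile normalization step does to the bin counts, here is a minimal sketch in Python with NumPy. It is only an illustration, not the code I ran: the function names quantile_normalize and bedgraph_lines are made up for this example, the input is assumed to be a bins x samples count matrix (for example the featureCounts table), and ties are handled by averaging, which may not match the output of normalize.quantiles() from preprocessCore exactly.

```python
import numpy as np


def quantile_normalize(counts):
    """Quantile normalize a (bins x samples) matrix column by column.

    Hypothetical helper for illustration. Every sample (column) is forced
    onto the same reference distribution: the mean of the sorted columns.
    Tied values within a sample receive the mean of the reference values
    they span (an assumption made here, so that bins with identical counts
    stay identical after normalization).
    """
    counts = np.asarray(counts, dtype=float)
    n_bins, n_samples = counts.shape

    # Sort each sample independently and average across samples per rank.
    order = np.argsort(counts, axis=0, kind="stable")
    sorted_counts = np.take_along_axis(counts, order, axis=0)
    reference = sorted_counts.mean(axis=1)

    normalized = np.empty_like(counts)
    for j in range(n_samples):
        # Put the reference value of each rank back at the bin holding that rank.
        by_rank = np.empty(n_bins)
        by_rank[order[:, j]] = reference

        # Average the assigned values over bins with tied raw counts.
        _, inverse = np.unique(counts[:, j], return_inverse=True)
        tie_sums = np.bincount(inverse, weights=by_rank)
        tie_sizes = np.bincount(inverse)
        normalized[:, j] = (tie_sums / tie_sizes)[inverse]
    return normalized


def bedgraph_lines(chroms, starts, ends, values):
    """Yield tab-separated bedGraph lines (chrom, start, end, value) for one sample."""
    for chrom, start, end, value in zip(chroms, starts, ends, values):
        yield f"{chrom}\t{start}\t{end}\t{value:.4f}"
```

Each column of the normalized matrix can then be written out with bedgraph_lines, one file per sample, and converted to bigwig (for example with the UCSC bedGraphToBigWig tool, which expects a sorted bedgraph and a chromosome sizes file).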
NOTE: when applying quantile normalization you should be aware of whether you expect global changes or targeted changes. Quantile normalization forces all samples onto the same overall distribution, so if you expect global changes you should not use it to correct the data (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0679-0).