Hi everyone,
I have been processing paired-end 150 bp ATAC-seq data but am failing to get peaks at known promoters; the data just looks like noise throughout.
I started with QC, where reads showed poly(G) towards their ends; this is known to happen on the NovaSeq (two-color chemistry). I trimmed the reads with cutadapt to remove adapter sequences and poly(G), then aligned them with bowtie2. The alignment rate ranged from ~98 to 99% for all samples, and mapped reads varied from 70 to 150 million. Mitochondrial content ranged from 10 to 50%.
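For anyone wanting to reproduce the mitochondrial figure, here is a minimal sketch of one way to estimate it. It assumes the per-chromosome counts come from `samtools idxstats` (tab-separated: name, length, mapped, unmapped) and that the mitochondrial contig is called `chrM` or `MT`. The function name `mito_fraction` and the contig names are assumptions for illustration, not something taken from the pipeline above.

```python
def mito_fraction(idxstats_text, mito_names=("chrM", "MT")):
    """Return the fraction of mapped reads on the mitochondrial contig.

    idxstats_text: text in `samtools idxstats` layout, i.e. one line per
    reference with tab-separated name, length, mapped count, unmapped count.
    mito_names: assumed names of the mitochondrial contig (hypothetical default).
    """
    total_mapped = 0
    mito_mapped = 0
    for line in idxstats_text.strip().splitlines():
        name, _length, mapped, _unmapped = line.split("\t")
        mapped = int(mapped)
        total_mapped += mapped
        if name in mito_names:
            mito_mapped += mapped
    # Guard against an empty or fully unmapped file
    return mito_mapped / total_mapped if total_mapped else 0.0


# Toy input only; these numbers are made up for illustration
example = "chr1\t248956422\t700\t0\nchrM\t16569\t300\t0\n*\t0\t0\t50"
print(mito_fraction(example))
```

The denominator here is mapped reads only, so the result is the share of aligned reads that are mitochondrial, which may differ slightly from a figure based on all sequenced reads.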
Below are the Bioanalyzer trace for one sample and the fragment size distribution for the same sample.
IGV view showing BAM coverage for different samples at the GAPDH promoter:
Peak calling gave roughly 300 to 2000 peaks, which look like noise when I cross-check some of them against the bigwig tracks in a genome browser. DiffBind analysis found no significantly different peaks between the two sample groups.
Has anyone ever come across such a problem? What could have gone wrong in the experiment or data analysis part?
Yes, that is definitely a lot of noise and very few peaks for such an experiment. Which cell type is this, and which lab protocol did you use?
We used U2OS cells and followed the protocol given here.
We use the OmniATAC protocol, which is an enhanced version. The cell type usually does not matter. Based on many years with this assay, I can say that if it fails, then usually something was done that was not in the protocol. I cannot really help any further remotely, unless you have specific questions.
Do you think something may have gone wrong in the sequencing step, given that the Bioanalyzer run seems to look okay? Or should the trace have looked better?
You have reads, so sequencing is obviously fine. This is how the banding should look: https://kb.10xgenomics.com/hc/article_attachments/360028023452/Screen_Shot_2019-05-08_at_1.55.46_PM.png
You essentially have little to no banding: there is an excess of short fragments but no real histone (nucleosome) banding. Why that is, I cannot tell.
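To put a rough number on the banding rather than judging it by eye, something like the sketch below could be used. It assumes you have already extracted fragment lengths (for example the absolute template length of properly paired reads) into a list of integers. The size windows are assumptions based on commonly quoted approximate ranges, not values from this thread, and `banding_summary` is a hypothetical helper name.

```python
from collections import Counter

# Assumed, approximate size windows in bp (inclusive); adjust to taste
BINS = {
    "nucleosome_free": (0, 100),
    "mononucleosome": (180, 247),
    "dinucleosome": (315, 473),
}


def banding_summary(fragment_lengths):
    """Return the fraction of fragments falling into each size window.

    Fragments outside every window still count towards the total, so the
    fractions need not sum to 1.
    """
    counts = Counter()
    n_fragments = 0
    for length in fragment_lengths:
        n_fragments += 1
        for label, (low, high) in BINS.items():
            if low <= length <= high:
                counts[label] += 1
                break
    return {
        label: (counts[label] / n_fragments if n_fragments else 0.0)
        for label in BINS
    }


# Toy lengths only, made up for illustration
print(banding_summary([50, 60, 200, 400, 150]))
```

A library dominated by the shortest window with almost nothing in the mono- and dinucleosome windows would match the "excess of short fragments, no banding" picture described above, but what ratio counts as acceptable is not something this sketch can decide.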