Hi all,
I have a few questions about ATAC-seq data analysis. My lab is using ATAC-seq to identify accessible chromatin regions, to test for differential chromatin accessibility between disease and control states, and to check for TF binding in open chromatin regions (we usually do motif analysis for this). We currently do not size-select our data, and we do paired-end sequencing.
First, in order to do motif analysis, should we remove fragments that correspond to nucleosomal reads? Since TFs usually bind in nucleosome-free regions, it doesn't make sense to me to keep the larger, nucleosomal fragments. However, I have seen many papers that do not do any sort of size selection (experimental or computational), and I am wondering if I am missing something.
Second, is it necessary to do paired-end sequencing for ATAC-seq if we do size selection during library prep? I have noticed that almost everyone does paired-end sequencing for ATAC-seq, but I'm not sure why this is the case.
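To make "computational size selection" concrete: with paired-end data both ends of each fragment are known, so fragment length is just end minus start and can be filtered on directly. Below is a minimal sketch, assuming fragments have already been parsed into (chrom, start, end) records. The names `Fragment` and `split_by_size` are hypothetical, and the 100 bp cutoff is an illustrative assumption, not a value from this thread; it should be chosen from your own fragment-size distribution.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Fragment(NamedTuple):
    """One paired-end fragment (hypothetical record type for illustration)."""
    chrom: str
    start: int  # 0-based start of the leftmost mate
    end: int    # end of the rightmost mate (exclusive)


def fragment_length(frag: Fragment) -> int:
    # Only computable because paired-end sequencing gives both fragment ends.
    return frag.end - frag.start


def split_by_size(
    fragments: Iterable[Fragment], nfr_max: int = 100
) -> Tuple[List[Fragment], List[Fragment]]:
    """Split fragments into short (< nfr_max bp) and longer ones.

    nfr_max is an assumed, illustrative cutoff; adjust it to your data.
    """
    short: List[Fragment] = []
    longer: List[Fragment] = []
    for frag in fragments:
        if fragment_length(frag) < nfr_max:
            short.append(frag)
        else:
            longer.append(frag)
    return short, longer


# Toy input: fragment lengths are 60, 200 and 99 bp.
example = [
    Fragment("chr1", 1000, 1060),
    Fragment("chr1", 5000, 5200),
    Fragment("chr2", 300, 399),
]
short_frags, longer_frags = split_by_size(example)
print(len(short_frags), len(longer_frags))
```

With single-end reads only one end of each fragment is observed, so this kind of per-fragment length filter is not possible after sequencing.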
Just because an area is not defined as a nucleosome-free region doesn't mean it isn't one. There may be TFs binding there that make it look like a nucleosome-occupied region, so you would lose it in your motif analyses.
That's a fair point, Kenneth. Do you notice that in your data?
Well, I'm just going off what I remember reading in the NucleoATAC GitHub issues pages. Somewhere in there is a warning that just because a region is not called as an "NFR" does not mean it is not one. It just means there wasn't the evidence required (length, flanking nucleosomes, etc.).
To be honest, I'm actually going to take a look at Devon's answer below and his suggestion about footprinting.