Hello,
I am analysing ATAC-seq data to identify differentially accessible regions between two conditions. My workflow calls peaks with MACS2 and then tests those peaks for differential accessibility with DESeq2. This gave good results, with the expected genes changing accessibility.
However, after reading more about ATAC-seq I am unsure whether my analysis is correct. Since ATAC-seq produces both nucleosome-free region (NFR) and nucleosomal fragments, some workflows recommend filtering the BAMs for NFR fragments (<100 bp) and using only those for downstream analysis. Other workflows use all reads without filtering, which is what I did.
My question is whether my current analysis makes sense, or whether I should repeat it after filtering my BAMs to fragments <100 bp. I am also concerned that filtering would discard more than half of my data. What is the best practice here? And if I do filter, should I use the filtered BAMs for both peak calling and DESeq2, or for only one of these steps?
Thanks
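For anyone who wants to try the size selection described in the question, here is a minimal sketch of the filter logic, not a recommended pipeline. It assumes paired-end data where the fragment length is the absolute value of the SAM TLEN field (column 9). The function names `keep_fragment` and `filter_sam_lines`, the default bounds, and the toy records are all illustrative assumptions.

```python
def keep_fragment(tlen, min_len=1, max_len=99):
    """Return True if the fragment length |TLEN| lies in [min_len, max_len].

    The defaults keep fragments shorter than 100 bp; a TLEN of 0 (unknown) is dropped.
    """
    size = abs(tlen)  # TLEN is negative for the downstream mate
    return min_len <= size <= max_len


def filter_sam_lines(lines, min_len=1, max_len=99):
    """Yield header lines unchanged, plus alignments whose |TLEN| passes the filter."""
    for line in lines:
        if line.startswith("@"):
            yield line  # keep the SAM header intact
            continue
        fields = line.rstrip("\n").split("\t")
        tlen = int(fields[8])  # column 9 of a SAM record
        if keep_fragment(tlen, min_len, max_len):
            yield line


# Toy SAM text (made-up records): one 80 bp fragment and one 180 bp fragment.
toy_sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t100\t60\t50M\t=\t130\t80\t*\t*",
    "r1\t147\tchr1\t130\t60\t50M\t=\t100\t-80\t*\t*",
    "r2\t99\tchr1\t500\t60\t50M\t=\t630\t180\t*\t*",
    "r2\t147\tchr1\t630\t60\t50M\t=\t500\t-180\t*\t*",
]
kept = list(filter_sam_lines(toy_sam))
```

On real data the same logic could be applied to the text stream from `samtools view -h`. Whether to use the filtered or unfiltered BAMs for counting is the open question of this thread, and the sketch takes no position on it.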
Thanks. I want to do a general comparison between two mouse cell lines, one of which is a control while the other lacks a chromatin remodeling protein. Can you explain why you think MACS2 peak calling and differential analysis will not be useful for a general analysis? I have seen some papers using MACS2 broad peaks for such analysis.
I am also now looking at an ATAC-seq-specific peak caller (HMMRATAC), which takes NFRs and nucleosomes into account, but the same confusion remains: once I have the peaks, should I use the unfiltered BAMs for the DESeq2 differential analysis? I will try both and see how different the results are, but I am not clear about this conceptually.
MACS2, even in broad mode, will miss larger-scale remodeling. Its broad mode works by merging relatively close called peaks together, which will typically fail for large changes since there will be few if any peaks to merge.
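A toy illustration of that point, and explicitly not MACS2's actual algorithm: if broad regions are built by joining called peaks that lie within some maximum gap, then a large domain containing only a couple of scattered peaks stays fragmented. The function `merge_close_peaks`, the `max_gap` value, and the coordinates are hypothetical.

```python
def merge_close_peaks(peaks, max_gap):
    """Merge (start, end) intervals whose gap to the previous interval is <= max_gap."""
    merged = []
    for start, end in sorted(peaks):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous region: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            # Too far away: start a new region.
            merged.append((start, end))
    return merged


# Several nearby peaks collapse into one broad region...
dense = [(1000, 1200), (1400, 1600), (1900, 2100)]
dense_merged = merge_close_peaks(dense, max_gap=500)

# ...but two isolated peaks inside a large remodeled domain do not.
sparse = [(1000, 1200), (50000, 50200)]
sparse_merged = merge_close_peaks(sparse, max_gap=500)
```

In this toy case the dense peaks become a single region, while the sparse ones remain two separate calls, so the large-scale change is never represented as one region.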
Hi Devon, sorry for reviving this old question, but regarding this issue: 1) Most workflows/papers I've seen filter for <100 bp. Do you have any comments or resources that explain filtering at 120 bp? (I'd like to raise the cutoff to 120 bp so that I keep more reads.) 2) I've also seen some workflows set a minimum of ~40 bp for the NFR fragments (filtering [40-100], for example). Again, do you know of any resources explaining this, or could you comment?
Thanks a lot for your help
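One way to see what the candidate cutoffs mentioned above would cost in reads is to tabulate the fraction of fragments each window keeps. This is only a sketch: the fragment lengths are made-up toy values, and `fraction_kept` and the window labels are hypothetical names. With real data the lengths would come from the fragment sizes in the BAM.

```python
def fraction_kept(lengths, min_len, max_len):
    """Fraction of fragment lengths falling in the inclusive range [min_len, max_len]."""
    if not lengths:
        return 0.0
    kept = sum(1 for n in lengths if min_len <= n <= max_len)
    return kept / len(lengths)


# Toy fragment lengths in bp (assumed values, not real data).
toy_lengths = [35, 45, 60, 75, 90, 105, 115, 150, 190, 200]

# Candidate NFR windows from the discussion, as inclusive (min, max) bounds.
windows = {
    "<100": (1, 99),
    "<120": (1, 119),
    "40-100": (40, 100),
}

summary = {
    name: fraction_kept(toy_lengths, low, high)
    for name, (low, high) in windows.items()
}
```

Comparing these fractions on your own samples shows how many reads each choice retains, but it does not say which window best separates NFR from nucleosomal fragments. That depends on the fragment size distribution of the data.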
Thanks for the reply. It's actually mouse, but the data I'm handling is a bit noisy so that's why I was looking for some "external" reference (see below). OK also to your second point.