Wondering about opinions on duplicate removal for things like footprinting in ATAC-seq data. In my experience Picard is extremely aggressive here and removes far more duplicates than samtools rmdup. Yet MACS2 peak calling on the small number of remaining reads (somewhere below 1M) still gives hundreds of thousands of significant peaks. Confusing.
When I analyse ATAC-seq data I normally remove duplicates using Picard (remember to set OPTICAL_DUPLICATE_PIXEL_DISTANCE to suit the flowcell). If you have fewer than 1M reads I'd be skeptical about the peaks MACS2 is calling. Without seeing the data, my guess is that the signal is very striated across the genome and MACS2 is calling those tiny spikes as peaks.
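In case it helps, here is a rough sketch of how the Picard call could be put together, written as a small Python helper that only builds the command line and does not run it. The file names, the `picard` wrapper executable (as installed by e.g. conda) and the helper itself are assumptions for illustration. The pixel distances are the ones I understand the Picard documentation to suggest (100 as the default, 2500 for patterned flowcells), so check them against your own instrument.

```python
from typing import List

# Assumed values: Picard's documented default is 100, and its docs suggest
# 2500 for patterned flowcells. Verify against your own sequencer.
PIXEL_DISTANCE = {"unpatterned": 100, "patterned": 2500}


def mark_duplicates_cmd(bam_in: str, bam_out: str, metrics: str,
                        flowcell: str = "unpatterned",
                        remove: bool = True) -> List[str]:
    """Build (but do not run) a Picard MarkDuplicates command line.

    bam_in is expected to be a sorted BAM; all paths are hypothetical.
    """
    return [
        "picard", "MarkDuplicates",  # assumes a `picard` wrapper on PATH
        f"I={bam_in}",
        f"O={bam_out}",
        f"M={metrics}",              # duplication metrics report
        f"REMOVE_DUPLICATES={'true' if remove else 'false'}",
        f"OPTICAL_DUPLICATE_PIXEL_DISTANCE={PIXEL_DISTANCE[flowcell]}",
    ]


if __name__ == "__main__":
    cmd = mark_duplicates_cmd("sample.sorted.bam", "sample.dedup.bam",
                              "sample.dup_metrics.txt", flowcell="patterned")
    print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` once the paths are real. The metrics file is worth a look too, since it reports how many of the duplicates were flagged as optical rather than PCR.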
As I recall, we've normally removed duplicates when working with ATAC-seq data. Regarding samtools vs. Picard, I think even the samtools authors would say "use Picard" for paired-end data.