I have 10 samples with 150x2 PE reads from NovaSeq 6000. I've been following this processing pipeline except that I remove blacklist regions and shift the reads with deeptools when the data is still in BAM format:
alignmentSieve --ATACshift -b ${bam} --blackListFileName mm10-blacklist.v2.bed -o shifted_${bam}
I plan to do differential accessibility and motif enrichment analysis. And used Homer's makeTagDirectories command with three different settings. For downstream commands I need to provide fragment length estimates. I have estimates from Homer and MACS2 but don't know how to interpret them. I'm also surprised that from sample to sample they seem to be very different e.g. 257 vs. 160 in the first column for samples 1 and 2. The samples were all prepared the same way so shouldn't the fragment sizes be close to the same? Also how can I determine which Homer parameters I should be using, based on this answer I thought -sspe -flip. Is this correct?
Here are the results for the first 4 samples
Sample | Homer -sspe -flip | Homer -sspe | Homer | MACS2
1 284 284 257 256
2 156 156 160 156
3 290 290 297 285
4 149 149 202 150
Homer command:
makeTagDirectory ./$name ${bam} -fragLength pe -format sam -keepAll <-sspe> <-flip>
For MACS2 I first converted the BAMs to BEDPEs
macs2 callpeak -t ${bedpe} -f BEDPE -n $name -g mm -B \
--broad --broad-cutoff 0.05 --keep-dup all --outdir ./macs2/