Hi there, I have seen the following difference in samtools mpileup output which I cannot explain.
First, with only -B specified to disable BAQ calculation, it finds two reads with insertions. I am not sure if this matters, but if you look closely, the two inserted sequences are NOT exactly the same.
$ samtools mpileup -B -f ${REF} ${BAM_FILE} | grep 175200931
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
chr1 175200931 a 20 ....,...+30CATGAATATATACACACGTATATATACATA..+30CATGTATATATACACACGTATATATACATA.......... Jm_sEHIDFsEHIdJ>JLJD
Second, when I rerun the command with -BQ0 specified, a third insertion appears.
$ samtools mpileup -BQ0 -f ${REF} ${BAM_FILE} | grep 175200931
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
chr1 175200931 a 27 ...,.,...+30CATGAATATATACACACGTATATATACATA..+30CATGTATATATACACACGTATATATACATA...,+30catgtatatatacacacgtatatatacata......,,,,.. Jm_!sEHIDFsEHI!dJ>JLJ!!!!,D
My understanding of -Q0 is that it skips bases with BAQ smaller than 0. In other words, since BAQ cannot be smaller than 0, it keeps all bases regardless of their BAQ.
However, if -B is already specified to disable BAQ calculation, I don't understand why -Q0 would make a difference. Where does the BAQ data come from then?
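In case it helps, here is a quick sketch that compares the quality columns of the two outputs. It assumes the qualities are Phred+33 encoded (an assumption on my side, I have not verified it for this BAM), and the helper names `phred33` and `extra_qualities` are made up for illustration. Under that assumption, the seven extra bases in the -BQ0 output all decode to quality 0 or 11, which is below the default of 13 that the mpileup help lists for -Q. I am not sure this is the whole explanation.

```python
# Quality columns copied verbatim from the two mpileup outputs in this post.
QUAL_B = "Jm_sEHIDFsEHIdJ>JLJD"            # run with -B
QUAL_BQ0 = "Jm_!sEHIDFsEHI!dJ>JLJ!!!!,D"   # run with -BQ0


def phred33(quals):
    """Decode a quality string, ASSUMING Phred+33 (Sanger) encoding."""
    return [ord(c) - 33 for c in quals]


def extra_qualities(short, long_):
    """Decoded qualities of characters in long_ that are left over after
    greedily matching short as a subsequence of long_.

    This is a simple heuristic for lining up the two columns; it assumes
    the -B column is the -BQ0 column with some entries removed.
    """
    extra, i = [], 0
    for c in long_:
        if i < len(short) and c == short[i]:
            i += 1                      # present in both outputs
        else:
            extra.append(ord(c) - 33)   # only present with -Q0
    return extra


print(min(phred33(QUAL_B)))               # 29
print(extra_qualities(QUAL_B, QUAL_BQ0))  # [0, 0, 0, 0, 0, 0, 11]
```

So the lowest quality kept with -B alone is 29, while every base that only shows up with -Q0 has a decoded quality below 13.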