Hello there
I am running samtools mpileup and get an output for the requested positions. My problem is that if there are multiple alternative bases, it's not possible to extract the alternative base count for each.
The command line that I use:
samtools mpileup -u -Q 0 -t DP4 -t DV -t DP -f GRCh37.fa --positions positions.bed SAMPLE_1-ready.bam SAMPLE_2-ready.bam SAMPLE_3-ready.bam | bcftools view > SAMPLE-variant-sites.vcf
An example output:
12 25398284 . C G,T,<X> 0 . DP=612;I16=351,184,53,24,19737,1.05346e+06,2839,140417,32100,1.926e+06,4620,277200,9893,221105,1557,36041;QS=2.58612,0.40494,0.0089373,0;VDB=0.0247352;SGB=4.66664;RPB=0.97648;MQB=1;MQSB=1;BQB=0.965306;MQ0F=0 PL:DP:DV:DP4 0,255,255,255,255,255,255,255,255,255:208:3:133,72,2,1 255,0,255,255,255,255,255,255,255,255:186:74:74,38,51,23 0,255,255,255,255,255,255,255,255,255:218:0:144,74,0,0
As you may see, there are G as well as T alternative reads . And if we look at the data from SAMPLE_1, we can see that there is only one DP4 field: 133,72,2,1
. So we know that three bases are either G or T in that position but we do not know how bases are G and how many are T
How would I achieve this??
Great! You saved my day. I was still using v1.2 and could not figure out how to get this info.
By the way - did you ever notice problems for calling the correct AD for deletions? I experience many cases where the AD is equal to DP meanwhile in truth the deletion is present in only a fraction of the reads. It puzzles me as insertions seem to be called correctly and other deletions as well.