I use samtools mpileup for SNP calling, BEAGLE for haplotyping, and GATK BeagleOutputToVCF to convert the BEAGLE output back to VCF format. Everything works fine, but I am missing one tag.
I want to add the DP tag to the genotype field of the vcf file.
Is there an option in samtools mpileup, BEAGLE or GATK BeagleOutputToVCF which can add this information?
Do I have to use the tool mentioned in "incorporating raw read coverage per sample in merged vcf", which requires a lot of I/O and adds an extra step to my pipeline?
Or is this information somewhere in my vcf file?
My vcf file looks like this:
SL2.40ch12 17 . T C 69.50 . AC=1;AC1=1;AF=0.167;AF1=0.1766;AN=6;DP=39;DP4=10,7,2,3;FQ=70.3;MQ=46;NumGenotypesChanged=0;PV4=0.62,0.24,0.034,1;R2=0.922;RPB=5.484225e-01;VDB=3.008871e-02 GT:GQ:OG:PL 0|0:13:.:0,9,90 0|0:60:.:0,39,255 0|1:21:.:104,0,11
If your version of samtools is new enough (the option is present in 0.1.18 at least), you can pass '-D' to mpileup to get the per-sample depth of high-quality reads (DP in the genotype field) and of high-quality variant reads (DV in the genotype field). This is different from the depth across all samples, which is given by DP in the INFO field.
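Once DP is in the genotype field, reading it back per sample is straightforward. Here is a minimal Python sketch of how that could look. The function name per_sample_field and the record below are made up for illustration (the depth values are not real data), and it assumes a tab-delimited VCF data line with FORMAT in column 9 and the samples after it.

```python
def per_sample_field(vcf_line, key):
    """Return the value of one FORMAT key for every sample.

    Returns None when the key is not listed in the FORMAT column.
    """
    columns = vcf_line.rstrip("\n").split("\t")
    format_keys = columns[8].split(":")   # column 9 is FORMAT
    if key not in format_keys:
        return None
    index = format_keys.index(key)
    values = []
    for sample in columns[9:]:            # sample columns start at column 10
        fields = sample.split(":")
        # A sample may omit trailing fields; report those as missing (".")
        values.append(fields[index] if index < len(fields) else ".")
    return values


# Hypothetical two-sample record with DP in the genotype field
record = "\t".join([
    "chr1", "100", ".", "A", "G", "50", ".", "DP=30",
    "GT:DP", "0/0:12", "0/1:18",
])

print(per_sample_field(record, "DP"))
```

The values come back as strings in sample order, so they can be matched against the sample names from the #CHROM header line.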
Thanks! This was exactly what I was looking for. I don't know why I couldn't find this option on my own. I read the manual again and saw that you were right!
Your example already contains a DP tag: DP=39 in the INFO field.
That is the coverage summed over all samples; I need the coverage per sample.
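To make the difference concrete, here is a small Python sketch that splits the record from the question into its INFO and FORMAT parts. The helper name parse_info is made up for illustration, and the record is rebuilt with tabs on the assumption that the original file is tab-delimited.

```python
def parse_info(info):
    """Turn an INFO string such as 'AC=1;DP=39' into a dict."""
    result = {}
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        # Entries without '=' are flags; store them as True
        result[key] = value if sep else True
    return result


# The record from the question, rebuilt as a tab-delimited line
record = "\t".join([
    "SL2.40ch12", "17", ".", "T", "C", "69.50", ".",
    "AC=1;AC1=1;AF=0.167;AF1=0.1766;AN=6;DP=39;DP4=10,7,2,3;FQ=70.3;"
    "MQ=46;NumGenotypesChanged=0;PV4=0.62,0.24,0.034,1;R2=0.922;"
    "RPB=5.484225e-01;VDB=3.008871e-02",
    "GT:GQ:OG:PL",
    "0|0:13:.:0,9,90", "0|0:60:.:0,39,255", "0|1:21:.:104,0,11",
])

columns = record.split("\t")
info = parse_info(columns[7])            # column 8 is INFO
format_keys = columns[8].split(":")      # column 9 is FORMAT

total_depth = int(info["DP"])            # depth summed over all samples
has_per_sample_depth = "DP" in format_keys

print(total_depth, has_per_sample_depth)
```

For this record the FORMAT column only lists GT, GQ, OG and PL, so the per-sample depth is not in the file.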