Hello!
I'm using this command line:
samtools mpileup SAMPLE.bam -f GENOME.fasta -B -d 1000 -R -Q 20 -l BEDFILE.bed -t ADF,ADR,SP,INFO/AD,INFO/ADF,INFO/ADR -v > OUTPUT.vcf
and the resulting VCF lines look pretty much like this:
chr2 70000005 . T G,<*> 0 . DP=64;ADF=29,0,0;ADR=23,3,0;AD=52,3,0;I16=29,23,0,3,1942,74680,82,2354,3060,183600,180,10800,1072,25682,60,1350;QS=0.958481,0.041519,0;VDB=0.148533;SGB=-0.511536;RPB=0.289223;MQB=0.998456;MQSB=0.970565;BQB=0.0683671;MQ0F=0.03125 PL:SP:ADF:ADR 0,88,255,157,255,255:10:29,0,0:23,3,0
which is different from a "typical" VCF line (example taken from the internet):
chr10 3360 . T C 103 . DP=31;AF1=0.5;CI95=0.5,0.5;DP4=6,3,13,9;MQ=20;FQ=27;PV4=1,0.13,1,0.015 GT:PL:GQ 0/1:133,0,54:57
In particular I was interested in the AF1 and DP4 information which is not provided.
Any ideas?
Hi!
Thanks a lot, unfortunately it still doesn't provide such information. Is it possible that it is because I generated a VCF with samtools mpileup instead of a BCF like you did? That doesn't actually make sense to me, but it seems to be the only difference in the pipeline.
For DNA-seq, I do the alignment with bwa and then it's just a series of samtools and Picard commands before I reach the final variant calling step. Which aligner did you use?
I was using .bam files from Genome In A Bottle project, i belive they are aligned with novalign. I just tried also with TCGA samples that are aligned with BWA (if I am correct) and these are the results:
chr10 47663 . C T 218 . DP=86;ADF=10,19;ADR=0,41;AD=10,60;DPR=10,60;VDB=0.923183;SGB=-0.693147;RPB=0.688282;MQB=0.284284;MQSB=0.443203;BQB=0.211651;MQ0F=0.313953;AC=2;AN=2;DP4=10,0,19,41;MQ=21 GT:PL:DP:DV:SP:DP4:ADF:ADR:AD:DPR 1/1:245,46,0:70:60:43:10,0,19,41:10,19:0,41:10,60:10,60
now I can see the DP4 value, but still AF1 is missing.
It may be related to program versions, in that case.
I use bwa-0.7.12 (
bwa mem
), samtools-1.3.1, and bcftools-1.3.1