Hello, I'm trying to call variants using samtools
and bcftools
and (for unknown reasons) I get incorrect base in the REF
column of the resulting VCF file; example:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT ../algn/algn_400K.sorted.bam
... 226300 . T C 119 . DP=465;VDB=0.0418;AF1=0.5;AC1=1;DP4=135,80,131,105;MQ=20;FQ=105;PV4=0.13,0.27,1,1 GT:PL:GQ 0/1:149,0,132:99
But when I check the above position in the reference, the actual base seems to be G
:
In [18]: from Bio import SeqIO
In [19]: [ref] = SeqIO.parse(open("MG1655-K12.fasta"), "fasta")
In [20]: ref[226300]
Out[20]: 'G'
Here's a summary of what I did to obtain the VCF:
$ samtools mpileup -uf MG1655-K12.fasta algn_400K.sorted.bam > 400K.raw.bcf
$ bcftools view -bvcg 400K.raw.bcf > 400K.vcf
Any hints on why this might happen?
Thanks, that's exactly the case!