Hi,
I'm using VariantRecalibrator from GATK. I generated my VCF files with samtools mpileup/bcftools.
When I run VariantRecalibrator with these arguments,
java -Xmx4g \
-jar GenomeAnalysisTK-3.1-1/GenomeAnalysisTK.jar \
-T VariantRecalibrator \
-R GRch38.fasta \
-input filtered_cano.vcf \
-resource:dbsnp,known=true,training=false,truth=false,prior=6.0 00-All.vcf \
-an QD \
-an HaplotypeScore \
-an MQRankSum \
-an ReadPosRankSum \
-an FS \
-an MQ \
-mode BOTH \
-recalFile cano.recal \
-tranchesFile cano.tranches \
-rscriptFile cano.plots.R
it throws this error message:
##### ERROR A USER ERROR has occurred (version 3.1-1-g07a4bf8):
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR MESSAGE: The provided VCF file is malformed at approximately line number 10161196: unparsable vcf record with allele B
Am I missing something in the arguments?
I also suspect that GATK doesn't accept VCF files generated by samtools.
Thanks!!!!
GATK can take VCF files. In fact, VCF may be the only format it accepts for recalibration. The error message clearly says that the problem is in the VCF file, not in the arguments.
Paste lines 10161195, 10161196, and 10161197 here.
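A quick way to pull those lines out of a large VCF without opening it in an editor is sed. This is only a minimal sketch: print_vcf_lines is a hypothetical helper name, and since the error message does not say which VCF it means, it may be worth running it on both the input file and the resource file.

```shell
# Hypothetical helper: print lines START..END of a (possibly very large) text file.
print_vcf_lines() {
    file=$1; start=$2; end=$3
    # -n suppresses default output, "p" prints the requested range,
    # and "q" stops reading once the last requested line is reached.
    sed -n "${start},${end}p;${end}q" "$file"
}
```

For example, print_vcf_lines 00-All.vcf 10161195 10161197 would show the reported line and its two neighbours. The line number in the error is approximate, so a slightly wider range may be needed.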
Hi,
I found the error. There was a "B" in the REF and ALT columns of my dbSNP VCF file.
Thanks
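For anyone hitting the same "unparsable vcf record with allele B" message, records like this can be located by scanning the REF and ALT columns for anything other than A, C, G, T, or N. The following is a rough sketch and not a full VCF validator: find_bad_alleles is a hypothetical helper name, it assumes an uncompressed, tab-separated VCF, and it accepts ".", "*", and symbolic alleles such as <DEL> but would wrongly flag breakend notation.

```shell
# Hypothetical helper: list records whose REF or ALT contains unexpected characters.
# Output format: file line number, then CHROM, POS, REF, ALT.
find_bad_alleles() {
    awk -F'\t' '
        /^#/ { next }                                # skip header lines
        {
            bad = ($4 !~ /^[ACGTNacgtn]+$/)          # REF must consist of plain bases
            n = split($5, alts, ",")                 # ALT may list several alleles
            for (i = 1; i <= n; i++)
                if (alts[i] !~ /^([ACGTNacgtn]+|\*|\.|<[^<>]+>)$/) bad = 1
            if (bad) print NR ": " $1 "\t" $2 "\t" $4 "\t" $5
        }
    ' "$1"
}
```

Running find_bad_alleles 00-All.vcf lists each offending record with its line number. What to do with those records (drop them, or get a cleaner dbSNP release) is a separate decision.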
Is it possible to use only dbSNP as a resource, or do I need to provide all three resources (HapMap, Omni, and dbSNP)?
Using only dbSNP will work too.
I tried with only dbSNP, but it asks for a dataset tagged training=true!
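That complaint matches how dbSNP was tagged in the original command: known=true,training=false,truth=false. As far as I know, VariantRecalibrator needs at least one resource tagged training=true and at least one tagged truth=true to build its model, and dbSNP is normally passed only as a known=true set. A small pre-flight check along these lines can catch the problem before a long run. It is only a sketch: check_vqsr_resources is a hypothetical helper and not part of GATK.

```shell
# Hypothetical pre-flight check (not part of GATK): given the arguments of a
# VariantRecalibrator command, report whether any -resource tag sets
# training=true and whether any sets truth=true.
check_vqsr_resources() {
    has_training=no; has_truth=no
    for arg in "$@"; do
        case $arg in
            -resource:*training=true*) has_training=yes ;;
        esac
        case $arg in
            -resource:*truth=true*) has_truth=yes ;;
        esac
    done
    echo "training=$has_training truth=$has_truth"
    # Succeed only if both kinds of resource are present.
    [ "$has_training" = yes ] && [ "$has_truth" = yes ]
}
```

With the resource line from the original post, check_vqsr_resources -resource:dbsnp,known=true,training=false,truth=false,prior=6.0 00-All.vcf prints "training=no truth=no" and returns a non-zero status. Adding a resource such as -resource:hapmap,known=false,training=true,truth=true,prior=15.0 followed by a HapMap sites VCF (the file name depends on your resource bundle) would satisfy both conditions.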