9.7 years ago
Floydian_slip
Hi,
I have another question related to GATK, specifically VariantRecalibrator in INDEL mode:
$ java -jar GenomeAnalysisTK.jar -T VariantRecalibrator -R Ref.fasta -input recal_snps.vcf -mode INDEL -recalFile indel.recal -tranchesFile indel.tranches -rscriptFile indel.recal.plots.R -resource:dbSNP,known=true,training=true,truth=true,prior=6.0 20M_filtered.vcf -an DP -an MQ
INFO 21:59:32,158 HelpFormatter - --------------------------------------------------------------------------------
INFO 21:59:32,164 HelpFormatter - Program Args: -T VariantRecalibrator -R /home/bioinfo/data/genomes/rice/nipponbare_v7.0/index/bwa/Os_nipponbare_v7.0_genome.fasta -input all_chr_recal_snps.vcf -mode INDEL -recalFile all_chr.raw.indel.recal -tranchesFile all_chr.raw.indel.tranches -rscriptFile all_chr.indel.recal.plots.R -resource:dbSNP,known=true,training=true,truth=true,prior=6.0 /home/bioinfo/data/genomes/rice/nipponbare_v7.0/variations/20M_filtered/20M_filtered.vcf -an DP -an MQ
INFO 21:59:32,226 HelpFormatter - Executing as bioinfo@toro on Linux 2.6.32-358.23.2.el6.x86_64 amd64; OpenJDK 64-Bit Server VM 1.7.0_51-mockbuild_2014_01_15_01_39-b00.
INFO 21:59:32,357 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 21:59:32,502 GenomeAnalysisEngine - Preparing for traversal
INFO 21:59:32,505 GenomeAnalysisEngine - Done preparing for traversal
INFO 21:59:32,505 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 21:59:32,505 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 21:59:32,506 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime
INFO 21:59:32,509 TrainingSet - Found dbSNP track: Known = true Training = true Truth = true Prior = Q6.0
INFO 22:00:02,510 ProgressMeter - Chr1:12909320 221975.0 30.0 s 2.3 m 3.4% 14.5 m 14.0 m
.....
.....
INFO 22:11:32,574 ProgressMeter - Chr12:10944016 6647515.0 12.0 m 108.0 s 95.2% 12.6 m 35.0 s
INFO 22:12:02,577 ProgressMeter - Chr12:26375047 6960773.0 12.5 m 107.0 s 99.4% 12.6 m 4.0 s
INFO 22:12:04,448 VariantDataManager - DP: mean = NaN standard deviation = NaN
INFO 22:12:16,628 GATKRunReport - Uploaded run statistics report to AWS S3
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR MESSAGE: Bad input: Values for DP annotation not detected for ANY training variant in the input callset. VariantAnnotator may be used to add these annotations. See http://gatkforums.broadinstitute.org/discussion/49/using-variant-annotator
However, I do have DP and MQ annotations for the indels in my input VCF (recal_snps.vcf). Here are some examples:
Chr1 11093 . G GACTCCCTCAGTGGTTTTGGAGGGTGGTTTCGCT 7666.65 . AC=2;AF=1.00;AN=2;DP=129;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=30.51;MQ0=0;QD=28.15 GT:AD:DP:GQ:PL 1/1:0,129:129:99:7696,526,0
Chr1 11100 . T TGTTGATCTGG 5820.65 . AC=2;AF=1.00;AN=2;DP=121;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=29.00;MQ0=0;QD=30.27 GT:AD:DP:GQ:PL 1/1:0,121:121:99:5850,391,0
Chr1 11194 . T TTTCTCC 4843.65 . AC=2;AF=1.00;AN=2;DP=101;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=37.97;MQ0=0;QD=29.72 GT:AD:DP:GQ:PL 1/1:0,100:100:99:4873,340,0
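To check this across the whole file rather than by eye, a small script along these lines could help. It is only a sketch: the function names are made up for illustration, it assumes an uncompressed, tab-delimited VCF, and it treats any ALT allele whose length differs from REF as an indel.

```python
def is_indel(ref, alt):
    """Simple test: any ALT allele with a different length than REF."""
    alleles = [a for a in alt.split(",") if a not in (".", "*")]
    return any(len(a) != len(ref) for a in alleles)


def info_keys(info):
    """Return the set of keys present in a VCF INFO column."""
    if info == ".":
        return set()
    return {field.split("=", 1)[0] for field in info.split(";")}


def indels_missing(lines, wanted=("DP", "MQ")):
    """Count indel records, and how many of them lack each wanted INFO key."""
    total = 0
    missing = {key: 0 for key in wanted}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        if not is_indel(cols[3], cols[4]):  # REF is column 4, ALT is column 5
            continue
        total += 1
        keys = info_keys(cols[7])  # INFO is column 8
        for key in wanted:
            if key not in keys:
                missing[key] += 1
    return total, missing
```

It can be run on the input callset with something like `with open("recal_snps.vcf") as fh: print(indels_missing(fh))`. If the counts of missing DP and MQ come back as zero, the annotations are present on every indel in the input.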
Also, it runs fine in SNP mode with the exact same annotations, so I don't know whether running VariantAnnotator would help.
Here is what my dbSNP file looks like:
Chr1 1465 10100001465 A G . PASS . GT ./. 1/1 ./. ./.
Chr1 1482 10100001482 C T . PASS . GT 1/1 0/0 ./. ./. 1/1
Chr1 1573 10100001573 T C . PASS . GT ./. ./. ./. ./. 1/1
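The three records shown above are all single-base substitutions. Since the error refers to training variants, one thing that may be worth checking is whether this resource file contains any indel records at all. This is a guess at a possible cause, not a confirmed diagnosis. The sketch below tallies record types in a VCF; the function names are hypothetical, and it assumes an uncompressed, tab-delimited file.

```python
from collections import Counter


def classify(ref, alt):
    """Roughly classify a record as 'snp', 'indel', 'no_alt' or 'other'."""
    alleles = [a for a in alt.split(",") if a not in (".", "*")]
    if not alleles:
        return "no_alt"
    if all(len(a) == len(ref) == 1 for a in alleles):
        return "snp"
    if any(len(a) != len(ref) for a in alleles):
        return "indel"
    return "other"  # e.g. multi-base substitutions


def count_classes(lines):
    """Tally variant classes over the data lines of a VCF."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        counts[classify(cols[3], cols[4])] += 1  # REF and ALT columns
    return counts
```

For example, `with open("20M_filtered.vcf") as fh: print(count_classes(fh))` prints the tally for the resource file. A zero count for `indel` would mean there is nothing in it for the indels in the callset to overlap with.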
Please let me know what I can do to fix this.
Thanks and best regards,
Neil
How did you solve this error? Did you run VariantAnnotator on this one?
Did you run VariantAnnotator on your vcf files?