Hi
First off, pardon my cross-post of the same question on the GATK forum and here. I understand that the GATK team may not be able to answer in time, given that it is a Saturday night, and I am in a serious time crunch to get some evaluation metrics by Monday, so in desperation I am seeking help here as well. My apologies off the bat!
So here I go: I am running VariantEval and encountered the error shown under ERROR below.
I should point out that I got the same error on my first pass (which ran only until chr14), and I looked for a solution on the GATK forum. I found another post there with the same error, where the person had NOT sorted their VCF file, which is exactly what I had also failed to do. So I went back, sorted my input VCF file using Picard, and fed the sorted file into the VariantEval command shown below. This time it ran further, until chrX, but then spat out the error below (ERROR).
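For anyone wanting to double-check sort order independently of GATK, here is a minimal sketch in Python. It assumes the contig order can be taken from the `##contig` header lines of the VCF itself, whereas GATK compares records against its sequence dictionary, so this is only an approximation of what VariantContextComparator does. The function names `open_vcf` and `first_unsorted_record` are hypothetical helpers made up for illustration.

```python
import gzip


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF for reading as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def first_unsorted_record(lines):
    """Return (line_number, contig, pos) of the first out-of-order record, or None.

    Contig order comes from the ##contig header lines. Contigs that are missing
    from the header are ranked by first appearance (an assumption of this sketch).
    """
    rank = {}
    prev = None
    for n, line in enumerate(lines, start=1):
        if line.startswith("##contig=<"):
            # Example header line: ##contig=<ID=chr1,length=248956422>
            body = line.strip()[len("##contig=<"):-1]
            for field in body.split(","):
                if field.startswith("ID="):
                    rank.setdefault(field[3:], len(rank))
            continue
        if line.startswith("#") or not line.strip():
            continue  # other header lines and blank lines
        contig, pos = line.split("\t", 2)[:2]
        key = (rank.setdefault(contig, len(rank)), int(pos))
        if prev is not None and key < prev:
            return n, contig, int(pos)
        prev = key
    return None


# Possible usage (file names taken from the command in this post):
# with open_vcf("dbsnp_146.hg38.vcf.gz") as fh:
#     print(first_unsorted_record(fh))
```

Running it on both the `-eval` file and the `-D` file would at least show which of the two inputs contains a record that goes backwards relative to its own header.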
One thing to point out is that my sorted VCF contains only the few chromosomes I wanted to look at: chr2, 3, 5, 7 and 14. Is that an issue? I don't see how, but it is worth asking. Also, could this be a bug, given that I am running the BETA?
Any help would be appreciated! I am pressed for time. Thank you in advance!
MY COMMAND:
gatk VariantEval -R Homo_sapiens_assembly38.fasta -eval fgeno_output_sorted.vcf -O fgeno_variant_eval.tbl -D dbsnp_146.hg38.vcf.gz -no-ev -EV CompOverlap -EV CountVariants -EV IndelSummary -EV MultiallelicSummary -EV TiTvVariantEvaluator
Why am I getting this error even though my file is sorted?
GATK/4.1.8.0
ERROR:
22:42:07.179 INFO ProgressMeter - chrX:124684150 39.6 147787000 3736018.7
22:42:17.192 INFO ProgressMeter - chrX:146492652 39.7 148520000 3738775.7
22:42:26.416 INFO VariantEval - Shutting down engine
[August 15, 2020 10:42:26 PM EDT] org.broadinstitute.hellbender.tools.walkers.varianteval.VariantEval done. Elapsed time: 39.91 minutes.
Runtime.totalMemory()=3758620672
java.lang.IllegalStateException: The elements of the input Iterators are not sorted according to the comparator htsjdk.variant.variantcontext.VariantContextComparator
at htsjdk.samtools.util.MergingIterator.next(MergingIterator.java:107)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.traverse(MultiVariantWalker.java:118)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1049)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
Update on my posted issue:
I still don't quite understand why I am getting the eval error at chrX, so I went through my input again.
I had specifically extracted only a few chromosomes, built my GenomicsDB from them, ran joint genotyping, then sorted the resulting file and pushed it through VariantEval.
However, the eval logs show it going through all chromosomes, even though my joint-genotyped VCF lists only specific chromosomes. So the chrX records where it stops appear to be coming from my dbSNP VCF!
The dbSNP VCF I used is from the GATK resource bundle, and it comes with its .tbi index file.
-D dbsnp_146.hg38.vcf.gz
So the question is: since my input VCF file has only a subset of chromosomes but the -D file has all chromosomes, could that be why eval is breaking? That still does not make sense, because it did not break on any of the other chromosomes, like chr1 or chr6, that are also not in my input VCF!
Do I need to extract only the corresponding chromosomes from my -D file and pass that as the -D option?
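In case it helps frame the question, this is a rough sketch of what I mean by extracting the corresponding chromosomes, written in Python. The function name `keep_contigs` is hypothetical, and I am assuming the output would still need to be compressed and indexed again before it could be passed as -D. Whether subsetting actually avoids the error is exactly what I am unsure about.

```python
import gzip


def keep_contigs(lines, wanted):
    """Yield all header lines plus the records whose CHROM is in `wanted`."""
    wanted = set(wanted)
    for line in lines:
        # Header lines start with '#'; for records, CHROM is the first column.
        if line.startswith("#") or line.split("\t", 1)[0] in wanted:
            yield line


# Possible usage (output file name is made up for illustration):
# chroms = ["chr2", "chr3", "chr5", "chr7", "chr14"]
# with gzip.open("dbsnp_146.hg38.vcf.gz", "rt") as src, \
#         open("dbsnp_subset.vcf", "wt") as dst:
#     dst.writelines(keep_contigs(src, chroms))
```

The sketch keeps the full header, including `##contig` lines for chromosomes that no longer have records, so the contig order declared in the file is unchanged.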
Any thoughts appreciated!