I ran HaplotypeCaller on the CRAM file of sample NA12878 from IGSR. The reference FASTA was also downloaded from IGSR, and the BED interval file is self-made (tab-delimited). My specs:
- GATK version used: 4.1.4.1
Exact GATK commands used:
gatk HaplotypeCaller -I NA12878.final.cram -O NA12878.final.vcf -R GRCh38_full_analysis_set_plus_decoy_hla.fa -L vdj_hg38.bed
The entire error log:
Runtime.totalMemory()=187695104
java.lang.NullPointerException
at java.base/java.util.ComparableTimSort.countRunAndMakeAscending(ComparableTimSort.java:325)
at java.base/java.util.ComparableTimSort.sort(ComparableTimSort.java:202)
at java.base/java.util.Arrays.sort(Arrays.java:1315)
at java.base/java.util.Arrays.sort(Arrays.java:1509)
at java.base/java.util.ArrayList.sort(ArrayList.java:1749)
at java.base/java.util.Collections.sort(Collections.java:145)
at org.broadinstitute.hellbender.utils.IntervalUtils.sortAndMergeIntervals(IntervalUtils.java:492)
at org.broadinstitute.hellbender.utils.IntervalUtils.getIntervalsWithFlanks(IntervalUtils.java:990)
at org.broadinstitute.hellbender.utils.IntervalUtils.getIntervalsWithFlanks(IntervalUtils.java:1005)
at org.broadinstitute.hellbender.engine.MultiIntervalLocalReadShard.<init>(MultiIntervalLocalReadShard.java:59)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.makeReadShards(AssemblyRegionWalker.java:104)
at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.onStartup(AssemblyRegionWalker.java:84)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:137)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:191)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:210)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
HaplotypeCaller works fine when I run it without -L, with an interval typed manually after -L, or with a single-line BED file. But my actual BED file has multiple lines, e.g.:
chr2 88,857,361 89,330,679
chr14 105,566,277 106,879,844
chr22 22,026,076 22,922,913
Could you please help me figure out what has gone wrong?
Edit: summary of what I have tried:
- Removed the commas from the example: works
- Used another BED file containing intervals from chr1 to chr9: works
- Used another BED file containing intervals from chr10: does NOT work
- Used another BED file containing intervals from chr1 to chr10: does NOT work
- Used another BED file containing intervals from chr11 to chr19: works
- Used another BED file containing intervals from chr11 to chr22: does NOT work
- Used another BED file containing intervals from chr1 to chr9 and chr11 to chr19: works
You should remove all the commas (,) from the coordinates and make sure the BED file is sorted and formatted properly (see: BED file format).
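For anyone who wants to script this cleanup, here is a minimal Python sketch rather than an official GATK utility. It assumes a plain three-column BED file and sorts contigs in natural order (chr2 before chr10), which may not match the contig order of your reference dictionary. The function names `normalize_bed_line`, `natural_contig_key` and `clean_bed` are made up for this illustration.

```python
import re


def normalize_bed_line(line):
    """Parse one BED line into (chrom, start, end), dropping comma separators."""
    fields = line.split()  # accepts tabs or spaces on input
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 columns, got: {line!r}")
    chrom = fields[0]
    start = int(fields[1].replace(",", ""))
    end = int(fields[2].replace(",", ""))
    if start >= end:
        raise ValueError(f"start must be smaller than end: {line!r}")
    return chrom, start, end


def natural_contig_key(chrom):
    """Sort chr1..chr22 numerically; other names (chrX, decoys, ...) come after."""
    match = re.fullmatch(r"chr(\d+)", chrom)
    if match:
        return (0, int(match.group(1)), "")
    return (1, 0, chrom)


def clean_bed(lines):
    """Return sorted, tab-delimited, comma-free BED lines (first 3 columns only)."""
    records = [
        normalize_bed_line(line)
        for line in lines
        if line.strip() and not line.startswith(("#", "track", "browser"))
    ]
    records.sort(key=lambda r: (natural_contig_key(r[0]), r[1], r[2]))
    return [f"{chrom}\t{start}\t{end}" for chrom, start, end in records]
```

Calling `clean_bed` on the three example lines from the question yields tab-delimited lines such as `chr2	88857361	89330679`. Note that this sketch discards any columns beyond the third, so adapt it if your BED file carries names or scores.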
I have used another BED file that follows that format, but it still gives the NullPointerException.
Have you sorted this bed file?
Yes, the file is sorted by chromosome and then by position. Is that right? I used the BED file in exactly this order.
Please see my updated post
I have done some manual testing and found that I can use many intervals as long as they are within chr1 - chr9. Every time I add a line from chr10 - chr22, this error appears. Really weird.
It's hard to tell from this excerpt what's wrong with chr10 - chr22. You can try the following troubleshooting steps.
1) Make sure the intervals stay within the chromosome sizes in your reference file (for both alignment and variant calling).
2) Make sure the BED file you are using is a valid one (see the Seqanswers thread).
3) Make sure the chromosome nomenclature matches between your reference and interval file.
4) Make sure your BED file follows the specification recommended in the GATK interval documentation.
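Checks 1 and 3 above can be sketched in Python by comparing each BED line against the reference's FASTA index. This is only an illustration under some assumptions: that a `.fai` index exists for the reference (for example one created with `samtools faidx`), that its first two tab-separated columns are contig name and length, and that the BED file is strictly tab-delimited. The function names `load_contig_lengths` and `check_bed` are hypothetical, not part of GATK.

```python
def load_contig_lengths(fai_lines):
    """Map contig name -> length from the lines of a FASTA .fai index."""
    lengths = {}
    for line in fai_lines:
        if not line.strip():
            continue
        name, length = line.rstrip("\n").split("\t")[:2]
        lengths[name] = int(length)
    return lengths


def check_bed(bed_lines, contig_lengths):
    """Return a list of human-readable problems found in the BED lines."""
    problems = []
    for number, line in enumerate(bed_lines, start=1):
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            problems.append(f"line {number}: fewer than 3 tab-separated columns")
            continue
        chrom, start_text, end_text = fields[:3]
        # Plain ASCII digits only: this rejects commas, spaces and signs.
        if not all(t.isascii() and t.isdigit() for t in (start_text, end_text)):
            problems.append(f"line {number}: coordinates are not plain integers")
            continue
        start, end = int(start_text), int(end_text)
        if chrom not in contig_lengths:
            problems.append(f"line {number}: contig {chrom!r} not in reference")
            continue
        if start >= end:
            problems.append(f"line {number}: start {start} is not below end {end}")
        if end > contig_lengths[chrom]:
            problems.append(
                f"line {number}: end {end} exceeds contig length "
                f"{contig_lengths[chrom]}"
            )
    return problems
```

To use it, read the lines of the `.fai` file that sits next to your reference FASTA and the lines of your BED file, then print whatever `check_bed` returns. An empty list means none of these particular problems were found; it does not prove the file is acceptable to GATK in every other respect.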