Mutect2 error: Badly formed genome unclippedLoc
3.4 years ago

Hi everyone,

I'm trying to run Mutect2 for WES cancer data.

However, their Resource Bundle seems to support only hg19, so I cannot proceed (I want to compare the output with Strelka2 results).

I've been looking for an hg38 interval_list file and I found: hg38_v0_HybSelOligos_whole_exome_illumina_coding_v1_whole_exome_illumina_coding_v1.Homo_sapiens_assembly38.targets.interval_list

However, when I run GenomicsDBImport I get the following error (regardless of whether I use my own hg38 reference and .dict or the ones from the GATK Resource Bundle):

A USER ERROR has occurred: Badly formed genome unclippedLoc: Contig chr1 given as location, but this contig isn't present in the Fasta sequence dictionary

So, my questions are:

  1. Does anyone know if there is a release date for an hg38-based exome interval file?
  2. Or is the file I found OK, and the error is coming from somewhere else?
Mutect2 GenomicsDB WES GATK • 4.8k views

Hi

I have the same problem with Mutect2. I changed my reference file several times, using the NCBI, Ensembl, and Broad Institute Google bundle versions, but I still get the same error. I made the dictionary with both Picard and GATK CreateSequenceDictionary, and the error persists. Do you know what the problem might be? I would appreciate any input on this; I'm feeling frustrated.

Thank you so much


I strongly recommend downloading the FAI and DICT files from the GATK Resource Bundle as well. I had the same errors with a dictionary that I made myself.


Maybe "chr1" is called "1" in your reference.
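A quick way to check this is to compare the contig names in the interval list against the ones in the sequence dictionary. Below is a minimal sketch, assuming the usual Picard-style layout: `@SQ` header lines with an `SN:` field in the .dict, and tab-separated records with the contig in the first column of the interval_list. The function names and the toy data are made up for illustration; in practice you would pass the lines of your own files.

```python
def dict_contigs(dict_lines):
    """Collect contig names from the SN: field of @SQ lines in a sequence dictionary."""
    names = set()
    for line in dict_lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def interval_contigs(interval_lines):
    """Collect contig names from the records (non-header lines) of an interval list."""
    names = set()
    for line in interval_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        names.add(line.split("\t")[0])
    return names


def missing_contigs(interval_lines, dict_lines):
    """Contigs used by the intervals but absent from the dictionary."""
    return sorted(interval_contigs(interval_lines) - dict_contigs(dict_lines))


# Toy data (illustrative only): a dictionary that names chromosomes "1", "2"
# and an interval list that names them "chr1".
example_dict = [
    "@HD\tVN:1.6\n",
    "@SQ\tSN:1\tLN:248956422\n",
    "@SQ\tSN:2\tLN:242193529\n",
]
example_intervals = [
    "@HD\tVN:1.6\n",
    "@SQ\tSN:chr1\tLN:248956422\n",
    "chr1\t69091\t70008\t+\ttarget_1\n",
]

print(missing_contigs(example_intervals, example_dict))
```

With real files you could call `missing_contigs(open(interval_path), open(dict_path))`. Any name it reports is a contig that GATK would not find in the dictionary.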


OP has already mentioned what their error was. I'm moving your answer to a comment.

3.4 years ago

For those interested:

This error was a product of incorrect indexing of the FASTA reference.

I used the hg38 reference available on the NCBI website.

However, all the chromosomes in it are listed under RefSeq accession codes such as NC_000001.11 instead of 1, 2, 3 or chr1, chr2, chr3.

I had to switch to another reference (I downloaded both the .FA and .FAI from the GATK Resource Bundle, but it seems to work fine with hg38 from Ensembl too).
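To spot this kind of mismatch before running GATK, it can help to look at the FASTA header names directly. The sketch below is only an illustration, not what I actually ran: it pulls the contig names out of the `>` header lines and translates RefSeq-style chromosome accessions to chr-style names. The helper names are hypothetical, and the mapping assumes the human accessions NC_000001 to NC_000022 for the autosomes, NC_000023 for X, NC_000024 for Y, and NC_012920 for the mitochondrion.

```python
import re

# Matches human chromosome accessions such as NC_000001.11 (any version suffix).
_REFSEQ_CHROM = re.compile(r"^NC_0000(\d{2})\.\d+$")


def refseq_to_chr(accession):
    """Translate a RefSeq chromosome accession to a chr-style name, or None if unrecognised."""
    if accession.startswith("NC_012920."):
        return "chrM"
    match = _REFSEQ_CHROM.match(accession)
    if not match:
        return None
    number = int(match.group(1))
    if 1 <= number <= 22:
        return f"chr{number}"
    if number == 23:
        return "chrX"
    if number == 24:
        return "chrY"
    return None


def fasta_contig_names(fasta_lines):
    """Return the first whitespace-delimited token of every FASTA header line."""
    return [
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and line[1:].strip()
    ]


# Toy FASTA lines (illustrative only).
example_fasta = [
    ">NC_000001.11 Homo sapiens chromosome 1\n",
    "NNNNNNNNNN\n",
    ">NC_000023.11 Homo sapiens chromosome X\n",
    "NNNNNNNNNN\n",
]

for name in fasta_contig_names(example_fasta):
    print(name, "->", refseq_to_chr(name))
```

If the names printed on the left do not look like the contigs in your interval list, you will hit the same error. Switching to a reference whose names already match, as I did, is probably safer than renaming headers by hand.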
