GATK Indel Realignment Error - Mismatch In Index Files And Dict File
12.7 years ago

Greetings,

I am aligning pooled sequencing data to a new reference genome. GATK won't generate intervals; is that because not every scaffold in the reference is found in my BAM index?

What am I missing? It seems like what I am trying to do isn't unreasonable.

    INFO  15:52:42,596 GATKRunReport - Uploaded run statistics report to AWS S3
    ##### ERROR ------------------------------------------------------------------------------------------
    ##### ERROR A USER ERROR has occurred (version 1.5-3-gbb2c10b): 
    ##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
    ##### ERROR Please do not post this error to the GATK forum
    ##### ERROR
    ##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
    ##### ERROR Visit our wiki for extensive documentation <http://www.broadinstitute.org/gsa/wiki>
    ##### ERROR Visit our forum to view answers to commonly asked questions <http://getsatisfaction.com/gsa>
    ##### ERROR
    ##### ERROR MESSAGE: Couldn't read file /home/zkronenb/Projects/xxx/reference_assembly/withoutsanger.fa because Sequence dictionary and index contain different numbers of contigs
12.7 years ago
Wen.Huang ★ 1.2k

The latest GATK seems to be able to generate the .dict file on the fly. Try removing the existing .dict and letting GATK regenerate it.
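
Before deleting anything, it can help to see where the two files disagree. The following is only a minimal sketch, assuming the usual layouts: a samtools-style .fai index with the contig name in the first tab-separated column, and a .dict file made of SAM-style header lines where each contig is an `@SQ` line with an `SN:` field. The function names and the command-line paths are hypothetical, not part of GATK.

```python
import sys
from pathlib import Path


def fai_contigs(fai_text):
    """Contig names from a .fai index: first tab-separated column of each line."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def dict_contigs(dict_text):
    """Contig names from a .dict file: the SN: field of each @SQ line."""
    names = []
    for line in dict_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def compare(fai_text, dict_text):
    """Summarise how the index and the dictionary differ."""
    fai = fai_contigs(fai_text)
    dic = dict_contigs(dict_text)
    return {
        "fai_count": len(fai),
        "dict_count": len(dic),
        "only_in_fai": sorted(set(fai) - set(dic)),
        "only_in_dict": sorted(set(dic) - set(fai)),
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python check_dict.py reference.fa.fai reference.dict
    report = compare(Path(sys.argv[1]).read_text(), Path(sys.argv[2]).read_text())
    for key, value in report.items():
        print(key, value)
```

If the two counts differ, the names listed under `only_in_fai` or `only_in_dict` show which contigs are missing from the other file.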

Thanks! I was stuck in old habits.

12.7 years ago

https://getsatisfaction.com/gsa/topics/a_simple_problem_with_creating_dict_files

I was running indel realignment on a non-model organism and got an error stating: loc:malformed unknown contig. The source of this error was my reference FASTA: the sequence lines weren't wrapped, so the last contig wasn't being added to my .dict file. I wrapped the lines and everything works fine.
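
For anyone wanting to do the same, here is a rough sketch of the wrapping step in Python. It is an illustration only, not the tool I used: the function name `wrap_fasta` and the default width of 60 are my own choices, and it reads the whole file into memory, so it assumes the reference is small enough for that. It also makes sure the output ends with a newline.

```python
import sys


def wrap_fasta(text, width=60):
    """Re-wrap every sequence in FASTA text to a fixed line width."""
    out = []
    seq = []  # pieces of the sequence currently being collected

    def flush():
        # Emit the collected sequence in chunks of `width` characters.
        joined = "".join(seq)
        for i in range(0, len(joined), width):
            out.append(joined[i:i + width])
        seq.clear()

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            flush()           # finish the previous record
            out.append(line)  # header lines are kept unchanged
        else:
            seq.append(line)
    flush()
    return "\n".join(out) + "\n" if out else ""


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python wrap_fasta.py unwrapped.fa wrapped.fa
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        dst.write(wrap_fasta(src.read()))
```

After rewriting the FASTA, the old index and .dict no longer match it, so both need to be regenerated.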

12.7 years ago
Johan ▴ 890

In addition to Wen.Huang's answer: make sure that the reference genome you are using contains the same contig names as the genome you aligned against. You can check that they match by looking at the BAM file header using "samtools view -h" and comparing the names there to those found in the reference.
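
With many scaffolds, comparing by eye gets tedious, so here is a small sketch that automates the check. It assumes samtools is on the PATH and uses "samtools view -H", which prints only the header; the function names and file paths are hypothetical. Contig names are taken from the `SN:` field of `@SQ` header lines and from the first whitespace-delimited word after ">" in the FASTA.

```python
import subprocess
import sys


def sq_names(header_text):
    """Contig names from SAM header text: the SN: field of each @SQ line."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def fasta_names(fasta_text):
    """Contig names from FASTA text: first word after '>' on each header line."""
    names = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.append(parts[0])
    return names


def diff_names(bam_names, ref_names):
    """Return (names only in the BAM header, names only in the reference)."""
    bam, ref = set(bam_names), set(ref_names)
    return sorted(bam - ref), sorted(ref - bam)


def bam_header(bam_path):
    """Run 'samtools view -H' and return the header text."""
    result = subprocess.run(
        ["samtools", "view", "-H", bam_path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python check_names.py aligned.bam reference.fa
    with open(sys.argv[2]) as fh:
        only_bam, only_ref = diff_names(sq_names(bam_header(sys.argv[1])),
                                        fasta_names(fh.read()))
    print("only in BAM header:", only_bam)
    print("only in reference:", only_ref)
```

Two empty lists mean the names agree. This only compares names, not sequence lengths or contig order.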
