I am trying to use GATK BaseRecalibrator but I get this error: "Input files knownSites and reference have incompatible contigs"
##### ERROR MESSAGE: Input files knownSites and reference have incompatible contigs. Please see https://www.broadinstitute.org/gatk/guide/article?id=63for more information. Error details: The contig order in knownSites and reference is not the same; to fix this please see: (https://www.broadinstitute.org/gatk/guide/article?id=1328), which describes reordering contigs in BAM and VCF files..
##### ERROR knownSites contigs = [chr1, chr10, chr10_GL383545v1_alt, chr10_GL383546v1_alt, chr10_KI270824v1_alt, chr10_KI270825v1_alt, chr11, chr11_GL383547v1_alt, chr11_JH159136v1_alt, chr11_JH159137v1_alt, ......
##### ERROR reference contigs = [chr1, chr10, chr11, chr11_KI270721v1_random, chr12, chr13, chr14, chr14_GL000009v2_random, chr14_GL000225v1_random, chr14_KI270722v1_random, chr14_GL000194v1_random, .....
I searched online and learned that the VCF file and the reference must be compatible. I downloaded the VCF file from the hg38 bundle (ftp://ftp.broadinstitute.org/bundle/hg38/hg38bundle/), so I assumed it would be compatible. My reference uses 'chr1', etc., while the VCF file uses '1', '2', etc., so I added the 'chr' prefix to the VCF file. I also ran Picard ReorderSam on my BAM file and SortVcf on the VCF file (as described at http://gatkforums.broadinstitute.org/gatk/discussion/1328/errors-about-contigs-in-bam-or-vcf-files-not-being-properly-ordered-or-sorted). Now it looks like my reference is sorted differently (lexicographically). I don't know how to fix this problem. Should I sort my reference or download a different reference file?
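For readers following along, the renaming step described above (adding 'chr' to the contig names) could look roughly like the sketch below. It is only an illustration, not the poster's actual command: the function name `add_chr_prefix` is hypothetical, and it assumes that every unprefixed contig simply needs 'chr' in front. That assumption may not hold for every contig (mitochondrial and unplaced contigs are often named differently between builds), so the result should still be checked against the reference dictionary.

```python
def add_chr_prefix(line: str, prefix: str = "chr") -> str:
    """Return one VCF line with `prefix` added to its contig name.

    Hypothetical helper: assumes a plain prefix is the only renaming needed.
    """
    contig_header = "##contig=<ID="
    if line.startswith(contig_header):
        rest = line[len(contig_header):]
        # Leave contigs that already carry the prefix untouched.
        if rest.startswith(prefix):
            return line
        return contig_header + prefix + rest
    # Other header lines (## metadata, #CHROM) and blank lines stay as-is.
    if line.startswith("#") or not line.strip():
        return line
    # Data records start with the CHROM column, so prefix the line itself.
    if line.startswith(prefix):
        return line
    return prefix + line
```

Both the `##contig` header lines and the records have to be renamed consistently; renaming only one of them leaves the header and the data disagreeing with each other.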
Thank you, but I already used this option when running SortVcf:
$ java -Xmx3g -jar ../picard.jar SortVcf I=dbsnp_144.hg38_with_chr.vcf O=dbsnp_144.hg38_with_chr_sorted.vcf SEQUENCE_DICTIONARY=../ref/hg38.dict
The problem still exists.
Can you please show me the output of the following commands:
$ grep "##contig" -m 10 dbsnp_144.hg38_with_chr.vcf
$ grep "##contig" -m 10 ../ref/hg38.dict
$ grep "##contig" -m 10 dbsnp_144.hg38_with_chr_sorted.vcf
##contig=<ID=chr1,length=248956422>
##contig=<ID=chr10,length=133797422>
##contig=<ID=chr11,length=135086622>
##contig=<ID=chr11_KI270721v1_random,length=100316>
##contig=<ID=chr12,length=133275309>
##contig=<ID=chr13,length=114364328>
##contig=<ID=chr14,length=107043718>
##contig=<ID=chr14_GL000009v2_random,length=201709>
##contig=<ID=chr14_GL000225v1_random,length=211173>
##contig=<ID=chr14_KI270722v1_random,length=194050>
And what is the output of:
$ head ../ref/hg38.dict
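The manual header comparison requested above can also be scripted. The sketch below is a rough diagnostic, not GATK's own compatibility check: it reads the `##contig` lines of a VCF header and the `@SQ` lines of a sequence dictionary, then reports which VCF contigs are absent from the reference and whether the shared contigs appear in the same relative order. The function names are hypothetical, and the sample lists in the usage note are only the first few contigs from the truncated error message, so they say nothing about the full files.

```python
import re

# Matches the contig name in a VCF header line such as
# ##contig=<ID=chr1,length=248956422>
CONTIG_RE = re.compile(r"^##contig=<ID=([^,>]+)")


def vcf_header_contigs(lines):
    """Return contig names from VCF header lines, in file order."""
    names = []
    for line in lines:
        if not line.startswith("#"):
            break  # first data record: the header is over
        match = CONTIG_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


def dict_contigs(lines):
    """Return contig names from the @SQ lines of a .dict file, in file order."""
    names = []
    for line in lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def compare_contigs(vcf_names, ref_names):
    """Return (VCF contigs missing from the reference, shared order matches)."""
    ref_set = set(ref_names)
    vcf_set = set(vcf_names)
    missing = [name for name in vcf_names if name not in ref_set]
    shared_in_vcf = [name for name in vcf_names if name in ref_set]
    shared_in_ref = [name for name in ref_names if name in vcf_set]
    return missing, shared_in_vcf == shared_in_ref
```

For example, `compare_contigs(["chr1", "chr10", "chr10_GL383545v1_alt", "chr11"], ["chr1", "chr10", "chr11", "chr11_KI270721v1_random", "chr12"])` flags `chr10_GL383545v1_alt` as missing from the reference. To run it on real files, pass open file handles, e.g. `vcf_header_contigs(open("dbsnp_144.hg38_with_chr_sorted.vcf"))` and `dict_contigs(open("../ref/hg38.dict"))`.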