BaseRecalibrator User error.
19 days ago

Hi everyone,

I am new to bioinformatics and I am struggling with GATK's somatic mutation variant calling pipeline.

I have completed most of the preprocessing steps: CreateSequenceDictionary, bwa index, bwa mem, and MarkDuplicatesSpark.

Yet, I've been struggling with a UserError on the BaseRecalibrator step.

For my known sites file, I have been using a C57BL/6 known-sites VCF file I found on the Mouse Genome project website.

For the reference genome, I used the GRCm39 latest release.

My initial error with BaseRecalibrator was that my contigs were incompatible between reference and vcf file. I tried to solve this by using bcftools annotate --rename-chrs to alter the vcf files.

Yet, now I am getting a new error:

A USER ERROR has occurred: Input files reference and features have incompatible contigs: Found contigs with the same name but different lengths: contig reference = NC_000067.7 / 195154279 contig features = NC_000067.7 / 195471971.

At this point, I am not sure if I should just redo the analysis with an older version of the mouse reference genome, or if this error can be fixed. Any pointers?

GATK Mutect2 • 320 views
19 days ago
GenoMax 146k

For my known sites file, I have been using a C57BL/6 known-sites VCF file I found on the Mouse Genome project website.

Do you know what genome build this was based on?
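
If you are not sure, the VCF header often says. Here is a minimal Python sketch (not part of GATK or bcftools; the function name `build_hints` and the file name in the usage comment are made up for illustration) that prints the header lines most likely to mention the build, namely `##reference=`, `##assembly=` and `##contig=`. Not every VCF carries all of these, so an empty result only means the header does not record it.

```python
import gzip


def build_hints(vcf_path):
    """Collect VCF header lines that usually reveal the genome build."""
    # Plain gzip can read bgzip-compressed VCFs as well as ordinary .gz files.
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    hints = []
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if not line.startswith("##"):
                break  # meta lines are over once the #CHROM line is reached
            if line.startswith(("##reference=", "##assembly=", "##contig=")):
                hints.append(line.rstrip("\n"))
    return hints


# Usage (hypothetical file name):
# for hint in build_hints("known_sites.vcf.gz"):
#     print(hint)
```

Only the header is read, so this stays fast even on a large known-sites file.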

For the reference genome, I used the GRCm39 latest release.

My initial error with BaseRecalibrator was that my contigs were incompatible between reference and vcf file. I tried to solve this by using bcftools annotate --rename-chrs to alter the vcf files.

Are you certain you are not mixing and matching genome builds? You can't do this. Coordinates differ between builds, so the results will be nonsense if you do.

Found contigs with the same name but different lengths:

This is more or less an indication that the reference and the VCF come from different genome builds. Renaming the contigs made the names match, but the lengths still differ.
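
You can run the same check yourself before starting BaseRecalibrator. The sketch below is only an illustration, assuming the VCF header has `##contig=<ID=...,length=...>` lines and that the reference has a `.fai` index from `samtools faidx` (tab-separated, name in column 1, length in column 2). The function names and the file names in the usage comment are hypothetical.

```python
import gzip
import re


def vcf_contig_lengths(vcf_path):
    """Read contig name -> length from the ##contig lines of a VCF header."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    lengths = {}
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if not line.startswith("##"):
                break  # end of the meta lines
            if line.startswith("##contig="):
                name = re.search(r"[<,]ID=([^,>]+)", line)
                length = re.search(r"[<,]length=(\d+)", line)
                if name and length:  # skip contig lines without a length
                    lengths[name.group(1)] = int(length.group(1))
    return lengths


def fai_contig_lengths(fai_path):
    """Read contig name -> length from a FASTA index (.fai) file."""
    lengths = {}
    with open(fai_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2:
                lengths[fields[0]] = int(fields[1])
    return lengths


def length_mismatches(ref_lengths, vcf_lengths):
    """Return shared contigs whose lengths differ as (reference, vcf) pairs."""
    shared = ref_lengths.keys() & vcf_lengths.keys()
    return {
        name: (ref_lengths[name], vcf_lengths[name])
        for name in shared
        if ref_lengths[name] != vcf_lengths[name]
    }


# Usage (hypothetical file names):
# bad = length_mismatches(fai_contig_lengths("reference.fa.fai"),
#                         vcf_contig_lengths("known_sites.vcf.gz"))
# for name, (ref_len, vcf_len) in sorted(bad.items()):
#     print(name, ref_len, vcf_len)
```

Any contig reported here has the same name but a different length in the two files, which is exactly the condition in the error message. An empty result does not prove the builds match if the VCF header lacks contig lengths.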


I'm an idiot... I just checked, and yes, the VCF file was for GRCm38_68 from Sanger, not GRCm39. That makes total sense. I think this was the issue. Thanks a lot!
