Why do my 'high-quality' variants look like artifacts?
2.1 years ago
Timotheus ▴ 40

Hello,

I have short reads from one non-model genome mapped against a very closely related genome assembly and want to examine variation. I did standard variant calling and filtering (reasonable depth, QUAL>50, MQ>55). How is it possible that variants that passed all those filters look this bad (see the screenshot)? Those are clearly mismapped reads (and artifactual variants). Why does the MQ filter fail? This is driving me crazy, I'd appreciate any suggestions.

[screenshot]

SNPs vcf alignment variants • 785 views
2.1 years ago
LChart 4.6k

The mapping quality of the read you highlighted is 60 (bottom of the screenshot), and I don't see visual differences between the reads. So the MQ filter is "failing" because, despite the many mismatches, those are high-quality alignments. My best guess is that, since you've aligned to a closely related but not identical reference, the reads with multiple mismatches come from sequence that is not present in the reference you aligned to. Also, to my eye, the non-reference allele appears both on reads with no other mismatches and on reads with lots of mismatches, so that specific variant is "good".

You could perform a pre-pass eliminating reads with edit distance (NM) > 10 (or 8 or whatever). Or, with 118x coverage, you could pass your reads through an assembler and align whole contigs, identify those that seem not to be present in the reference, and include them as "decoys" to mitigate these mappings.
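The NM pre-pass could look roughly like the sketch below. It works on plain SAM text lines and assumes the aligner wrote the optional `NM:i:` tag; the function names and the choice to keep records that lack an NM tag are my own assumptions, so adjust to taste.

```python
def edit_distance(sam_line):
    """Return the NM (edit distance) tag of a SAM record, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start at the 12th column of a SAM record.
    for tag in fields[11:]:
        if tag.startswith("NM:i:"):
            return int(tag[5:])
    return None


def filter_by_nm(lines, max_nm=10):
    """Yield header lines and records whose NM is at most max_nm.

    Records without an NM tag are kept (an assumption; drop them if you prefer).
    """
    for line in lines:
        if line.startswith("@"):
            yield line  # pass header lines through untouched
            continue
        nm = edit_distance(line)
        if nm is None or nm <= max_nm:
            yield line
```

One way to use it would be to feed it the output of `samtools view -h`, write the surviving lines back out, and convert to BAM again before re-calling variants.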


Interesting suggestions, thank you! I think this variant is only supported by mismapped reads. Might finally try to understand the MQ formula (https://genome.sph.umich.edu/wiki/Mapping_Quality_Scores), but I thought high MQ is impossible with so many differences - unless the aligner force-maps reads to the reference and scales MQ relative to other possible alignments??
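For what it's worth, the "relative to other possible alignments" idea can be made concrete with the textbook definition: MQ is the Phred-scaled probability that the chosen placement is wrong. The sketch below is only that textbook form, not what any particular aligner computes (real aligners use their own heuristics); the function name, the log10-likelihood inputs, and the cap of 60 are assumptions for illustration.

```python
import math


def mapq_from_scores(log10_likelihoods, cap=60):
    """Phred-scaled probability that the best candidate placement is wrong.

    log10_likelihoods: one log10 likelihood per candidate placement of a read.
    cap: upper bound on the reported value (assumed here to be 60).
    """
    best = max(log10_likelihoods)
    # Normalise relative to the best candidate to avoid numeric underflow.
    weights = [10 ** (ll - best) for ll in log10_likelihoods]
    # Posterior of the best placement is 1 / sum(weights).
    p_wrong = 1.0 - 1.0 / sum(weights)
    if p_wrong <= 0.0:
        return cap  # no competing placement at all
    return min(cap, round(-10 * math.log10(p_wrong)))


# A poorly matching read with no alternative placement still hits the cap.
print(mapq_from_scores([-30.0]))
# Two equally good placements give a very low value.
print(mapq_from_scores([-2.0, -2.0]))
```

Under this definition the absolute number of mismatches cancels out: only the gap between the best placement and its competitors matters, so a read with many differences can still get a high MQ when the reference offers no second candidate.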
