How does base quality / mapping quality affect germline calling
5.2 years ago
CY ▴ 750

This question is specific to HaplotypeCaller. I know it would be better asked on the GATK forum, but I didn't get a good answer there, so I am hoping to get a second opinion here.

I understand that, during germline calling, pairHMM calculates the likelihood of each read given each candidate haplotype, taking base quality into consideration. However, I couldn't find any minimum likelihood threshold being applied.

I suppose a region with multiple mismatches gets a low likelihood for every possible genotype (say AA, AC and CC at a given site). If no minimum likelihood threshold is set, does the genotype with the highest likelihood still get emitted? Or have I missed something that prevents this from happening?
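To make the concern concrete, here is a toy per-site sketch in Python. It is not HaplotypeCaller's code: the real caller scores reads against haplotypes with pairHMM, whereas this assumes a simple pileup model in which a base with Phred quality Q is wrong with probability 10^(-Q/10) and errors are spread evenly over the three other bases. The function names are made up for illustration.

```python
import math

def phred_to_error(q):
    """Convert a Phred base quality to an error probability."""
    return 10 ** (-q / 10)

def p_base_given_allele(base, allele, err):
    """Toy error model (an assumption): errors are split evenly over 3 other bases."""
    return 1 - err if base == allele else err / 3

def genotype_log10_likelihoods(bases, quals, ref="A", alt="C"):
    """log10 P(pileup | genotype) for the three diploid genotypes at one site."""
    result = {}
    for a1, a2 in [(ref, ref), (ref, alt), (alt, alt)]:
        total = 0.0
        for base, q in zip(bases, quals):
            err = phred_to_error(q)
            # Each read is assumed to come from either chromosome with probability 0.5.
            p = 0.5 * p_base_given_allele(base, a1, err) \
              + 0.5 * p_base_given_allele(base, a2, err)
            total += math.log10(p)
        result[a1 + a2] = total
    return result

def phred_scaled(log10_liks):
    """Scale likelihoods relative to the best genotype, so the best one gets 0."""
    best = max(log10_liks.values())
    return {g: round(-10 * (l - best)) for g, l in log10_liks.items()}

# A clean heterozygous-looking pileup versus one where no genotype fits.
for pileup in ("AAAC", "GGGG"):
    liks = genotype_log10_likelihoods(pileup, [30] * len(pileup))
    print(pileup, liks, phred_scaled(liks))
```

In the second pileup every base disagrees with both alleles, so all three absolute likelihoods are tiny, yet after scaling relative to the best genotype they are indistinguishable from a site where all genotypes fit equally well. The absolute "badness" is lost in the scaled values, which is exactly why I am asking whether anything else filters such sites.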

haplotypecaller germline calling MAPQ • 1.0k views

Just FYI, mappability is something different from what you seem to think. MAPQ is of course influenced by the number of mismatches, but mappability is the intrinsic chance that a read of a given length, originating from a certain region of the genome and carrying no mismatches, can be mapped back to that particular genomic locus. Repetitive regions (especially in combination with short reads) have low mappability because reads fit multiple loci equally well, while more complex regions, or the same region with longer reads, have higher mappability.
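A rough way to picture this, as a toy sketch only: score each position of a reference by one over the number of times the k-mer starting there occurs in the whole sequence. The function name and the tiny reference string are hypothetical, and the sketch assumes exact matches on the forward strand only, which real mappability tracks do not.

```python
from collections import Counter

def kmer_uniqueness(reference, k):
    """Per-position score: 1 / (occurrences of the k-mer starting at that position).

    A score of 1.0 means a read of length k from that position maps uniquely;
    lower scores mean the same sequence occurs at several loci.
    """
    kmers = [reference[i:i + k] for i in range(len(reference) - k + 1)]
    counts = Counter(kmers)
    return [1.0 / counts[kmer] for kmer in kmers]

reference = "ACGTACGTTTGCA"  # made-up sequence with a short repeat (ACGT)
print(kmer_uniqueness(reference, 4))  # short "reads"
print(kmer_uniqueness(reference, 6))  # longer "reads"
```

With k = 4 the two positions starting the repeated ACGT share their score, while with k = 6 every position becomes unique, which is the read-length effect described above.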


Thanks for the clarification. What I see is multiple mismatches with VAF = 1 within a short window. I guess most of them are in homologous regions?


Please add a link to the question on the GATK forum
