Question

How To Calculate Genotype Likelihood Using Quality Scores For Reads

11.5 years ago

Mcmahanl ▴ 300

After searching Google and reading several articles, I still do not understand how to calculate the genotype likelihood at a locus from the quality scores of the reads, using basic probability. I find the probability notation in those articles hard to translate into an actual calculation.

For example, suppose the reference base at a locus is T and 6 reads cover it: 4 reads show T and 2 reads show G (with quality score = 10).

If the true genotype at this locus is [T, T], how does one calculate the probability of the observed data given this genotype, i.e. P(D | [T, T]), where D is the read data?

ngs • 7.3k views

updated 11.5 years ago by Pierre Lindenbaum 166k • written 11.5 years ago by Mcmahanl ▴ 300

score 3 · Answer 1 · 2013-11-04

11.5 years ago

Pierre Lindenbaum 166k

See Heng Li's "Mathematical Notes on SAMtools Algorithms", section 4.4, "Likelihood of data given genotype": http://www.broadinstitute.org/gatk/media/docs/Samtools.pdf
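To make that section concrete, here is a minimal Python sketch applied to the example in the question. It is only an illustration under stated assumptions: a diploid sample, independent errors, and a biallelic model in which a sequencing error turns one allele into the other. As I read the notes, for a genotype carrying g reference alleles, each read matching the reference contributes ((2 - g)ε + g(1 - ε)) / 2 and each read not matching contributes ((2 - g)(1 - ε) + gε) / 2, where ε = 10^(-Q/10). The function names are made up for this sketch and are not part of SAMtools. The question does not give the quality of the four T reads, so the sketch assumes Q10 for those too.

```python
import math


def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def genotype_likelihood(g, ref_quals, alt_quals, ploidy=2):
    """P(D | genotype) where g is the number of reference alleles.

    ref_quals: Phred qualities of reads matching the reference base.
    alt_quals: Phred qualities of reads showing the other base.
    Assumes independent errors and a two-allele model (an assumption
    of this sketch).
    """
    lik = 1.0
    for q in ref_quals:
        e = phred_to_error(q)
        lik *= ((ploidy - g) * e + g * (1 - e)) / ploidy
    for q in alt_quals:
        e = phred_to_error(q)
        lik *= ((ploidy - g) * (1 - e) + g * e) / ploidy
    return lik


# Example from the question: 4 reads show T (reference), 2 reads show G.
# Q10 for the T reads is an assumption; only the G reads' quality is given.
t_quals = [10, 10, 10, 10]
g_quals = [10, 10]

lik_tt = genotype_likelihood(2, t_quals, g_quals)  # genotype [T, T]
lik_tg = genotype_likelihood(1, t_quals, g_quals)  # genotype [T, G]
lik_gg = genotype_likelihood(0, t_quals, g_quals)  # genotype [G, G]

print(lik_tt, lik_tg, lik_gg)
```

Under these assumptions, P(D | [T, T]) = 0.9^4 × 0.1^2 ≈ 0.0066, P(D | [T, G]) = 0.5^6 ≈ 0.0156, and P(D | [G, G]) = 0.1^4 × 0.9^2 ≈ 0.000081, so the heterozygous genotype explains the data best. With higher qualities on the T reads the numbers change, but the calculation is the same.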


Thank you so much, Pierre Lindenbaum, it is really helpful.

10.9 years ago by stat.1405 ▴ 30