How many reads are needed to be sure a variant does not exist?
6.8 years ago
steve ★ 3.5k

We are doing targeted exome sequencing with variant calling, and need to determine the minimum number of reads required to be 95% certain that a variant is not present in a given target region. How do you typically do that?

I was thinking some type of power analysis, but wasn't sure what values to use for which parameters.

Are there other ways to do this?

variant calling • 1.4k views

Thanks for the link. Just curious, did you have this on hand, or did you find it on Google? I spent some time trying to Google this topic but did not find this paper, must have been using the wrong keywords :)


I knew it existed, from a couple of years ago. I faintly remembered the first author's name. No idea how I found it originally.

6.8 years ago
Gabriel R. ★ 2.9k

I assume you are talking about heterozygous sites. In a perfect world where each read samples either chromosome with p = 1/2, at a coverage of 5X the probability that every read comes from one particular allele is (1/2)^5 = 0.03125. That is the chance of missing the variant allele entirely at a true heterozygous site. The probability of observing only one allele, whichever one it is, is twice that: 0.0625.
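The arithmetic above can be turned around to ask for the smallest depth that reaches a target confidence. This is only a sketch of the idealized model (independent reads, no errors, each read carrying the variant allele with a fixed probability); the function names and the `allele_fraction` parameter are my own, not from any tool.

```python
import math

def prob_missing_allele(depth, allele_fraction=0.5):
    """P(no read carries the allele) under ideal, independent sampling."""
    return (1.0 - allele_fraction) ** depth

def min_reads_to_see_allele(confidence=0.95, allele_fraction=0.5):
    """Smallest depth n such that P(allele never sampled) <= 1 - confidence.

    Assumes 0 < allele_fraction < 1 and 0 < confidence < 1.
    """
    miss = 1.0 - confidence
    # Solve (1 - allele_fraction) ** n <= miss for the smallest integer n.
    return math.ceil(math.log(miss) / math.log(1.0 - allele_fraction))

if __name__ == "__main__":
    for conf in (0.95, 0.99):
        n = min_reads_to_see_allele(conf)
        print(conf, n, prob_missing_allele(n))
```

For a heterozygous site this gives 5 reads at 95%, matching the 0.03125 figure above. A lower `allele_fraction` (for example a mosaic or subclonal variant) pushes the required depth up quickly.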

However, observing a divergent base is not by itself proof of a variant. It could be due to mismapping, sequencing error, residual adapter sequence, contamination, etc.
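Because a single divergent read can be an artifact, one crude way to account for this is to require at least k reads supporting the variant before believing it, and then ask what depth makes that likely at a true heterozygous site. The sketch below uses a plain binomial model; the threshold `k` and the function names are hypothetical and not taken from any particular variant caller.

```python
from math import comb

def prob_at_least_k_alt(depth, k, allele_fraction=0.5):
    """P(X >= k) where X ~ Binomial(depth, allele_fraction)."""
    p_fewer = sum(
        comb(depth, i) * allele_fraction**i * (1.0 - allele_fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer

def min_depth_for_k_alt(k, confidence=0.95, allele_fraction=0.5, max_depth=10000):
    """Smallest depth at which >= k variant reads are seen with the given confidence.

    Returns None if no depth up to max_depth is sufficient.
    """
    for depth in range(k, max_depth + 1):
        if prob_at_least_k_alt(depth, k, allele_fraction) >= confidence:
            return depth
    return None

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, min_depth_for_k_alt(k))
```

With k = 1 this reduces to the simple (1/2)^n calculation; asking for three supporting reads roughly doubles the required depth under these assumptions.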

Genotypers are usually equipped to quantify your belief in one genotype versus another. I suggest that you look at genotyping output at various coverage levels and decide what confidence you want for a particular genotype.
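To give a feel for what such a comparison looks like, here is a deliberately simplified binomial genotype-likelihood sketch. It is not how any specific genotyper works: real tools also use base and mapping qualities and priors. The flat `error_rate` and all names here are assumptions for illustration.

```python
import math

def genotype_log10_likelihoods(depth, alt_reads, error_rate=0.01):
    """log10 P(data | genotype) for a diploid site under a binomial model.

    The chance that a read shows the variant base is assumed to be
    error_rate (hom-ref), 0.5 (het) or 1 - error_rate (hom-alt).
    """
    alt_prob = {"hom_ref": error_rate, "het": 0.5, "hom_alt": 1.0 - error_rate}
    log_comb = math.log10(math.comb(depth, alt_reads))
    return {
        genotype: log_comb
        + alt_reads * math.log10(p)
        + (depth - alt_reads) * math.log10(1.0 - p)
        for genotype, p in alt_prob.items()
    }

def log10_ratio_ref_vs_het(depth, alt_reads=0, error_rate=0.01):
    """How strongly the reads favour hom-ref over het (log10 likelihood ratio)."""
    ll = genotype_log10_likelihoods(depth, alt_reads, error_rate)
    return ll["hom_ref"] - ll["het"]

if __name__ == "__main__":
    # Support for "no variant" when zero variant reads are seen at various depths.
    for depth in (2, 5, 10, 20):
        print(depth, round(log10_ratio_ref_vs_het(depth), 3))
```

The ratio grows linearly with depth when no variant reads are seen, which is the same intuition as the (1/2)^n argument, but it also handles the case where a few divergent reads do show up.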
