Question

What is the minimum coverage adequate for variant calls from Whole Genome Sequencing data?

8.4 years ago

zhaoyang198691 ▴ 30

This is a question from my collaborator, who has not yet sent his samples for whole-genome sequencing. Does anyone know the minimum coverage needed for reliable variant calling from whole-genome sequencing data?

Thank you so much.

next-gen Whole Genome Seq coverage • 5.1k views

updated 8.4 years ago by apa@stowers ▴ 610 • written 8.4 years ago by zhaoyang198691 ▴ 30

score 2 · Answer 1 · 2017-01-09

8.4 years ago

Paul ★ 1.5k

If you are calling germline variants, this is an interesting article. Basically, your question is very hard to answer. For exome sequencing it would probably be somewhere between 30x and 50x. For WGS, see this and this interesting article.
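For planning a run around figures like the ones above, it can help to turn a target depth into a read count. The sketch below uses the usual approximation that mean coverage equals total sequenced bases divided by genome size. The genome size and read length are illustrative assumptions, not values from this thread, and the result is a mean: real depth varies along the genome and some reads will not map.

```python
import math


def mean_coverage(n_reads: int, read_len: int, genome_size: int) -> float:
    """Expected mean depth: total sequenced bases divided by genome size."""
    return n_reads * read_len / genome_size


def reads_needed(target_cov: float, read_len: int, genome_size: int) -> int:
    """Smallest number of reads whose total bases reach the target mean depth."""
    return math.ceil(target_cov * genome_size / read_len)


# Assumed values for illustration only (not taken from this thread).
HUMAN_GENOME_BP = 3_100_000_000  # approximate haploid human genome size
READ_LEN_BP = 150                # a common short-read length

print(reads_needed(30, READ_LEN_BP, HUMAN_GENOME_BP))            # 620000000
print(mean_coverage(620_000_000, READ_LEN_BP, HUMAN_GENOME_BP))  # 30.0
```

For paired-end data, count each mate as one read of `READ_LEN_BP` bases, so the number of read pairs is half the figure returned.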

score 2 · Answer 2 · 2017-01-10

Ditto what Paul said.

Organism, experimental design and algorithm can also have big effects. Some variant callers (like GATK) can leverage information across samples, which increases effective depth. Depending on what you're doing, you could also consider RNA-seq, which drastically increases coverage for the same number of reads, but you only see the genes...

If you are looking for causal variants, I have seen some approaches succeed with less than 10x depth, but these were in Danio, using large embryo pools, which boosts apparent heterozygosity and makes homozygous regions easier to detect (for instance SNPTrack).
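To get a rough feel for why depth matters so much for heterozygous calls, here is a minimal sketch. It assumes each read at a site carries the alternate allele independently with probability `alt_fraction` (0.5 for a simple diploid heterozygote), that the depth at the site is exactly `depth`, and that a caller needs at least `min_alt_reads` supporting reads. The threshold of 3 is a hypothetical value chosen for illustration, not a setting of GATK or any other caller, and the model ignores sequencing error, mapping bias and depth variation.

```python
from math import comb


def prob_alt_seen(depth: int, min_alt_reads: int = 3, alt_fraction: float = 0.5) -> float:
    """P(at least min_alt_reads reads carry the alt allele) under Binomial(depth, alt_fraction)."""
    # Sum the probability of seeing too few alt reads (0 .. min_alt_reads - 1),
    # capped at depth because more alt reads than total reads is impossible.
    p_too_few = sum(
        comb(depth, k) * alt_fraction**k * (1 - alt_fraction) ** (depth - k)
        for k in range(min(min_alt_reads, depth + 1))
    )
    return 1.0 - p_too_few


# Compare a few depths under the assumptions stated above.
for depth in (5, 10, 20, 30):
    print(depth, round(prob_alt_seen(depth), 4))
```

Changing `alt_fraction` gives a crude way to explore pooled samples, where the expected allele fraction is not 0.5.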