This is a question from my collaborator, who has not yet sent his samples for whole-genome sequencing. Does anyone know the minimum coverage needed for reliable variant calls from whole-genome sequencing data?
Thank you so much.
Ditto what Paul said.
Organism, experimental design, and algorithm can also have big effects. Some variant callers (like GATK) can leverage information across samples, which increases effective depth. Depending on what you're doing, you could also consider RNA-seq, which drastically increases coverage for the same number of reads, but you only see variants in expressed genes.
If you are looking for causal variants, I have seen some approaches succeed with < 10x depth, but these were in Danio (zebrafish), using large embryo pools, which boosts apparent heterozygosity and makes homozygous regions easier to detect (SNPTrack, for instance).
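To put rough numbers on "depth", here is a back-of-the-envelope sketch, not a recommendation. It assumes reads land uniformly at random, so per-site depth is approximately Poisson. Real data is less even, and RNA-seq depth in particular varies enormously with expression level. The genome size, expressed-target size, read count, and read length below are illustrative assumptions, not values from this thread.

```python
import math

def mean_depth(n_reads, read_len, target_bp):
    """Expected mean depth = total sequenced bases / size of the target."""
    return n_reads * read_len / target_bp

def frac_sites_at_least(k, depth):
    """Idealized fraction of sites covered by >= k reads.

    Models per-site depth as Poisson(depth): P(X >= k) = 1 - P(X <= k - 1).
    """
    p_below = sum(math.exp(-depth) * depth**i / math.factorial(i)
                  for i in range(k))
    return 1.0 - p_below

# All numbers below are hypothetical placeholders.
GENOME_BP = 3.1e9      # assumed genome size (roughly human-sized)
EXPRESSED_BP = 60e6    # assumed size of the expressed target for RNA-seq
N_READS = 400e6        # assumed number of reads
READ_LEN = 150         # assumed read length in bp

wgs = mean_depth(N_READS, READ_LEN, GENOME_BP)
rna = mean_depth(N_READS, READ_LEN, EXPRESSED_BP)
print(f"WGS mean depth:     {wgs:.1f}x")
print(f"RNA-seq mean depth: {rna:.1f}x (mean only; very uneven in practice)")

# How many sites reach a given minimum read count at 10x vs 30x mean depth?
for depth in (10, 30):
    frac = frac_sites_at_least(8, depth)
    print(f"{depth}x mean depth: {frac:.1%} of sites have >= 8 reads")
```

The threshold of 8 reads is arbitrary. Swap in whatever minimum your caller and design need, and keep in mind that pooled samples or joint calling across samples change what counts as "enough".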