6.5 years ago
Omics data mining
Dear all,
I am working on a project that deals with exome sequencing, and I have to call SNPs and copy number variants in tumor and matched normal samples. To get reliable results, what mean coverage should I aim for in the normal and the tumor samples?
I would appreciate any suggestions. Thank you in advance.
Archana
Hello archie!
Questions similar to yours can already be found at:
We have closed your question to allow us to keep similar content in the same thread.
If you disagree with this please tell us why in a reply below. We'll be happy to talk about it.
Cheers!
Seems to me the question is not how to get the mean coverage, but rather what an appropriate coverage would be for this analysis:
Re-opened.
See for example this paper for germline variation, in which they recommend 13x coverage for 95% sensitivity for heterozygous variants, if I remember correctly... https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4122774/
The results for your tumor samples will depend on the purity of your tumor tissue, I guess, but that's an area I'm not familiar with.
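To make the purity point a bit more concrete, here is a rough sketch (not taken from the paper above) of how tumor purity dilutes the expected variant allele fraction, and how that changes the chance of seeing enough supporting reads at a given depth. It assumes a simple binomial sampling model with no sequencing errors, a diploid normal genome, and a hypothetical caller rule of "at least `min_alt_reads` reads supporting the variant"; the function names and the threshold are my own assumptions for illustration.

```python
from math import comb


def expected_vaf(purity, mutated_copies=1, tumor_copy_number=2):
    """Expected variant allele fraction of a somatic variant.

    Assumes the contaminating normal cells are diploid and do not carry
    the variant (an idealised model, not a fitted one).
    """
    total_alleles = purity * tumor_copy_number + (1 - purity) * 2
    return purity * mutated_copies / total_alleles


def prob_at_least_k_alt_reads(depth, vaf, min_alt_reads):
    """P(at least min_alt_reads variant reads) under Binomial(depth, vaf)."""
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1 - p_fewer


if __name__ == "__main__":
    # Germline heterozygous variant: allele fraction 0.5.
    # Somatic heterozygous variant in a 40% pure tumor: allele fraction 0.2.
    for label, vaf in [("germline het", 0.5), ("tumor, 40% purity", expected_vaf(0.4))]:
        for depth in (13, 30, 50, 100):
            p = prob_at_least_k_alt_reads(depth, vaf, min_alt_reads=3)
            print(f"{label:18s} depth={depth:3d} vaf={vaf:.2f} P(>=3 alt reads)={p:.3f}")
```

Under these assumptions, a lower purity pushes the allele fraction down, so the same detection probability needs noticeably more depth in the tumor than in the matched normal. Real callers use more elaborate models, so treat the numbers only as a way to compare scenarios.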
Dear Wouter,
Thanks for your suggestion. I found an article that recommends a minimum of 30X coverage (which is the standard coverage for each sample in exome sequencing for reliable variant calling). Also, many people work with 50X or even 100X.
Thank you once again
A read-depth of 30 is the 'sweet spot' at which one can call single nucleotide variants, at least from my own empirical data derived from a clinical setting. Aiming for a higher overall depth of coverage in your sequencing therefore increases the likelihood that you will achieve >30 read-depth at each base position.
Depth of coverage profiles always vary, though, mainly due to GC content and, generally, DNA quality. Be aware also that some regions of the genome are virtually impossible to faithfully sequence with short-read NGS, such as CYP genes, certain exons of VWF, and other genes that have corresponding pseudogenes and/or gene duplications.
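As a back-of-the-envelope illustration of why a higher mean depth helps reach a read-depth of 30 at more positions, the sketch below treats per-base depth as Poisson-distributed around the mean. That is an assumption made only for illustration: as noted above, real exome coverage is more variable (GC content, DNA quality, capture efficiency), so these fractions are optimistic upper bounds. The function name is hypothetical.

```python
from math import exp


def fraction_at_or_above(min_depth, mean_depth):
    """P(depth >= min_depth) if per-base depth ~ Poisson(mean_depth)."""
    term = exp(-mean_depth)  # P(depth == 0)
    cumulative = 0.0
    for k in range(min_depth):
        cumulative += term             # add P(depth == k)
        term *= mean_depth / (k + 1)   # advance to P(depth == k + 1)
    return 1 - cumulative


if __name__ == "__main__":
    for mean_depth in (30, 50, 100):
        frac = fraction_at_or_above(30, mean_depth)
        print(f"mean depth {mean_depth:3d}x -> fraction of bases with >=30 reads: {frac:.4f}")
```

Even under this idealised model, a mean depth of 30 leaves roughly half of the positions below 30 reads, which is one reason people aim for 50X or 100X mean coverage.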