Question

Unknown Validation SNPs

0

11.1 years ago

juen85 • 0

Hi, I am working with SNP data stored in dbSNP, but I have some doubts about SNPs whose validation status is "unknown". The term "unknown" refers to SNPs that were found only once, so they could be real SNPs or just a consequence of sequencing error. For most of them, even though their validation status is unknown, there are several submitters (both research labs and consortia) and allele frequencies are reported. How should I treat these SNPs? Is it right to include them in an experiment? Thank you.

dbsnp • 2.0k views

updated 11.1 years ago by Vivek ★ 2.7k • written 11.1 years ago by juen85 • 0

score 0 · Answer 1 · 2013-10-23

0

11.1 years ago

gammyknee ▴ 210

I think the question is whether you have the capacity to include them. In my opinion, most of those SNPs are likely to be sequencing errors, but some could be genuinely rare variants and thus interesting for your analysis. If you are able to accommodate them, then you should probably include them.

score 0 · Answer 2 · 2013-10-24

It depends on your application, I guess. If you are supplying them as a known dataset to something like the GATK variant recalibration step, you could probably lower the confidence level assigned to the unknown SNPs so that it is reflected in your analysis.
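To give a rough feel for what lowering the confidence level means numerically: GATK's variant recalibration takes a Phred-scaled prior for each resource, which can be read as the probability that a site in that resource is real. Below is a minimal sketch of that conversion, assuming the standard Phred relation. The function name is made up for illustration and is not part of GATK.

```python
def phred_prior_to_probability(q):
    """Convert a Phred-scaled prior Q into the probability that a site is real.

    Uses the standard Phred relation P(error) = 10 ** (-Q / 10).
    Hypothetical helper for illustration only, not a GATK function.
    """
    return 1.0 - 10.0 ** (-q / 10.0)


# Compare a few priors: a lower Q means less trust in the resource.
for q in (15.0, 10.0, 2.0):
    print(f"Q{q:g} -> {phred_prior_to_probability(q):.4f}")
```

So a set of unknown-validation SNPs would be given a noticeably lower prior than a well-validated set. Which value is appropriate is a judgment call for your data.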

If you are instead using them to calculate concordance against a different SNP set, you could likely exclude them.
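If you go the exclusion route, one way to do it is to split the dbSNP VCF records by a validation flag in the INFO column. The sketch below assumes your dbSNP build marks validated records with a `VLD` INFO flag; check the `##INFO` header lines of your own file, since this varies between builds. The function name and the toy records are made up for illustration.

```python
def split_by_info_flag(vcf_lines, flag="VLD"):
    """Split VCF records into (with_flag, without_flag) based on an INFO key.

    `flag` defaults to "VLD", assumed here to be the dbSNP "validated" flag.
    Header lines (starting with '#') and blank lines are skipped.
    """
    with_flag, without_flag = [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # INFO is the 8th column; entries are ';'-separated KEY or KEY=VALUE.
        info_keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        if flag in info_keys:
            with_flag.append(line)
        else:
            without_flag.append(line)
    return with_flag, without_flag


# Toy records (not real dbSNP entries), just to show the behaviour.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\trs_example_a\tA\tG\t.\t.\tRS=1;VLD;VC=SNV",
    "1\t2000\trs_example_b\tC\tT\t.\t.\tRS=2;VC=SNV",
]
validated, unvalidated = split_by_info_flag(example)
print(len(validated), len(unvalidated))
```

For a real file you would pass an open file handle instead of the list, and use only the validated records for the concordance calculation.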