I have not done a survey of SNP callers, but I am wondering: do any of them assign confidence after annotating against dbSNP rather than before?
Intriguing idea. I don't know of such an approach being implemented in a functional, available SNP-caller. However, a couple points come to mind.
First, not all SNPs in dbSNP are of the same standard or confidence.
Second, the SNP caller envisioned in the question would need to take population into account: a called SNP might have a non-zero minor allele frequency (MAF) in South Asians, for example, but zero MAF in Africans (or in one population from Africa). Think of the estimates of Neandertal ancestry in Homo sapiens genomes: ~4% in non-Africans, essentially 0% in Africans.
Third, applying such an approach to admixed populations would be complex, to say the least.
So, perhaps all I've done is outline some concerns or constraints for algorithm design. Maybe if there is such a SNP-caller out there, these are a couple points to apply to it as a test of its robustness.
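To make the population concern concrete, here is a minimal Python sketch of how a caller might turn a population-specific MAF into genotype priors. It assumes Hardy-Weinberg proportions, which is my simplification and not something any tool in this thread is known to do. The function name `genotype_priors`, the `floor` parameter, and the MAF values are all hypothetical.

```python
def genotype_priors(maf, floor=1e-3):
    """Return (hom-ref, het, hom-alt) priors from a minor allele frequency.

    Assumes Hardy-Weinberg proportions. `floor` is a hypothetical safeguard
    so that a population with MAF = 0 does not rule the variant out entirely.
    """
    q = max(maf, floor)   # alternate (minor) allele frequency
    p = 1.0 - q           # reference allele frequency
    return (p * p, 2 * p * q, q * q)


# Illustrative numbers only: one site with very different MAFs per population.
site_maf = {"south_asian": 0.12, "african": 0.0}
priors = {pop: genotype_priors(maf) for pop, maf in site_maf.items()}

for pop, (hom_ref, het, hom_alt) in priors.items():
    print(f"{pop}: hom-ref={hom_ref:.4f} het={het:.4f} hom-alt={hom_alt:.6f}")
```

The same site gets a much weaker prior in the population where the variant has not been observed, which is exactly why the choice of reference population matters. Admixed samples would need some mixture of these priors, and this sketch does not attempt that.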
I believe SOAPsnp can do this. MAQ actually has such a sub-tool, but I abandoned it back in 2008. I think this strategy does more harm than good.
A quick literature search found: http://bioinformatics.oxfordjournals.org/content/25/1/6.long
Apparently the new software they have is called SliderII: http://www.bcgsc.ca/platform/bioinfo/software/SliderII
You are able to use known SNPs as priors to inform the SNP calling. I haven't read much into it yet. This will probably go on my huge to-read pile...
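For readers wondering what "known SNPs as priors" could look like in practice, here is a rough Python sketch. It is not how SliderII, SOAPsnp, or MAQ actually work; it is a generic Bayesian genotype call with a binomial read model. The prior values, the `error_rate`, and the function name `genotype_posteriors` are illustrative assumptions of mine.

```python
from math import comb


def genotype_posteriors(n_alt, depth, priors, error_rate=0.01):
    """Posterior over (hom-ref, het, hom-alt) given alt-read count and priors.

    Uses a simple binomial likelihood; `error_rate` is an assumed per-base
    sequencing error rate, not a value taken from any real caller.
    """
    # Expected fraction of alt reads under each genotype.
    alt_fraction = (error_rate, 0.5, 1.0 - error_rate)
    likelihoods = [
        comb(depth, n_alt) * f ** n_alt * (1.0 - f) ** (depth - n_alt)
        for f in alt_fraction
    ]
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical priors: a novel site versus a site already listed in dbSNP.
NOVEL_PRIOR = (0.998, 0.001, 0.001)
KNOWN_PRIOR = (0.80, 0.18, 0.02)

# Same evidence (3 alt reads out of 10), two different priors.
novel = genotype_posteriors(3, 10, NOVEL_PRIOR)
known = genotype_posteriors(3, 10, KNOWN_PRIOR)
print(f"P(het | novel site) = {novel[1]:.3f}")
print(f"P(het | known site) = {known[1]:.3f}")
```

With identical read evidence, the known site ends up with a higher heterozygous posterior than the novel one. That is the whole effect of the prior, and also the source of the bias others in this thread are worried about.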
SliderII looks interesting, but it relies on a reference, which may or may not come from a specific population. Because the tool depends on that reference to supply the priors for SNP calling, applying it to a "new" population, whether human or another species, may be difficult.
I am interested in this too. So far I have not been able to find one.
SOAPsnp can do that, I think.
I guess there could be some problems in doing what you suggest. How much evidence from the data would you be ready to throw away in order not to disagree with dbSNP? And would the existing, less biased filters (e.g. base quality, uniqueness of the mapping, observing the variant on both strands, etc.) not already be sufficient? And if they are not sufficient, should one not rethink the experimental design (say, sequence fewer regions at higher coverage) instead of using very informative priors? I am genuinely unsure; I am not claiming that what you suggest is unsuitable.
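As a point of comparison, the prior-free filters listed above could be sketched roughly like this in Python. The `VariantEvidence` fields and every threshold are hypothetical placeholders rather than recommended values, and mapping quality is used here as a stand-in for uniqueness of the mapping.

```python
from dataclasses import dataclass


@dataclass
class VariantEvidence:
    """Per-site summary of the reads supporting a candidate variant."""
    mean_base_quality: float      # mean Phred quality of the alt bases
    mean_mapping_quality: float   # proxy for uniqueness of the mapping
    alt_reads_forward: int        # alt-supporting reads on the forward strand
    alt_reads_reverse: int        # alt-supporting reads on the reverse strand


def passes_filters(ev, min_base_q=20.0, min_map_q=30.0, min_per_strand=1):
    """Apply hard filters that use no knowledge of dbSNP at all.

    All thresholds are assumed defaults for illustration only.
    """
    return (
        ev.mean_base_quality >= min_base_q
        and ev.mean_mapping_quality >= min_map_q
        and ev.alt_reads_forward >= min_per_strand
        and ev.alt_reads_reverse >= min_per_strand
    )


good = VariantEvidence(32.0, 55.0, alt_reads_forward=4, alt_reads_reverse=3)
one_strand = VariantEvidence(32.0, 55.0, alt_reads_forward=7, alt_reads_reverse=0)
print(passes_filters(good), passes_filters(one_strand))
```

Filters like these treat known and novel sites identically, which is what makes them less biased than an informative prior.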
I am saying that common sense would dictate a lower bar for calling SNPs that we know exist in our population of interest than for novel SNPs, sometimes called SNVs. As Larry and others have mentioned, the "population of interest" is certainly up for debate.