Dear all,
I'm currently working on a project involving SNP calling of resistance genes (R-genes) in 96 potato cultivars. We are interested in identifying SNPs (and INDELs) in the NB domain. The NB domains were enriched using PCR and Illumina paired-end read libraries were created for each cultivar. After quality checking (adapter trimming and read quality trimming) the reads were mapped against the potato DM v4.03 genome using NextGenMap. SNP calling is (to be) performed on the known NB-region coordinates.
Now for my question: how should we do the SNP calling? Potato is a tetraploid organism, so in theory samtools mpileup should not recover all SNPs/alleles, because (correct me if I'm wrong) samtools is designed for diploid organisms. After a Google search I found three SNP callers (QualitySNPng, freebayes and UNEAK) that should be able to call SNPs in polyploid organisms. Does anyone in the community have experience with polyploid (tetraploid) SNP calling? Is there a recommended SNP caller, or do they all behave similarly? Or should we perhaps keep only the SNPs that are called by multiple SNP callers?
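To make the "only keep SNPs called by multiple SNP callers" idea concrete, here is a minimal sketch in Python. It assumes each caller's output has already been converted to tab-separated VCF-style lines (CHROM, POS, ID, REF, ALT, ...); the function names `parse_vcf_sites` and `consensus_sites` and the toy records are made up for illustration and do not come from any of the tools mentioned.

```python
from collections import Counter


def parse_vcf_sites(lines):
    """Return the set of (chrom, pos, ref, alt) keys found in VCF-style lines."""
    sites = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # Split multi-allelic records so each alternative allele is compared separately.
        for alt in alts.split(","):
            if alt != ".":
                sites.add((chrom, pos, ref, alt))
    return sites


def consensus_sites(call_sets, min_callers=2):
    """Keep the sites reported by at least `min_callers` of the given call sets."""
    counts = Counter()
    for sites in call_sets:
        counts.update(sites)
    return {site for site, n in counts.items() if n >= min_callers}


# Toy records (hypothetical coordinates), one list per caller.
caller_a = ["#CHROM\tPOS\tID\tREF\tALT", "chr01\t100\t.\tA\tG", "chr01\t250\t.\tC\tT,G"]
caller_b = ["chr01\t100\t.\tA\tG", "chr01\t250\t.\tC\tT"]
caller_c = ["chr01\t400\t.\tG\tA"]

shared = consensus_sites([parse_vcf_sites(c) for c in (caller_a, caller_b, caller_c)])
print(sorted(shared))
```

Note that this compares records by exact position and alleles, which is probably fine for SNPs. INDELs can be written differently by different callers, so they would likely need to be normalized before this kind of comparison.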
Furthermore, we are uncertain about which parameters to use in SNP calling and filtering. The major problem is that we do not know when we can actually call a SNP: what allele fraction should we use? Or should we call a SNP if it has at least x (high-quality) reads supporting it? And what quality score (QUAL as given in the VCF output format, or any other quality measure) is high enough to call a SNP with high confidence?
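To illustrate why the allele fraction question is different in a tetraploid: a variant can be present in 0, 1, 2, 3 or 4 of the four copies, so the expected alternative-allele fractions are 0, 0.25, 0.5, 0.75 and 1 rather than the 0, 0.5 and 1 of a diploid. The sketch below picks the most likely dosage from read counts with a simple binomial model. The per-base error rate of 0.01 and the function names are assumptions for illustration only, and this is not how any particular SNP caller works internally.

```python
from math import comb, log


def dosage_log_likelihoods(alt_reads, total_reads, ploidy=4, error_rate=0.01):
    """Binomial log-likelihood of the read counts for each dosage 0..ploidy."""
    log_likelihoods = []
    for dosage in range(ploidy + 1):
        true_fraction = dosage / ploidy
        # Probability that a single read shows the alternative allele,
        # allowing for sequencing errors in both directions.
        p_alt = true_fraction * (1 - error_rate) + (1 - true_fraction) * error_rate
        ll = (
            log(comb(total_reads, alt_reads))
            + alt_reads * log(p_alt)
            + (total_reads - alt_reads) * log(1 - p_alt)
        )
        log_likelihoods.append(ll)
    return log_likelihoods


def most_likely_dosage(alt_reads, total_reads, ploidy=4, error_rate=0.01):
    """Return the dosage (number of alternative copies) with the highest likelihood."""
    lls = dosage_log_likelihoods(alt_reads, total_reads, ploidy, error_rate)
    return max(range(len(lls)), key=lls.__getitem__)


# Toy read counts at a depth of 40.
for alt in (0, 10, 20, 30, 40):
    print(alt, "alt reads of 40 -> dosage", most_likely_dosage(alt, 40))
```

At low depth a single-copy variant (expected fraction 0.25) is hard to separate from sequencing errors, so a minimum-depth requirement and an allele-fraction threshold probably need to be chosen together rather than independently.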
So far we haven't been able to find any satisfying answers to these questions and are therefore uncertain how to proceed. Thanks in advance to anyone taking the time to read this and to anyone willing to help us with our problems.
Hi, I was wondering if you found answers to your questions?
It would be very useful, as I am also working with a tetraploid species. Many thanks
Unfortunately I cannot provide you with the answers. I have not been part of this project for almost 4 years now and have not continued in any fields related to tetraploid SNP calling.