Our group is trying to find new genomic variants associated with a certain phenotypic trait. To do so, we performed DNA capture of regions of interest (~5Mb) on ~10 individuals for each of the populations of interest. We then pooled the DNA and performed two separate sequencing runs.
Now, I've aligned the reads with bowtie2 and estimated allelic frequencies using mpileup from SAMtools. Considering that I have no haplotypes, what would be the most appropriate software/statistical test to find associations between genomic variants and the phenotype?
Fst, a chi-squared test, or Fisher's exact test, applied per site to the allele counts of each population.
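To make that a bit more concrete, here is a minimal sketch of two of those tests for a single site, using only the Python standard library. The read counts (25 of 100 and 12 of 100) are hypothetical, chosen only to match the 25% vs. 12% example in this thread, and the function names are mine. The Fst here is a simple heterozygosity-based estimate that assumes two equally weighted populations and applies no sampling correction. Keep in mind that with pooled DNA the reads are not independent draws of chromosomes (about 10 diploid individuals means about 20 chromosomes per pool), so p-values computed on raw read counts are likely to be too optimistic.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows are populations; columns are alternate / reference allele counts.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one
    # (small relative tolerance guards against floating-point ties)
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def fst_two_pops(p1, p2):
    """Heterozygosity-based Fst for one biallelic site in two populations.

    Assumption: equal weighting of both populations, no sample-size correction.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                      # pooled expected heterozygosity
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    return (h_total - h_within) / h_total if h_total > 0 else 0.0


# Hypothetical read counts at one site: (alt, ref) per population
pop_a = (25, 75)
pop_b = (12, 88)

p_value = fisher_exact_two_sided(pop_a[0], pop_a[1], pop_b[0], pop_b[1])
fst = fst_two_pops(0.25, 0.12)
print(f"Fisher p-value: {p_value:.4g}, Fst: {fst:.4f}")
```

In practice you would run this for every site and correct for multiple testing. Tools written specifically for pooled data may handle the sampling issues above better than this sketch does.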
"Considering that I have no haplotypes" -- Forgive my ignorance, but what are your criteria for differentiating haplotypes from all other genetic variants?
What I mean is that I have population-specific variant frequencies (for example, I know that at position X of chromosome Y, 25% of population A has a C->T SNP while only 12% of population B has the same SNP), but I have no haplotypes (i.e., given loci A, B and C with alleles Aa, Bb and Cc, I cannot say which proportion of a subpopulation is abc, which is AbC, which is aBC, etc.).
Is that any clearer?
Yes, thanks... I've just always considered haplotypes to be genetic variants, so I was curious how you defined them in the context of your question.