Question

Most Appropriate Analysis For Population Specific Genotype Frequencies

0

12.4 years ago

Eric Fournier ★ 1.4k

Our group is trying to find new genomic variants associated with a certain phenotypic trait. To do so, we performed DNA capture of regions of interest (~5Mb) on ~10 individuals for each of the populations of interest. We then pooled the DNA and performed two separate sequencing runs.

I've aligned the reads with Bowtie 2 and estimated allele frequencies using SAMtools mpileup. Given that I have no haplotypes, what would be the most appropriate software or statistical test for finding associations between genomic variants and the phenotype?

association sequencing genotyping • 2.6k views

ADD COMMENT • link 12.3 years ago by Eric Fournier ★ 1.4k

4

Fst, chi-squared test, Fisher's exact test.

ADD REPLY • link 12.4 years ago by Zev.Kronenberg 12k
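
To illustrate one of the tests suggested above, here is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table of allele counts (alternate vs. reference) from two pools. The function name and the example counts are hypothetical, not taken from the thread. Note the assumption baked in: the counts are treated as independent allele observations, which pooled read counts are not, since each pool here holds only ~10 individuals. The p-value is therefore likely to be anti-conservative if raw read depths are plugged in directly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are pools, columns are alleles, e.g.
    a, b = alt and ref counts in pool A; c, d = alt and ref counts in pool B.
    Assumption: counts behave like independent allele observations.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def table_prob(x):
        # Hypergeometric probability of seeing x alt alleles in pool A
        # with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = table_prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    # Sum the probabilities of every table that is no more likely than the
    # observed one; the small relative tolerance guards against float ties.
    p_value = sum(
        p
        for p in (table_prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts: 25 alt / 75 ref in pool A, 12 alt / 88 ref in pool B.
    print(fisher_exact_two_sided(25, 75, 12, 88))
```

If SciPy is available, `scipy.stats.fisher_exact` computes the same test. Because this would be run at every variant site, some multiple-testing correction would also be needed.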

0

"Considering that I have no haplotypes" -- Forgive my ignorance, but what are your criteria for differentiating haplotypes from all other genetic variants?

ADD REPLY • link 12.4 years ago by Josh Herr 5.8k

0

What I mean is that I have population-specific variant frequencies (for example, I know that at position X of chromosome Y, 25% of population A has a C->T SNP while only 12% of population B has the same SNP), but I have no haplotypes (i.e., given loci A, B and C with alleles Aa, Bb and Cc, I cannot say what proportion of a subpopulation is abc, what proportion is AbC, what proportion is aBC, etc.).

Is that any clearer?

ADD REPLY • link 12.4 years ago by Eric Fournier ★ 1.4k
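
As a rough illustration of what per-site frequencies alone allow, the sketch below computes a simple two-population Fst from heterozygosities, Fst = (H_T - H_S) / H_T. It assumes the 25% and 12% figures above can be read as allele frequencies, that both populations are weighted equally, and it applies no correction for pool size or sequencing depth. The function name is hypothetical.

```python
def fst_two_populations(p1, p2):
    """Simple Fst for one biallelic site from two allele frequencies.

    Assumptions: equal weighting of both populations and no
    correction for sample size or sequencing depth.
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity in the combined population.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within each population.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        # The site is fixed for the same allele in both populations.
        return 0.0
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Frequencies from the example above, read as allele frequencies.
    print(round(fst_two_populations(0.25, 0.12), 4))  # 0.028
```

No haplotype information is needed for this: each site is scored independently from its two frequencies.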

0

Yes, thanks... I've just always considered haplotypes to be genetic variants, so I was curious how you defined them in the context of your question.

ADD REPLY • link 12.4 years ago by Josh Herr 5.8k

score 0 · Answer 1 · 2013-01-07

0

12.3 years ago

Eric Fournier ★ 1.4k

I've found that the article "Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples" and the software EM-SNP gave me all the information I needed to decide how best to analyze my data.

ADD COMMENT • link 12.3 years ago by Eric Fournier ★ 1.4k