Hello,
I am interested in traces of positive selection in my population data. I have been able to calculate nucleotide diversity (Pi) and Watterson's Theta for synonymous and non-synonymous sites for every gene in my genome.
The problem is that I am a bit lost as to how to look for positive/negative selection. I do not really understand what these values, Pi and Theta, represent. I have seen literature where Pi(A) is divided by Pi(S), which is sort of like dN/dS. If the ratio is bigger than 1, can we then infer positive selection?
Thanks for any help,
Adrian
This is very interesting, thanks for pointing me in the right direction.
I understand measuring Fst values is a powerful way of identifying genes under diversifying selection. After computing Fst values, is there any way to determine which genes are significantly impacted? Is it possible to use a statistical test to determine which genes are evolving significantly more quickly than others? I have 9 population samples, so Fst is computed pairwise between any 2 populations.
I may be a bit late to be of help here, but the way I have done this in the past is to calculate Fst on a SNP-by-SNP basis between each possible population pair, and then see which SNPs fall into the upper tail of the distribution of Fst scores (say the highest 1%; these are the strongest candidates for positive selection). From there you can figure out which genes these SNPs fall into relatively easily using an R package called NCBI2R. If you want to look for functional trends, you can then run the gene list you get from NCBI2R through a GO term overrepresentation test like GOrilla (web-based and free; it also uses FDR instead of the overly conservative Bonferroni correction).
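In case it helps to see the outlier approach spelled out, here is a minimal Python sketch. It assumes you already have an allele frequency per biallelic SNP per population, and it uses a simple Wright-style Fst (expected heterozygosity, equal population weights) rather than a sample-size-corrected estimator such as Weir and Cockerham's. The function names (snp_fst, pairwise_fst, top_tail) are made up for illustration and do not come from any package.

```python
from itertools import combinations


def snp_fst(p1, p2):
    """Simple Fst for one biallelic SNP from allele frequencies in two populations.

    Assumption: equal population weights and no sample-size correction.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0  # SNP is monomorphic in both populations
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


def pairwise_fst(freqs):
    """freqs: {population: {snp_id: allele_freq}}.

    Returns {(pop_a, pop_b): {snp_id: fst}} for every population pair,
    using only SNPs present in both populations.
    """
    result = {}
    for pop_a, pop_b in combinations(sorted(freqs), 2):
        shared = freqs[pop_a].keys() & freqs[pop_b].keys()
        result[(pop_a, pop_b)] = {
            snp: snp_fst(freqs[pop_a][snp], freqs[pop_b][snp]) for snp in shared
        }
    return result


def top_tail(scores, fraction=0.01):
    """Return the SNP ids in the top `fraction` of scores (at least one SNP)."""
    n_keep = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_keep]
```

With 9 populations, pairwise_fst gives 36 population pairs, and you would call top_tail on each pair's scores to get the candidate SNPs. Keep in mind that an empirical tail cutoff only ranks SNPs relative to the rest of the genome; it flags candidates and is not a formal significance test.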