Dear all,
We are looking for the best software to simulate haplotype structure and allele frequencies in laboratory populations for which pedigree information is available. The real data we have been using are genome-wide SNP data.
We study signatures of positive selection in laboratory populations that have been experimentally selected for several generations. To distinguish signals of positive selection from patterns created by genetic drift, I believe it is common to run simulations under a neutral model that follows the same demographic parameters as the studied population and assumes there has been no positive selection. One can then apply the same tests used to detect selection in the real population to the simulated output, and see whether neutrality alone can produce the signatures of positive selection detected in the study population. Any strong signals of selection (e.g. excessive LD) seen only in the study population and not in the neutral simulations are then considered more likely to be true signals.
My question is: what simulation software would be appropriate for these simulations? MaCS, a newer and faster coalescent simulator in the style of ms, looks good, but perhaps a forward simulator would be more appropriate? I need to model multiple populations of roughly 100 individuals for around 100 generations, and it would be useful if I could specify the starting amount of LD or genetic variation in each population, so that I can see what effect a lack of genetic diversity at the start of the experiment has on the outcome. I know these are quite specific requirements, but any specific or general comments would be most appreciated.
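To make the idea concrete, here is a minimal sketch of the kind of neutral forward simulation I have in mind, written in plain Python. It is only an assumption about how one might set this up, not the method of any particular simulator: it follows a single biallelic SNP under Wright-Fisher drift, so it ignores linkage, recombination, haplotype structure and the pedigree. The function names (`simulate_drift`, `neutral_null`, `empirical_p_value`) and the default values are hypothetical, chosen to match the 100 individuals and 100 generations mentioned above.

```python
import random


def simulate_drift(p0, n_individuals=100, n_generations=100, rng=None):
    """Allele frequency trajectory of one neutral biallelic SNP.

    Assumes a diploid Wright-Fisher population of constant size with
    random mating and no selection, mutation or migration.
    """
    rng = rng or random.Random()
    n_copies = 2 * n_individuals  # diploid: two gene copies per individual
    p = p0
    trajectory = [p]
    for _ in range(n_generations):
        # Binomial sampling of 2N gene copies from the parental frequency.
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        trajectory.append(p)
    return trajectory


def neutral_null(p0, n_replicates=200, n_individuals=100,
                 n_generations=100, seed=1):
    """Absolute allele frequency change in each neutral replicate."""
    rng = random.Random(seed)
    changes = []
    for _ in range(n_replicates):
        final = simulate_drift(p0, n_individuals, n_generations, rng)[-1]
        changes.append(abs(final - p0))
    return changes


def empirical_p_value(observed_change, null_changes):
    """Share of neutral replicates with a change at least as large as observed.

    The +1 in numerator and denominator keeps the estimate above zero.
    """
    exceed = sum(1 for x in null_changes if x >= abs(observed_change))
    return (exceed + 1) / (len(null_changes) + 1)


if __name__ == "__main__":
    # Hypothetical example: a SNP starting at frequency 0.2 that changed by 0.5.
    null = neutral_null(p0=0.2, n_replicates=50)
    print(empirical_p_value(0.5, null))
```

Varying `p0` across SNPs would be one crude way to look at the effect of low starting diversity, but LD-based tests would need a simulator that tracks whole haplotypes with recombination.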
Many thanks,
Rubal
Good question! The pedigree info makes this tricky, as family structure is often ignored in simulations.
Yes, I was thinking a forward simulator might be the best hope for making use of the pedigree information, but I am not sure whether this is technically feasible with the current state of the art. I was thinking of approximating it by using inbreeding coefficients estimated directly from the SNP data and/or from the pedigree; I've heard some simulators can take inbreeding coefficients as input. Alternatively, perhaps I simply don't use the pedigree information in the simulations.
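For the pedigree route, this is a rough sketch of how inbreeding coefficients could be computed with the standard tabular (kinship matrix) method. It assumes founders are unrelated and not inbred, and that every parent is listed before its offspring. The function name and the `(id, sire, dam)` input layout are my own hypothetical choices, not the input format of any simulator.

```python
def inbreeding_coefficients(pedigree):
    """Inbreeding coefficient F for each individual in a pedigree.

    pedigree: list of (id, sire, dam) tuples, parents listed before their
    offspring; use None for an unknown parent (treated as an unrelated,
    non-inbred founder).
    """
    index = {row[0]: i for i, row in enumerate(pedigree)}
    n = len(pedigree)
    kin = [[0.0] * n for _ in range(n)]  # pairwise kinship coefficients
    result = {}
    for i, (ind, sire, dam) in enumerate(pedigree):
        s = index.get(sire)  # None if the parent is unknown
        d = index.get(dam)
        for parent in (s, d):
            if parent is not None and parent >= i:
                raise ValueError(f"parent of {ind!r} must be listed before it")
        # F of an individual equals the kinship between its two parents.
        f = kin[s][d] if s is not None and d is not None else 0.0
        result[ind] = f
        kin[i][i] = 0.5 * (1.0 + f)
        for j in range(i):
            # Kinship with an earlier individual: average over the two parents.
            k = 0.0
            if s is not None:
                k += 0.5 * kin[s][j]
            if d is not None:
                k += 0.5 * kin[d][j]
            kin[i][j] = kin[j][i] = k
    return result


if __name__ == "__main__":
    # Hypothetical pedigree: E is the offspring of full sibs C and D.
    ped = [("A", None, None), ("B", None, None),
           ("C", "A", "B"), ("D", "A", "B"), ("E", "C", "D")]
    print(inbreeding_coefficients(ped)["E"])  # 0.25
```

The kinship matrix grows with the square of the number of individuals, which should be manageable for populations of about 100 kept for about 100 generations, but I have not tried it on real data.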
You should contact my colleague Chao Lai, who is not registered on this site, but who has thought about pedigrees and haplotypes. Send him the text above to chao dot lai at tufts dot edu.