Question

Interpreting XPEHH output

10.6 years ago

Rubal ▴ 350

Hello Everyone,

I would appreciate the opinions of anyone with experience running XPEHH. I have run this haplotype-based selection test on phased whole-genome SNP data to compare two recently split non-human populations. I am trying to detect regions of population-specific selection.

I expected to see regions with a peak XPEHH score flanked by a decay in the score as linkage breaks down. However, I occasionally see very sharp peaks in the XPEHH score that are only a Mb in length (e.g. the peak at about 86 Mb in the figure linked below).

http://postimg.org/image/4ns7d9wyb/

Do people have suggestions about how to interpret these sharp peaks? My first thought is that they are the result of some kind of error in SNP calling. They could come from a population-specific recombination hotspot, but the populations only split ~100 generations ago, so this seems unlikely. Any thoughts or questions are welcome regarding how to interpret such a plot when looking for selection with XPEHH.

Thanks in advance for your help,

Best regards,
Rubal

selection snps haplotype xpehh genome • 4.7k views

ADD COMMENT • link updated 3.2 years ago by Ram 44k • written 10.6 years ago by Rubal ▴ 350

Interesting question, thank you! Can you check the minor allele frequency (MAF) and derived allele frequency (DAF) of that SNP? My first guess is that the minor allele is different in the two populations.

ADD REPLY • link 10.6 years ago by Giovanni M Dall'Olio 28k

Popn1 is fixed for the A allele, and Popn2 has the A allele at 15% frequency.
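For anyone who wants to run the same check across many SNPs, here is a minimal sketch of the comparison, not something taken from any XPEHH tool. It assumes each population's phased alleles at one SNP are available as a plain list of strings. The function names, the sample size of 20 haplotypes, and the second allele "G" are all hypothetical; only the A allele and its frequencies come from this thread.

```python
from collections import Counter


def allele_freqs(haplotypes):
    """Return {allele: frequency} for one SNP from a list of phased alleles."""
    counts = Counter(haplotypes)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def minor_allele(freqs, alleles):
    """Return the less frequent allele; an allele that is absent counts as 0."""
    return min(alleles, key=lambda a: freqs.get(a, 0.0))


def compare_minor_alleles(pop1, pop2, alleles=("A", "G")):
    """Report the minor allele in each population and whether they differ."""
    m1 = minor_allele(allele_freqs(pop1), alleles)
    m2 = minor_allele(allele_freqs(pop2), alleles)
    return {"pop1_minor": m1, "pop2_minor": m2, "differs": m1 != m2}


# Frequencies from this thread; 20 haplotypes and the "G" allele are assumptions.
pop1 = ["A"] * 20               # fixed for A
pop2 = ["A"] * 3 + ["G"] * 17   # A at 15%
print(compare_minor_alleles(pop1, pop2))
# {'pop1_minor': 'G', 'pop2_minor': 'A', 'differs': True}
```

With these frequencies the minor allele is indeed different in the two populations, so any step that labels alleles per population rather than against a shared reference would disagree at this SNP.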

ADD REPLY • link updated 4.9 years ago by Ram 44k • written 10.6 years ago by Rubal ▴ 350

I guess that for the neighboring SNPs the situation is the opposite: Pop1 has low frequencies for the minor alleles, while Pop2 has higher frequencies. I think the peak is due to a problem with how the minor allele is defined in one of the populations. Consider that if you used the other allele as the minor allele, that SNP would have a score of about -1.5, and the peak would not look so isolated.

ADD REPLY • link 10.6 years ago by Giovanni M Dall'Olio 28k

Ah, good point, I will look into that. This makes a lot of sense. Since XPEHH is a haplotype-based test, shouldn't it avoid problems from this kind of inappropriate labeling?

ADD REPLY • link 10.6 years ago by Rubal ▴ 350

Ram · Answer 1 · 2014-04-29

10.6 years ago

Zev.Kronenberg 12k

You need to zoom in on the sharp peak. How many SNPs fall within that high-scoring region? This could be a problem with phasing.
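One way to answer that without reading it off the plot is to scan the per-SNP scores for runs above a cutoff and count the SNPs in each run. The sketch below is only an illustration: it assumes the scores have already been loaded as (position, score) pairs sorted by position, and the function names and the cutoff are hypothetical choices, not part of any XPEHH program.

```python
def summarize_run(run):
    """Summarize one consecutive run of (position, score) pairs."""
    positions = [pos for pos, _ in run]
    return {
        "start": positions[0],
        "end": positions[-1],
        "n_snps": len(run),
        "max_score": max(score for _, score in run),
    }


def find_peaks(snps, threshold):
    """Group consecutive SNPs scoring at or above threshold into peaks.

    snps: list of (position_bp, score) tuples sorted by position.
    Returns one summary dict per run of consecutive high-scoring SNPs.
    """
    peaks, run = [], []
    for pos, score in snps:
        if score >= threshold:
            run.append((pos, score))
        else:
            if run:
                peaks.append(summarize_run(run))
            run = []
    if run:  # close a run that reaches the end of the data
        peaks.append(summarize_run(run))
    return peaks
```

A peak supported by only a handful of SNPs, especially if its neighbors score near zero, is worth inspecting for phasing or genotyping problems before it is read as a selection signal.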

ADD COMMENT • link updated 4.9 years ago by Ram 44k • written 10.6 years ago by Zev.Kronenberg 12k

That peak is composed of 5 SNPs.

ADD REPLY • link updated 4.9 years ago by Ram 44k • written 10.6 years ago by Rubal ▴ 350