Interpreting XPEHH output
10.6 years ago
Rubal ▴ 350

Hello Everyone,

I would appreciate the opinions of anyone with experience running XPEHH. I have run this haplotype-based selection test on phased whole-genome SNP data to compare two recently split non-human populations. I am trying to detect regions of population-specific selection.

I expected to see regions with a peak XPEHH score flanked by a decay in the score as linkage breaks down. However, I occasionally see very sharp peaks in the XPEHH score that are only about a megabase in length (e.g., the peak at about 86 Mb in the figure linked below):

http://postimg.org/image/4ns7d9wyb/

Do people have suggestions about how to interpret these sharp peaks? My first thought is that they result from some kind of error in SNP calling. They might instead reflect a population-specific recombination hotspot, but the populations split only ~100 generations ago, so this seems unlikely. Any thoughts or questions are welcome regarding how to interpret such a plot when looking for selection with XPEHH.

Thanks in advance for your help,

Best regards,
Rubal

selection snps haplotype xpehh genome • 4.7k views
Interesting question, thank you! Can you check the minor allele frequency (MAF) and derived allele frequency (DAF) of that SNP? My first guess is that the minor allele is different in the two populations.

Popn1 is fixed for the A allele, and Popn2 has the A allele at 15% frequency.


I guess that for the neighboring SNPs the situation is the opposite: Popn1 has a low frequency for the minor alleles, while Popn2 has higher frequencies. I think the peak is due to a problem with the definition of which allele is the minor allele in one of the populations. Consider that if you used the other allele as the minor allele, that SNP would have a score of about -1.5, and the peak would not look so isolated.
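A minimal sketch of the check suggested above: given the frequency of one fixed reference allele at each SNP in both populations, flag the SNPs where the minor allele is not the same allele in the two populations. The function names and the example frequencies (apart from the 100% and 15% values quoted in this thread) are hypothetical, and the sketch treats a frequency below 0.5 as "minor".

```python
def ref_is_minor(ref_freq):
    """Return True when the reference allele is the minor allele (frequency below 0.5)."""
    return ref_freq < 0.5


def flag_minor_allele_mismatch(freqs_pop1, freqs_pop2):
    """Flag SNPs whose minor allele differs between two populations.

    freqs_pop1 and freqs_pop2 are equal-length sequences holding the
    frequency of the SAME reference allele at each SNP in each population.
    """
    if len(freqs_pop1) != len(freqs_pop2):
        raise ValueError("frequency lists must have the same length")
    return [
        ref_is_minor(f1) != ref_is_minor(f2)
        for f1, f2 in zip(freqs_pop1, freqs_pop2)
    ]


# First SNP uses the values from this thread (A allele: fixed in Popn1, 15% in Popn2);
# the other two SNPs are made-up neighbors for illustration only.
popn1_freqs = [1.0, 0.1, 0.8]
popn2_freqs = [0.15, 0.3, 0.9]
mismatch_flags = flag_minor_allele_mismatch(popn1_freqs, popn2_freqs)
```

SNPs flagged `True` are the ones worth inspecting by hand before interpreting their scores.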


Ah, good point, I will look into that. This makes a lot of sense. As XPEHH is a haplotype-based test, shouldn't it avoid problems from this kind of inappropriate labeling?
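As background for this question, here is the statistic in its commonly used log-ratio form. This is a sketch under assumptions: the thread does not say which software produced the scores, and the symbols are notation chosen here. $iHH_k$ is the integrated extended haplotype homozygosity around the core SNP in population $k$, and $\mu$ and $\sigma$ are the genome-wide mean and standard deviation of the unstandardized scores.

```latex
\mathrm{XPEHH}_{\text{unstd}} = \ln\!\left(\frac{iHH_{1}}{iHH_{2}}\right),
\qquad
\mathrm{XPEHH} = \frac{\mathrm{XPEHH}_{\text{unstd}} - \mu}{\sigma}
```

In this form, swapping the two populations flips the sign of the score, since $\ln(iHH_2/iHH_1) = -\ln(iHH_1/iHH_2)$. Whether a particular implementation also depends on how alleles are labeled is worth checking in its documentation.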

10.6 years ago

You need to zoom in on the sharp peak. How many SNPs fall within that high-scoring region? This could be a problem with phasing.
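One way to zoom in without a plot is to list each run of consecutive high-scoring SNPs along with its span and SNP count. The sketch below assumes the SNPs are sorted by position. The function name, the threshold of 2.0, and the example positions and scores are all hypothetical.

```python
def find_high_score_runs(positions, scores, threshold):
    """Return (start_bp, end_bp, n_snps) for each run of consecutive SNPs
    whose score is at or above the threshold.

    positions must be sorted in ascending order and aligned with scores.
    """
    runs = []
    current = []  # positions of the SNPs in the run being built
    for pos, score in zip(positions, scores):
        if score >= threshold:
            current.append(pos)
        elif current:
            # The run just ended: record its start, end, and SNP count.
            runs.append((current[0], current[-1], len(current)))
            current = []
    if current:
        # Close a run that extends to the last SNP.
        runs.append((current[0], current[-1], len(current)))
    return runs


# Made-up example data; the threshold is an arbitrary illustrative cutoff.
example_positions = [100, 200, 300, 400, 500, 600]
example_scores = [0.1, 2.5, 3.0, 0.2, 2.1, 2.2]
example_runs = find_high_score_runs(example_positions, example_scores, 2.0)
```

A run made up of only a handful of SNPs over a short span is a candidate for a closer look at genotype calls and phasing in that region.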


That peak is composed of 5 SNPs.
