Question

How to interpret XPEHH (recent selection) score?

6

8.9 years ago

eyb ▴ 270

I have calculated XP-EHH and iHS scores for a set of SNPs using selscan. XP-EHH ranges from -0.75 to 0.9. What do extreme values show? In the original publication they plot log(P-value). I think that P-values were calculated to show differences between XP-EHH and iHS. Does anyone have experience with selection scores?

selection selscan • 12k views

ADD COMMENT • link updated 8.9 years ago by Giovanni M Dall'Olio 28k • written 8.9 years ago by eyb ▴ 270

score 10 · Accepted Answer · 2015-12-22

8.9 years ago

Giovanni M Dall'Olio 28k

XP-EHH is a cross-population test for positive selection. This means that it detects SNPs that are under selection in one population but not in another. For example, a SNP associated with resistance to malaria may be under selection in populations where malaria has been endemic but evolve neutrally in other populations. The test was developed because it is generally very difficult to identify signals of selection within a single population, and comparing with a closely related population can reveal weaker signals that only become evident in the comparison.

The sign of the XP-EHH score indicates which of the two alleles is under selection, i.e. whether it is the ancestral or the derived allele. For practical reasons, people usually ignore the sign and use the absolute XP-EHH score, because you may not always be sure which allele is ancestral in which population. Moreover, taking the absolute score makes it easier to calculate means over sliding windows.

In the publication they used -log10(p-value), probably as a way to simplify the interpretation of the data in the plot. The p-value is usually an empirical one, generated by sorting the scores and taking their rank - e.g. see how I answered Zev in this discussion: C: A Database Of Signatures Of Selection In The 1000 Genomes Dataset. The p-value is then converted to -log10(p-value) to facilitate the interpretation. For example, a p-value of 0.01 becomes -log10(0.01) = 2, so you can say that all SNPs with a -log10(p-value) higher than 2 fall in the top 1% of scores and can be treated as significant candidates for selection in one population.
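A minimal sketch of the rank-based approach described above, assuming the scores are already loaded into a plain Python list. The function name `empirical_neg_log10_p` is hypothetical, and ranking on the absolute score is an assumption based on the preceding paragraph; this is not selscan's own procedure.

```python
import math

def empirical_neg_log10_p(scores):
    """Return -log10 of a rank-based empirical p-value for each score.

    Assumption: scores are ranked by absolute value, most extreme first,
    and the p-value of a SNP is rank / number of SNPs.
    """
    n = len(scores)
    abs_scores = [abs(s) for s in scores]
    # Indices sorted from the most extreme to the least extreme score.
    # Ties keep their original order because Python's sort is stable.
    order = sorted(range(n), key=lambda i: abs_scores[i], reverse=True)
    result = [0.0] * n
    for rank, i in enumerate(order, start=1):
        p = rank / n  # fraction of SNPs ranked at or above this one
        result[i] = -math.log10(p)
    return result
```

With 100 SNPs, the single most extreme score gets p = 1/100 and therefore a value of about 2, which matches the threshold mentioned above. Note that with this kind of empirical p-value, a fixed fraction of SNPs always passes the threshold, whether or not selection is present.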

ADD COMMENT • link 8.9 years ago by Giovanni M Dall'Olio 28k

0

Thanks. I will try to calculate p-values this way.

ADD REPLY • link 8.9 years ago by eyb ▴ 270

0

Isn't the sign of the XP-EHH score indicative of whether the SNP is under selection in your tested or your reference population? I thought that (depending on how you computed XP-EHH) you would only get either positive or negative values.

From the supp material:

An XP-EHH score is directional: a positive score suggests selection is likely to have happened in population A, whereas a negative score suggests the same about population B.

ADD REPLY • link 7.1 years ago by GabrielMontenegro ▴ 670

1

Yes, it is directional, but in many cases you don't really know which allele is ancestral. In that case it is safer to take the absolute score and determine whether there is selection between the two populations, without knowing which allele is selected. Moreover, people tend to calculate the average XP-EHH score for a region, averaging or weighting the scores for multiple SNPs. In that case, if you don't use the absolute score, the average for the region may be close to 0 because scores with different signs will cancel each other out.
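To illustrate the cancelling effect, here is a small sketch that averages made-up scores per window, once with the sign and once on absolute values. The function name `window_means` is hypothetical, and using non-overlapping windows of a fixed number of SNPs is a simplifying assumption.

```python
def window_means(scores, size):
    """Return (signed mean, absolute mean) for each window of `size` SNPs.

    Assumption: windows are non-overlapping and defined by SNP count,
    not by genomic distance.
    """
    out = []
    for start in range(0, len(scores), size):
        chunk = scores[start:start + size]
        signed = sum(chunk) / len(chunk)
        absolute = sum(abs(s) for s in chunk) / len(chunk)
        out.append((signed, absolute))
    return out

# Made-up scores with mixed signs, for illustration only.
example = [2.0, -2.0, 1.5, -1.5]
means = window_means(example, 4)
```

In this toy window the signed mean is 0 while the mean of the absolute scores is 1.75, so the regional signal only shows up in the absolute version.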

ADD REPLY • link 6.2 years ago by Giovanni M Dall'Olio 28k

0

This is what I understand as well. I think it is the +ve or -ve sign in iHS that tells you whether the ancestral or the derived allele is involved.

ADD REPLY • link 6.8 years ago by norfarhan.ma • 0

0

So, regarding Giovanni's answer, is his comment that a -log(p-value) higher than 2 is significant correct?

ADD REPLY • link 6.2 years ago by cmcouto.silva ▴ 60