Question

pooled-heterozygosity calculation


3.5 years ago

reza ▴ 300

Following Rubin et al., one method for identifying selection signatures in a genome-scale study is to calculate pooled heterozygosity (Hp).

Hp = 2 ΣnMAJ ΣnMIN / (ΣnMAJ + ΣnMIN)^2, where nMAJ and nMIN are the numbers of reads corresponding to the most and least abundant allele, respectively, and the sums are taken over all SNPs in a defined window (40 kb, for example) across the genome. Hp is then Z-transformed.
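To make the formula concrete, here is a minimal Python sketch of how I understand it. The function names `pooled_heterozygosity` and `z_transform` are my own (hypothetical), and I am assuming the Z-transformation is the usual z-score, (Hp - mean) / standard deviation, taken over all windows, with the population standard deviation.

```python
from statistics import mean, pstdev


def pooled_heterozygosity(counts):
    """Hp for one window.

    counts: iterable of (n_maj, n_min) read-count pairs, one pair per SNP.
    Returns None when the window has no reads at all.
    """
    counts = list(counts)
    sum_maj = sum(n_maj for n_maj, _ in counts)
    sum_min = sum(n_min for _, n_min in counts)
    total = sum_maj + sum_min
    if total == 0:
        return None
    return 2 * sum_maj * sum_min / total ** 2


def z_transform(values):
    """Standard z-score of each value (assumed meaning of 'Z-transformed')."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD; use stdev() if preferred
    if sigma == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mu) / sigma for v in values]
```

For example, a window with two SNPs having read counts (8, 2) and (6, 4) gives ΣnMAJ = 14 and ΣnMIN = 6, so Hp = 2 × 14 × 6 / 20^2 = 0.42.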

Many articles use this method, but unfortunately I cannot find any command for it in their supplementary materials, and a lot of searching on Google did not turn up an answer either.

This is actually my third question on this issue; the previous two unfortunately went unanswered. I hope that this time, with your help, I can find an answer.

How can I get nMAJ and nMIN from a multi-sample VCF file (produced by GATK) and then calculate Hp?
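One possible starting point, written as a rough sketch rather than a validated pipeline: it assumes the VCF has a per-sample AD (allelic depth) field in the FORMAT column, that summing AD across all samples is an acceptable stand-in for pooled read counts, and that non-overlapping fixed windows are good enough. The function names `allele_counts` and `window_hp` are hypothetical, and only biallelic SNPs are used.

```python
from collections import defaultdict


def allele_counts(vcf_line):
    """Return (chrom, pos, n_maj, n_min) for a biallelic SNP record, else None.

    Assumption: read counts are the per-sample AD values summed over samples.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return None
    chrom, pos, ref, alt, fmt = fields[0], int(fields[1]), fields[3], fields[4], fields[8]
    # Keep only biallelic SNPs (single-base REF and a single-base A/C/G/T ALT).
    if len(ref) != 1 or len(alt) != 1 or alt not in "ACGT":
        return None
    keys = fmt.split(":")
    if "AD" not in keys:
        return None
    ad_idx = keys.index("AD")
    ref_n = alt_n = 0
    for sample in fields[9:]:
        parts = sample.split(":")
        if ad_idx >= len(parts):
            continue
        ad = parts[ad_idx].split(",")
        if len(ad) != 2 or "." in ad:
            continue  # missing or malformed AD for this sample
        ref_n += int(ad[0])
        alt_n += int(ad[1])
    return chrom, pos, max(ref_n, alt_n), min(ref_n, alt_n)


def window_hp(vcf_lines, window=40_000):
    """Hp per non-overlapping window, keyed by (chrom, window_index)."""
    sums = defaultdict(lambda: [0, 0])
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        rec = allele_counts(line)
        if rec is None:
            continue
        chrom, pos, n_maj, n_min = rec
        key = (chrom, (pos - 1) // window)
        sums[key][0] += n_maj
        sums[key][1] += n_min
    result = {}
    for key, (s_maj, s_min) in sums.items():
        total = s_maj + s_min
        if total:
            result[key] = 2 * s_maj * s_min / total ** 2
    return result
```

It could be run as `window_hp(open("variants.vcf"))`, where `variants.vcf` is a placeholder for an uncompressed VCF. The resulting Hp values would still need to be Z-transformed across windows.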

Thanks in advance

VCF Population-Genetics WGS Selective-sweep • 1.5k views

ADD COMMENT • link updated 22 months ago by wonde2000 • 0 • written 3.5 years ago by reza ▴ 300


Hi, I'm having exactly the same issue here. I want to calculate the pooled heterozygosity using Rubin's method. I have looked for a script to calculate it in many papers but could not find one. I am now trying to write a function to calculate it, but my concern is whether I'm doing it properly.

I wonder how you coped with it in the end. Thank you anyway for posting this question, so that I know I'm not the only one having the problem.