I have called variants from a merged BAM with 18 individuals using bcftools. I want to calculate mean heterozygosity, so at each site I would like to know the count of non-ref alleles and the number of chromosomes (basically the number of individuals in which the variant was called). For example, I see the following record:
Contig0.1 1055 . T A 24.1 . DP=175;VDB=0.0007;AF1=0.03972;G3=0.9375,7.614e-07,0.0625;HWE=0.0252;AC1=1;DP4=71,95,1,3;MQ=33;FQ=24.3;PV4=0.64,0.00095,0.0012,0.0083
GT:PL:DP:GQ 0/0:0,48,255:16:59 0/0:0,33,245:11:44 0/0:0,42,255:14:53 0/0:0,21,153:7:32 0/0:0,45,255:15:56 0/0:0,24,187:8:35 0/0:0,36,249:12:47 0/0:0,36,206:12:47 0/0:0,0,0:0:11 0/0:0,45,255:15:56 0/0:0,18,151:6:29 0/0:0,6,36:2:17 0/0:0,39,255:13:50 0/0:0,42,255:14:53 0/0:0,45,255:15:56 0/0:0,0,0:0:11 0/0:0,18,162:6:29 0/1:70,12,0:4:6
I am treating samples that look like '0/0:0,0,0:0:11' as no-calls. For a column like 0/1:70,12,0:4:6, how is this considered a 'het' when the PL field contains a value lower than the one for the het genotype? It also has very low read depth and GQ. I assume the allele count is derived from this genotype as well. Any suggestions on how to deal with such sites would be appreciated.
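
To make the counting concrete, here is a minimal Python sketch of what I have in mind for a single site. It assumes the FORMAT layout shown above (GT:PL:DP:GQ), treats a sample as a no-call when DP is 0 or all PL values are 0, and optionally drops low-confidence genotypes. The function name `summarize_site` and the `min_dp` / `min_gq` thresholds are my own hypothetical choices, not bcftools options.

```python
FORMAT = "GT:PL:DP:GQ"
SAMPLES = (
    "0/0:0,48,255:16:59 0/0:0,33,245:11:44 0/0:0,42,255:14:53 "
    "0/0:0,21,153:7:32 0/0:0,45,255:15:56 0/0:0,24,187:8:35 "
    "0/0:0,36,249:12:47 0/0:0,36,206:12:47 0/0:0,0,0:0:11 "
    "0/0:0,45,255:15:56 0/0:0,18,151:6:29 0/0:0,6,36:2:17 "
    "0/0:0,39,255:13:50 0/0:0,42,255:14:53 0/0:0,45,255:15:56 "
    "0/0:0,0,0:0:11 0/0:0,18,162:6:29 0/1:70,12,0:4:6"
).split()


def summarize_site(format_str, samples, min_dp=1, min_gq=0):
    """Count called samples, chromosomes, non-ref alleles and hets at one site.

    min_dp and min_gq are illustrative filters (hypothetical defaults).
    """
    keys = format_str.split(":")
    n_called = n_chrom = n_alt = n_het = 0
    for field in samples:
        rec = dict(zip(keys, field.split(":")))
        pl = [int(x) for x in rec["PL"].split(",")]
        dp, gq = int(rec["DP"]), int(rec["GQ"])
        # No data (DP=0 or PL=0,0,0) or below the thresholds: skip as a no-call.
        if dp < min_dp or gq < min_gq or not any(pl):
            continue
        alleles = rec["GT"].replace("|", "/").split("/")
        if "." in alleles:
            continue
        n_called += 1
        n_chrom += len(alleles)                   # 2 per diploid sample
        n_alt += sum(a != "0" for a in alleles)   # non-ref allele count
        n_het += len(set(alleles)) > 1            # heterozygous genotype
    return {"called": n_called, "chromosomes": n_chrom,
            "alt_alleles": n_alt, "hets": n_het}


unfiltered = summarize_site(FORMAT, SAMPLES)
filtered = summarize_site(FORMAT, SAMPLES, min_dp=5, min_gq=20)
print(unfiltered)
print(filtered)
```

With no thresholds, the two '0,0,0:0' samples are dropped and the low-depth 0/1 call still contributes one non-ref allele; with the example thresholds it is excluded, so whether this site counts as variable depends entirely on how such genotypes are filtered.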