I have short sequencing reads from a pool of 30 progeny from two diploid parents. At any particular site in the progeny, I expect to see either:
- Two alleles (suggesting at least some of the progeny are heterozygous)
Or:
- One allele (suggesting all progeny are homozygous at this site)
Also, without going into detail, it is extremely unlikely that there are recombinants in the progeny, so I expect a 1:1 (or 1:3) ratio of alleles: when a second allele is present, it will be at high frequency.
I realise there are inherent problems with sequencing pools of individuals. But given the lack of recombination and the expectation of seeing either one or two alleles in the progeny, can the genotype calls (0/1 or 1/1) in the VCF produced using samtools be used to show whether one or two alleles are present?
I just need to know whether there is support for two alleles being present in the pool or just one, but I am unsure how the genotype field is calculated.
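To make the question concrete, here is a minimal sketch of the kind of check I have in mind as an alternative to trusting the genotype call. It assumes I can get per-site read counts for the reference and alternate alleles, for example from a DP4-style INFO value (ref-forward, ref-reverse, alt-forward, alt-reverse); I have not confirmed my VCF carries it. The function names, the 1% error rate and the 0.001 cutoff are all my own placeholders, not anything samtools does.

```python
from math import comb


def parse_dp4(dp4):
    """Split a DP4-style string 'rf,rr,af,ar' into (ref_reads, alt_reads)."""
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in dp4.split(","))
    return ref_fwd + ref_rev, alt_fwd + alt_rev


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def two_alleles_supported(ref_reads, alt_reads, error_rate=0.01, alpha=1e-3):
    """True if the rarer allele has more reads than sequencing error alone
    would plausibly produce (error_rate and alpha are assumed values)."""
    n = ref_reads + alt_reads
    if n == 0:
        return False  # no coverage, so no support either way
    minor = min(ref_reads, alt_reads)
    # Probability of seeing at least `minor` reads of the rarer allele
    # if every one of them were a sequencing error.
    return binom_tail(minor, n, error_rate) < alpha


if __name__ == "__main__":
    for dp4 in ("15,15,14,14", "22,23,8,7", "0,0,30,30", "30,29,1,0"):
        ref, alt = parse_dp4(dp4)
        print(dp4, ref, alt, two_alleles_supported(ref, alt))
```

With the expected 1:1 or 1:3 ratios, the rarer allele should sit at roughly 50% or 25% of reads, so a check like this ought to separate it clearly from error-level noise at reasonable depth. What I cannot tell is whether the samtools genotype call gives me the same information.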
Thanks