13 months ago
selplat21
Hello,
I am trying to generate a haploid consensus sequence from a VCF file. At heterozygous sites, I want to randomly choose one of the two alleles. I don't want to always choose REF, and I don't want to always choose ALT, as either choice would systematically bias the consensus (always taking REF gives reference bias).
Typically I would run something like this:
bcftools consensus -f reference.fa --absent N -H I Input.vcf.gz  # IUPAC codes at heterozygous sites; reference.fa is a placeholder for the reference FASTA
and then randomly choose alleles with seqtk randbase, but I also have indels that I would like to consider.
Any guidance on randomly sampling from these het sites would be appreciated.
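
To make the goal concrete, here is a rough sketch of the kind of preprocessing I have in mind, not a tested pipeline. It rewrites each heterozygous genotype in an uncompressed VCF to a randomly chosen homozygous one, so SNPs and indels are treated the same way. The function name `randomize_het_genotypes` and the `seed` parameter are made up for illustration, and it assumes diploid calls with GT as the first FORMAT field.

```python
import random


def randomize_het_genotypes(vcf_lines, seed=None):
    """Yield VCF lines with each diploid het GT replaced by a random homozygous GT.

    Hypothetical helper: assumes plain-text (uncompressed) VCF lines and that
    GT, when present, is the first FORMAT key.
    """
    rng = random.Random(seed)
    for line in vcf_lines:
        # Header lines pass through unchanged.
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Records without sample columns or without a leading GT key are left as-is.
        if len(fields) < 10 or fields[8].split(":")[0] != "GT":
            yield line
            continue
        for i in range(9, len(fields)):
            sample = fields[i].split(":")
            alleles = sample[0].replace("|", "/").split("/")
            is_het = (
                len(alleles) == 2
                and "." not in alleles
                and alleles[0] != alleles[1]
            )
            if is_het:
                # Pick one of the two called alleles with equal probability.
                pick = rng.choice(alleles)
                sample[0] = f"{pick}/{pick}"
                fields[i] = ":".join(sample)
        yield "\t".join(fields) + "\n"
```

The idea would be to write the result to a new VCF, compress and index it, and then run bcftools consensus with the sample/haplotype options so that the (now homozygous) genotypes decide whether REF or ALT is used at each site. I have not verified how this interacts with overlapping variants.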