I am trying to filter multi-allelic sites in a VCF file to find the true multi-allelic sites. Some of these have an alternate allele supported by only 1 read in 200, which I want to get rid of. The other alternate allele looks good. However, when I try to filter out alleles below a frequency threshold (0.05) with vcftools, it removes the entire variant, not just the minor allele. Is there a way to filter out the minor alternate allele by frequency without removing the variant completely?

Check out
bcftools view -i/-e
expressions. The boolean operators might be helpful in navigating your niche requirements.

I tried both bcftools and vcffilter, but they apply a cutoff value and then exclude the site altogether. I want to keep the site and remove only the extra allele. This is for Pool-seq analysis, which is why the read depths matter.

If it is an uncommon operation, you might need to do some manipulation with
awk. You could also look for bcftools plugins, though I'm not sure whether there is a library of plugins you can search.
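
To make the manual-manipulation idea concrete, here is a rough sketch in Python rather than awk. It is not a feature of bcftools or vcftools. The function name drop_rare_alts and the min_frac parameter are made up for illustration. It assumes each record has a FORMAT/AD field with one depth per allele (REF first), and it computes each allele's frequency from the AD values summed over all samples in the record, which may not be what you want if you need per-pool frequencies. It rewrites only ALT, AD, and GT; any other per-allele INFO or FORMAT fields (AF, AC, PL, and so on) are left untouched and would need the same treatment.

```python
import re


def drop_rare_alts(line, min_frac=0.05):
    """Drop ALT alleles whose share of reads (from FORMAT/AD) is below min_frac.

    Hypothetical helper: rewrites only ALT, FORMAT/AD and FORMAT/GT.
    Header lines, biallelic records, and records without AD are returned as-is.
    """
    line = line.rstrip("\n")
    if line.startswith("#"):
        return line
    fields = line.split("\t")
    if len(fields) < 10:
        return line  # no sample columns, nothing to compute from

    alts = fields[4].split(",")
    keys = fields[8].split(":")
    if len(alts) < 2 or "AD" not in keys:
        return line
    ad_i = keys.index("AD")
    gt_i = keys.index("GT") if "GT" in keys else None
    n_alleles = len(alts) + 1  # REF plus every ALT
    samples = [s.split(":") for s in fields[9:]]

    # Sum per-allele read depth over all samples (index 0 is REF).
    totals = [0] * n_alleles
    for s in samples:
        ad = s[ad_i].split(",") if ad_i < len(s) else []
        if len(ad) == n_alleles:
            for i, d in enumerate(ad):
                totals[i] += int(d) if d.isdigit() else 0
    depth = sum(totals)
    if depth == 0:
        return line

    # Always keep REF; keep an ALT only if it reaches the threshold.
    keep = [0] + [i for i in range(1, n_alleles) if totals[i] / depth >= min_frac]
    if len(keep) in (1, n_alleles):
        # Either no ALT survives (filter such sites with the usual tools)
        # or nothing needs to be removed.
        return line

    new_index = {old: str(new) for new, old in enumerate(keep)}

    def remap(token):
        # Keep separators, renumber kept alleles, blank out removed ones.
        if token in ("/", "|"):
            return token
        if token.isdigit():
            return new_index.get(int(token), ".")
        return "."

    fields[4] = ",".join(alts[i - 1] for i in keep[1:])
    for s in samples:
        if ad_i < len(s):
            ad = s[ad_i].split(",")
            if len(ad) == n_alleles:
                s[ad_i] = ",".join(ad[i] for i in keep)
        if gt_i is not None and gt_i < len(s):
            s[gt_i] = "".join(remap(t) for t in re.split(r"([/|])", s[gt_i]))
    fields[9:] = [":".join(s) for s in samples]
    return "\t".join(fields)


if __name__ == "__main__":
    # Made-up record: the second ALT (G) is supported by 1 read in 200.
    record = "chr1\t100\t.\tA\tC,G\t50\tPASS\t.\tGT:AD\t0/1/2:120,79,1"
    print(drop_rare_alts(record))
```

You would apply this line by line to an uncompressed VCF and write the result to a new file. Genotype entries that pointed at a removed allele become missing ("."), which seemed the least surprising choice, but you may prefer a different rule.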