I have BCFs with over a hundred individuals. I want to filter the files so that any genotype call with max(GP) < 0.9 is removed. I don't want the whole site removed; I just want that individual's genotype data removed for that site. I can't figure out how to do this without processing each individual separately.
Any suggestions?
I believe that when you say you want to remove genotype data, you mean that you want to set it to missing. BCFtools filter can help you with that. You can try using this command (try using the latest version from GitHub):
This would keep all genotypes that have a GP > 0.9 and convert the others to missing. The rule is applied to all individuals. You could also try filtering on genotype quality (GQ), which is Phred-scaled.
This does not work.
bcftools filter -i 'FMT/GP[1-1699]>=0.9' --set-GTs . Seventh_imputations_1240K.vcf.gz
still outputs genotypes like this: 0|0:0.669,0.327,0.004
It's not clear to me. Could you give us a short example of the input and the expected output?
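To make the requested behaviour concrete, here is a minimal sketch of a toy input/output pair. It is plain awk over VCF text, not a bcftools feature and not the command from the answer above. The function name `mask_low_gp`, the toy record, and the choice of writing the masked genotype as `./.` (which assumes diploid calls) are all assumptions made for illustration. A sample whose GP is absent or `.` is also masked in this sketch.

```shell
#!/bin/sh
# Hypothetical helper (not part of bcftools): for every sample at every site,
# set GT to missing when the largest GP value is below the threshold.
# Reads VCF text on stdin, writes VCF text on stdout.
mask_low_gp() {
  awk -v thr="${1:-0.9}" 'BEGIN { FS = OFS = "\t" }
    /^#/ { print; next }                       # header lines pass through unchanged
    {
      # locate GT and GP within the FORMAT column (column 9)
      nf = split($9, fmt, ":"); gt = 0; gp = 0
      for (i = 1; i <= nf; i++) {
        if (fmt[i] == "GT") gt = i
        if (fmt[i] == "GP") gp = i
      }
      if (gt && gp) {
        for (s = 10; s <= NF; s++) {           # sample columns start at column 10
          n = split($s, f, ":")
          if (n < gp) continue                 # trailing fields dropped, nothing to test
          m = split(f[gp], p, ",")
          max = -1
          for (j = 1; j <= m; j++)
            if (p[j] != "." && p[j] + 0 > max) max = p[j] + 0
          if (max < thr + 0) {
            f[gt] = "./."                      # assumption: diploid missing genotype
            out = f[1]
            for (k = 2; k <= n; k++) out = out ":" f[k]
            $s = out                           # only this sample changes, the site stays
          }
        }
      }
      print
    }'
}

# Toy record with two samples: max(GP) is 0.669 for the first, 0.985 for the second.
printf 'chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT:GP\t0|0:0.669,0.327,0.004\t0|1:0.010,0.985,0.005\n' |
  mask_low_gp 0.9
```

On the toy record, the first sample becomes `./.:0.669,0.327,0.004` and the second stays `0|1:0.010,0.985,0.005`, so the site is kept and only the low-confidence call is masked. For a real BCF the function could sit in a pipe such as `bcftools view in.bcf | mask_low_gp 0.9 | bcftools view -Ob -o out.bcf -`, where `in.bcf` and `out.bcf` are placeholder file names.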