4.4 years ago
drowl1
Hello,
I have a multi-sample VCF generated by GATK with an additional 'Allele fraction' (AF) annotation in the FORMAT field.
I want to filter the SNPs in the VCF at the sample-genotype level using the AF in the FORMAT field, and to remove genotypes with missing values (".") without removing the whole variant.
I have tried GATK SelectVariants with --select 'vc.hasAttribute("AF")', but that doesn't work.
Does anyone know of a simple way or a tool that could accomplish this?
Suggestions highly appreciated!
Hi, could you please post an example of your table and the output you want to obtain after filtering?
Hi,
I figured it out eventually. The reason I wanted to remove missing values in the first place was that when I applied a 0.5 threshold with bcftools view -e 'FORMAT/AF[*] >= 0.5', the whole variant was excluded for all samples, even if only one sample failed the threshold or had a missing value.
The command bcftools view -i 'MIN(FMT/AF)>=0.5' worked well: it kept the whole variant and excluded only the sites that failed the threshold.
Thanks!
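For anyone landing here later: bcftools view -i/-e keeps or drops whole records, so it cannot by itself blank out a single sample's genotype. Below is a minimal pure-Python sketch of the genotype-level filtering described in the question, in case it helps to see the logic spelled out. It is not the poster's method and not a bcftools feature. The function names, the "./." placeholder (which assumes diploid, unphased calls) and the rule of taking the largest value when AF lists several alleles are all assumptions made for illustration.

```python
def af_passes(af_field, threshold):
    """Return True if a FORMAT/AF string meets the threshold.

    Assumption: for multi-allelic AF ("0.2,0.7") the largest value is used.
    Missing (".") or unparseable entries count as failing.
    """
    values = []
    for v in af_field.split(","):
        try:
            values.append(float(v))
        except ValueError:
            return False
    return max(values) >= threshold


def mask_low_af(vcf_lines, threshold=0.5, key="AF"):
    """Yield VCF lines, setting GT to ./. for samples whose AF fails.

    The variant record itself is always kept; only failing genotypes change.
    """
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            yield line  # header lines pass through untouched
            continue
        fields = line.split("\t")
        fmt = fields[8].split(":")  # column 9 is FORMAT
        if key not in fmt or "GT" not in fmt:
            yield line
            continue
        af_idx, gt_idx = fmt.index(key), fmt.index("GT")
        for i in range(9, len(fields)):  # sample columns
            sample = fields[i].split(":")
            # Trailing FORMAT values may be omitted; treat them as missing.
            af = sample[af_idx] if af_idx < len(sample) else "."
            if not af_passes(af, threshold):
                sample[gt_idx] = "./."  # assumed diploid placeholder
                fields[i] = ":".join(sample)
        yield "\t".join(fields)


# Tiny made-up record: S1 passes, S2 is below 0.5, S3 has a missing AF.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:AF\t0/1:0.62\t0/1:0.31\t0/1:.",
]
for out in mask_low_af(example):
    print(out)
```

In the example the record is kept, S1 stays 0/1, and S2 and S3 are set to ./. while their AF values are left in place. For real data a dedicated VCF tool is a safer choice than hand-parsing lines.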