Hello,
I'm trying to use GATK to hard-filter variants (I'm working with a non-model organism, so variant recalibration isn't possible). I'm following the tutorial on the GATK website (https://gatkforums.broadinstitute.org/gatk/discussion/2806/howto-apply-hard-filters-to-a-call-set), but I haven't actually been able to filter out any variants.
As an example, after subsetting the SNPs out of the VCF file produced by GenotypeGVCFs, I used
gatk -T VariantFiltration -R /PATH/reference_genome -V myfile.vcf --filterExpression "MQ>20" --filterName "mq20_filter" -o my_filtered_file.vcf
which should have flagged any variants with mapping quality below 20 with FILTER rather than PASS. When I used grep to check whether this worked the way I expected, I found many instances where MQ=10.
There seems to be something missing with my JEXL expression that I'm simply not seeing in the tutorial, or other GATK documentation. Can anyone help me out?
Thanks!
edit: fixed a minor mistake
Please show us a line in the VCF failing the expression and, in the header, the line for ##INFO=<ID=MQ...
A line from the VCF:
and I'm not quite sure what you mean by a line from the header, but I grepped your example and what returned was this:
Thanks!
I've encountered the same issue. Could you please provide your solution if you were able to fix this problem?
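One thing that may be worth checking, as far as I understand VariantFiltration: the filter name is written to the records for which the expression evaluates to true, so "MQ>20" would mark sites with MQ above 20 and leave MQ=10 sites as PASS. The tutorial's SNP example uses the opposite direction ("MQ < 40.0"), so for this case something like --filterExpression "MQ < 20.0" is probably what was intended. Also, grep for MQ=10 matches a record whatever its FILTER value is, so it can't tell flagged from unflagged sites. Below is a minimal sketch for looking at FILTER and MQ side by side. The function name filter_by_mq is made up for illustration, and it assumes an uncompressed, tab-delimited VCF with MQ in the INFO column.

```shell
# Hypothetical helper (not part of GATK): print "FILTER<TAB>MQ" for each record.
# Assumes an uncompressed, tab-delimited VCF.
filter_by_mq() {
    # $1: path to the VCF file
    awk -F'\t' '
        /^#/ { next }                      # skip header lines
        {
            mq = "NA"                      # reported when INFO has no MQ key
            n = split($8, info, ";")       # column 8 is INFO
            for (i = 1; i <= n; i++)
                if (info[i] ~ /^MQ=/) mq = substr(info[i], 4)
            print $7 "\t" mq               # column 7 is FILTER
        }' "$1"
}
```

Running `filter_by_mq my_filtered_file.vcf | sort | uniq -c | head` should show which FILTER value each MQ ended up with. If the low-MQ sites say PASS and the high-MQ sites carry mq20_filter, the expression is simply inverted.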