Hi everyone,
I'm trying to use GATK to hard-filter variants, but although I'm following the tutorial on the GATK website, I haven't been able to filter out any variants. I want to filter on the VQSLOD annotation.
I tried the filtering with both GATK 3.8 and 4.1, but in both cases no variants are filtered, and I get no error output.
GATK 3.8 version:
java -jar /home/maintenance-gg/Téléchargements/GenomeAnalysisTK-3.8-1-0-gf15c1c3ef/GenomeAnalysisTK.jar \
-T VariantFiltration \
-R /home/maintenance-gg/Documents/Reference_genome/Pfalciparum.genome.fasta \
--filterName LowQualVQ -filter "VQSLOD <= 0.0" \
--variant /home/maintenance-gg/Documents/VCF2/SNPs.vcf \
-log /home/maintenance-gg/Documents/VCF2/filtration.txt \
-o /home/maintenance-gg/Documents/VCF2/SNP_filtered5.vcf
GATK 4.1 version:
gatk VariantFiltration \
-R /home/maintenance-gg/Documents/Reference_genome/Pfalciparum.genome.fasta \
-V /home/maintenance-gg/Documents/VCF2/calling_GVCF.vcf \
--filter-name LowQualVQ -filter "VQSLOD <= 0.0" \
-O /home/maintenance-gg/Documents/VCF2/SNP_filtered5.vcf
Can anyone help me out? I have searched for a solution on Biostars and the GATK support forum, but I haven't found one. The only thing I learned is that GATK's filter expressions can't compare against integers and need doubles.
Here are the INFO header line for VQSLOD and an example SNP record from my VCF before filtering.
INFO:
##INFO=<ID=VQSLOD,Number=1,Type=Float,Description="Log odds of being a true variant versus being false under the trained gaussian mixture model">
SNP example:
Pf3D7_01_v3 176 . G A 107.14 PASS AC=2;AF=1.00;AN=2;DP=5;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=34.35;QD=26.79;SOR=3.258;VQSLOD=3.39;culprit=SOR GT:AD:DP:GQ:PL 1/1:0,4:4:12:121,12,0 ./.:0,0 ./.:6,0:6 ./.:0,0 ./.:5,0:5 ./.:0,0 ./.:0,0 ./.:3,0:3 ./.:20,0:20 ./.:0,0 ./.:8,0:8 ./.:0,0 ./.:5,0:5 ./.:0,0 ./.:0,0 ./.:0,0 ./.:0,0 ./.:0,0
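One possible sanity check is to count how many records in the input actually have a VQSLOD at or below the cutoff, since the example record above has VQSLOD=3.39 and would not match "VQSLOD <= 0.0" anyway. Below is a minimal sketch, assuming an uncompressed, tab-delimited VCF; the function name count_low_vqslod is made up for illustration and is not part of GATK.

```shell
# Hypothetical helper (not a GATK tool): count variant records whose
# VQSLOD value in the INFO column is at or below a cutoff.
# Usage: count_low_vqslod <vcf> [cutoff]   (cutoff defaults to 0.0)
count_low_vqslod() {
    awk -F'\t' -v cutoff="${2:-0.0}" '
        /^#/ { next }                        # skip header lines
        {
            n = split($8, info, ";")         # INFO is the 8th VCF column
            for (i = 1; i <= n; i++) {
                if (info[i] ~ /^VQSLOD=/) {
                    sub(/^VQSLOD=/, "", info[i])
                    # adding 0 forces a numeric comparison
                    if (info[i] + 0 <= cutoff + 0) low++
                }
            }
        }
        END { print low + 0 }                # prints 0 when nothing matched
    ' "$1"
}
```

For example, `count_low_vqslod /home/maintenance-gg/Documents/VCF2/SNPs.vcf` would print the number of records the expression should flag. If that number is above zero, it may also be worth tabulating the FILTER column of the output with `grep -v '^#' SNP_filtered5.vcf | cut -f7 | sort | uniq -c`: as far as I understand, VariantFiltration writes the filter name into the FILTER column and keeps the record in the file rather than deleting it.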
Please let me know if you need any additional information.
Thanks!
Hi, thanks for your reply. You're right, the name of my filter was wrong; but even when I run the command as you proposed, nothing changes: no variants are filtered.