GATK 3.8.0 VariantFiltration ERROR input string
4.2 years ago
rturba ▴ 10

Hello,

I am running GATK 3.8 on a Linux cluster. I am trying to run the VariantFiltration tool with the following command:

java -jar $GATK/GenomeAnalysisTK.jar \
    -T VariantFiltration \
    -R $ref \
    -V $wd/vcf/filtered/stick84_SNPs_GATK.vcf \
    --filterExpression "QD < 2.0" --filterName QD2 \
    --filterExpression "FS > 60.0" --filterName FS60 \
    --filterExpression "MQ < 40.0" --filterName MQ40 \
    --filterExpression "MQRankSum < -12.5" --filterName MQRS-12.5 \
    --filterExpression "ReadPosRankSum < -8.0" --filterName RPRS-8 \
    -o $wd/vcf/filtered/stick84_SNPs_filtered_GATK.vcf

However, it fails with this error:

ERROR MESSAGE: For input string: "nan"

I don't know what this error means or how to fix it. All the help I find online points to dead GATK pages that I cannot access :(

Please, any help will be greatly appreciated! Thank you and take care!

genome snp software error • 1.4k views
4.2 years ago
rturba ▴ 10

Ah, I found the cause here: https://github.com/broadinstitute/gatk/issues/5582. I hope this helps others as lost as I was! I did not realize the INFO fields could hold "nan" values. GATK fails to parse the lowercase "nan", so rewriting it as "NaN" fixes the error:

bcftools view in.vcf.gz |
sed 's/=nan/=NaN/g'  |
bgzip > fixed.vcf.gz
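
In case it is useful, here is a slightly stricter variant of the same idea, as a rough sketch rather than a tested recipe. It assumes an uncompressed, plain-text VCF on stdin (for a bgzipped file, wrap it in `bcftools view` and `bgzip` as above). The function names `fix_nan` and `count_nan` are made up for illustration. It only rewrites `=nan` when that is the whole value (followed by `;`, whitespace, or end of line) and it leaves header lines alone.

```shell
# Hypothetical helper: rewrite lowercase "nan" values as "NaN" in VCF records.
# Header lines (starting with "#") are passed through unchanged.
fix_nan() {
    sed -E '/^#/!s/=nan(;|[[:space:]]|$)/=NaN\1/g'
}

# Hypothetical helper: count records that still carry a lowercase "nan" value.
# Prints 0 when the file is clean.
count_nan() {
    grep -v '^#' | grep -cE '=nan(;|[[:space:]]|$)' || true
}

# Small demo on a made-up record with two lowercase "nan" values.
printf '##fileformat=VCFv4.2\n1\t100\t.\tA\tG\t50\t.\tQD=12.3;MQRankSum=nan;ReadPosRankSum=nan\n' | fix_nan
```

Running `count_nan < your.vcf` before and after the fix is a quick way to confirm that the lowercase values were the problem and that none are left.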

I moved this to answer so you can accept it (green check mark) to provide closure to the thread.
