bcftools view error: the tag "mis" is not defined in the VCF header
2.7 years ago
skylerz ▴ 10

I am new to bioinformatics, so sorry if this question seems too simple.

I am trying to perform quality control on a VCF file using the following command:

$ bcftools view -i 'F_PASS(DP>=10 & GT!="mis")> 0.9' norm.vcf.gz -Oz -o filtered_norm.vcf.gz

or this:

$ bcftools view -i 'F_PASS(DP>=10 & GT!=".")> 0.9' norm.vcf.gz -Oz -o filtered_norm.vcf.gz

The ideal filtered_norm.vcf.gz should keep only variants where at least 90% of the genotypes have DP >= 10 and a non-missing GT. However, when I enter either of these commands, the system responds with "Error: the tag "mis" is not defined in the VCF header."

I checked the vcf header and it has this line:

##FORMAT=<ID=GT,Number=1,Type=String,Description="">

But I do not know how to proceed from here to solve the problem. Could anyone help me, please? Thank you very much in advance!

bcftools • 2.6k views

what's your version of bcftools ?

I checked and it is bcftools 1.15.

Hi! Did you find a solution to this problem? I'm running into the same issue with bcftools 1.15.1

2.1 years ago
lb61 ▴ 10

From the filter you're trying to implement, I'm guessing you were also following the guidelines for UKB WES Filtering for Genotype-Phenotype Association Analyses. If so, were you by chance trying to run this code through the swiss-army-knife app? That's how I ran into this issue, and I resolved it by escaping the double quotes around "mis". What worked for me was:

CMD_STRING="bcftools view --threads 4 -i 'F_PASS(DP>=10 & GT!=\"mis\")> 0.9' -Oz input.vcf.gz > output.vcf.gz"

dx run swiss-army-knife -iin="filepath/input.vcf.gz" -iin="filepath/input.vcf.gz.tbi" -icmd="$CMD_STRING" --destination "output_path"

Without the escaped double-quotes, I saw in the job log that the command string was being parsed as:

bcftools view --threads 4 -i 'MAF<=0.001 && MAC >=1 && F_MISSING<0.1 && F_PASS(DP>=10 & GT!=mis)> 0.9' -Oz

Note that the double quotes around mis are missing, so bcftools reads mis as a tag name instead of a string, which is what caused the "tag not defined" error. Hope this is helpful for anyone else running into this problem in the future!
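
For anyone who wants to see the quoting effect without submitting a job, here is a minimal sketch, assuming a POSIX-style shell such as bash. It only builds and prints two command strings and never calls bcftools; in.vcf.gz is a placeholder file name, and the variable names are made up for illustration.

```shell
# Unescaped: the inner double quotes end and restart the outer string,
# so the shell drops them and the filter ends up containing GT!=mis.
unescaped="bcftools view -i 'F_PASS(DP>=10 & GT!="mis")> 0.9' -Oz in.vcf.gz"

# Escaped: \" keeps a literal double quote inside the outer string,
# so the filter still contains GT!="mis".
escaped="bcftools view -i 'F_PASS(DP>=10 & GT!=\"mis\")> 0.9' -Oz in.vcf.gz"

# Print both strings to compare what would actually be passed on as the command.
printf '%s\n' "$unescaped"
printf '%s\n' "$escaped"
```

Printing the string before handing it to -icmd is a cheap way to check that the quotes around mis survive.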
