Dear all, I designed a script (run from a Jupyter notebook, Python 3) that selects VCF files of choice (by some string filters) and generates gzipped versions of them after filtering with bcftools view, using the -i and -t values supplied by the user. When the inclusion expression is on depth, e.g. DP>100, the script works and filters out all records with DP <= 100.
Example:
bcftools view -Oz /path/file.vcf -o /path/file.vcf.gz -i 'DP>100' -t 5:25000000-25001000
This yields the expected outcome: a filtered VCF file.
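For context, here is a minimal sketch of how a notebook script might assemble the call above. The helper name `build_view_command` and its parameters are hypothetical, not taken from my actual script. Only the flags already shown in the example (-Oz, -o, -i, -t) are used.

```python
import shlex

def build_view_command(vcf_in, vcf_out, include_expr=None, targets=None):
    """Assemble a bcftools view call as an argument list (hypothetical helper)."""
    cmd = ["bcftools", "view", "-Oz", vcf_in, "-o", vcf_out]
    if include_expr:
        # Passed as a single list element, so no shell quoting is needed.
        cmd += ["-i", include_expr]
    if targets:
        cmd += ["-t", targets]
    return cmd

cmd = build_view_command(
    "/path/file.vcf", "/path/file.vcf.gz",
    include_expr="DP>100", targets="5:25000000-25001000",
)
# Print a shell-quoted version for inspection before running anything.
print(shlex.join(cmd))
# To actually execute it: subprocess.run(cmd, check=True)
```

Passing the expression as one list element avoids the shell interpreting `>` as a redirect, which is an easy mistake when the command is built as a single string.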
However, I get errors when trying FREQ>0.1 instead:
Example:
bcftools view -Oz /path/file.vcf -o /path/file.vcf.gz -i 'FREQ>0.1' -t 5:25000000-25001000
Wrong operator in string comparison: FREQ > 0.3 [(null),0.39%]
[E::main_vcfindex] unknown filetype; expected bgzip compressed VCF or BCF
[E::main_vcfindex] was the VCF/BCF compressed with bgzip?
I've tried many variants of this line, but nothing worked (e.g. -i FMT/FREQ>0.1).
Does someone know how to solve this issue? (I originally planned to run both inclusion expressions together with an "&" operator.)
Thanks!
Does your VCF even have the FREQ variable encoded? Do you not mean AF (allele frequency)?
Yes. Here is the format description in the VCF file: GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR
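The error output in the question shows a value of `0.39%`, which suggests FREQ is stored as text with a percent sign rather than as a number. Under that assumption, a rough Python sketch for pulling FREQ out of a sample column and converting it to a fraction might look like this. The function name and the sample values are made up for illustration; only the FORMAT string comes from the file.

```python
def freq_fraction(format_field, sample_field):
    """Return FREQ as a fraction (e.g. '0.39%' -> 0.0039), or None if absent."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    if "FREQ" not in keys:
        return None
    idx = keys.index("FREQ")
    if idx >= len(values):
        return None  # trailing fields may be dropped in a sample column
    raw = values[idx]
    if raw in (".", ""):
        return None  # missing value
    if raw.endswith("%"):
        return float(raw[:-1]) / 100.0
    return float(raw)

fmt = "GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR"
# Hypothetical sample column, invented for this example only.
sample = "0/1:20:150:150:100:50:33.33%:1E-5:30:30:50:50:25:25"
freq = freq_fraction(fmt, sample)
keep = freq is not None and freq > 0.1
```

Note that this treats the 0.1 threshold as a fraction (10%); if the intended cutoff is 0.1 percent, the comparison would need to change accordingly.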
Can you follow up on the answer given by sm.hashemin? My time is currently very limited.
Is FREQ defined in your VCF header? Can you paste a few sample variants?
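One way to check this from the notebook is to scan the `##FORMAT` header lines for the field and read off its declared type. The sketch below is only illustrative: the helper name is hypothetical and the header lines are invented examples, not taken from the file in question.

```python
import re

def format_field_type(header_lines, field_id):
    """Return the Type declared for a FORMAT field in the header, or None."""
    pattern = re.compile(
        r"^##FORMAT=<ID=" + re.escape(field_id) + r",.*?Type=([A-Za-z]+)"
    )
    for line in header_lines:
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None

# Invented header lines for illustration; a real file's header will differ.
header = [
    "##fileformat=VCFv4.1",
    '##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read depth">',
    '##FORMAT=<ID=FREQ,Number=1,Type=String,Description="Variant allele frequency">',
]
freq_type = format_field_type(header, "FREQ")
dp_type = format_field_type(header, "DP")
```

If the type comes back as String, a numeric comparison such as FREQ>0.1 would be expected to fail, which would be consistent with the "string comparison" error reported above.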