I have a VCF file with structural variants (mainly deletions), and I want to keep only the deletions above a specific size (>10 kb, for example) and write those deletions to a separate VCF file. How can I do this?
A similar question has already been answered here: Filtering Gatk Indel Output Length 20+
Hi! I was just looking for this, and since this is the top result in a Google search, I thought it would be useful to answer with the current GATK version (4.2.6.1).
What worked for me is this command:
gatk SelectVariants \
-V input.vcf \
--select-type-to-include DEL \
--min-indel-size 10000 \
-O output.vcf
These arguments, and more, are documented at: https://gatk.broadinstitute.org/hc/en-us/articles/5358856605339-SelectVariants#--variant
I used the Docker image for GATK. It works fine and saves a lot of time. This is the tutorial I followed: https://gatk.broadinstitute.org/hc/en-us/articles/360035889991--How-to-Run-GATK-in-a-Docker-container#article-comments
Quick disclaimer: I'm new to bioinformatics, so there may be other ways to do this filtering with other tools, but I found GATK to be the easiest and most straightforward.
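As one possible alternative without GATK, here is a minimal Python sketch (standard library only) that keeps deletions of at least a given size. It is not taken from the GATK documentation, and the function names (`parse_info`, `deletion_size`, `filter_deletions`) are made up for illustration. It assumes an uncompressed, tab-separated VCF in which symbolic `<DEL>` records carry `SVLEN` or `END` in the INFO column, and it falls back to the REF/ALT length difference for deletions written with explicit alleles. Multi-allelic records are skipped.

```python
import sys


def parse_info(info):
    """Turn a VCF INFO string into a dict of key=value pairs (flags are ignored)."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        if sep:
            fields[key] = value
    return fields


def to_int(text):
    """Return int(text), or None if text is not a plain integer."""
    try:
        return int(text)
    except ValueError:
        return None


def deletion_size(record_line):
    """Return the deletion length in bp, or None if the record is not a usable deletion."""
    cols = record_line.rstrip("\n").split("\t")
    pos, ref, alt = int(cols[1]), cols[3], cols[4]
    info = parse_info(cols[7])

    # Symbolic deletion: take the size from SVLEN, else from END - POS (assumed INFO keys).
    if alt == "<DEL>" or info.get("SVTYPE") == "DEL":
        svlen = to_int(info.get("SVLEN", ""))
        if svlen is not None:
            return abs(svlen)
        end = to_int(info.get("END", ""))
        if end is not None:
            return end - pos
        return None

    # Explicit alleles: a single plain-sequence ALT shorter than REF is a deletion.
    if set(alt.upper()) <= set("ACGTN") and len(ref) > len(alt):
        return len(ref) - len(alt)
    return None


def filter_deletions(lines, min_size=10000):
    """Yield header lines unchanged, plus records that are deletions of at least min_size bp."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        size = deletion_size(line)
        if size is not None and size >= min_size:
            yield line


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python filter_dels.py input.vcf output.vcf
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        dst.writelines(filter_deletions(src))
```

The threshold here is inclusive; change `>=` to `>` if you want strictly more than 10 kb. For bgzipped input you would need to open the file with `gzip.open(path, "rt")` instead.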
Hope it helps!
Thank you very much!