Select only deletions in a VCF file
2.8 years ago
alexmondaini ▴ 20

Apparently the Broad Institute discourages people from writing their own scripts/parsers for VCF files (https://gatk.broadinstitute.org/hc/en-us/articles/360035531692-VCF-Variant-Call-Format). Quoting from the link:

"No, really, do not write your own parser if you can avoid it. This is not a comment on how smart or how competent we think you are -- it is a comment on how annoyingly obtuse and convoluted the VCF format is. "

So I used their own GATK tools to select the variants I'm interested in. The problem is that when I filter for INDELs only, my output VCF still contains insertions, and I want only deletions.

This is my GATK command:

gatk SelectVariants -V ~{vcf} -O ~{output_vcf} --select-type-to-include INDEL

Does anyone know of another tool that can select only deletions, or is GATK capable of doing that? I read the tool documentation, and apparently it cannot perform this task.

vcf gatk • 1.3k views
2.8 years ago

Using vcffilterjdk (http://lindenb.github.io/jvarkit/VcfFilterJdk.html), you can keep the records where at least one non-symbolic ALT allele is shorter than REF:

java -jar ${JVARKIT_DIST}/vcffilterjdk.jar -e 'return variant.getAlternateAlleles().stream().filter(A->!A.isSymbolic()).anyMatch(A->A.length() < variant.getReference().length());' ./test/resources/rotavirus_rf.vcf.gz

(...)
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO
RF02    1962    .   TACA    TA  33.43   .   AC=1;AN=10;DP=43;DP4=22,11,2,0;HOB=0.02;ICB=0.0439024;IDV=3;IMF=0.3;INDEL;MQ=60;MQ0F=0;MQSB=1;SGB=0.810227;VDB=0.373246
RF04    1259    .   ATTT    ATT 41.02   .   AC=1;AN=10;DP=39;DP4=8,6,2,1;HOB=0.02;ICB=0.0439024;IDV=6;IMF=1;INDEL;MQ=60;MQ0F=0;MQSB=1;SGB=2.33665;VDB=0.869183
RF04    1857    .   CAGA    CA  39.47   .   AC=1;AN=10;DP=45;DP4=12,21,1,1;HOB=0.02;ICB=0.0439024;IDV=2;IMF=0.166667;INDEL;MQ=60;MQ0F=0;MQSB=1;SGB=0.810227;VDB=0.969947
RF06    1129    .   ATTT    AT  78  .   AC=2;AN=10;DP=31;DP4=0,20,0,5;HOB=0.32;ICB=0.425;IDV=5;IMF=0.833333;INDEL;MQ=60;MQ0F=0;SGB=5.5074;VDB=0.344193
RF11    74  .   CAAAAA  CAA 113 .   AC=1;AN=10;DP=82;DP4=55,0,5,0;HOB=0.02;ICB=0.0439024;IDV=7;IMF=0.388889;INDEL;MQ=60;MQ0F=0;SGB=5.5074;VDB=0.00911888
(...)
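
To make the logic of that expression easier to follow, here is a minimal sketch of the same test on plain REF/ALT strings. It does not use htsjdk or jvarkit: the class `DeletionCheck` and its methods are hypothetical names for illustration, and `isSymbolic` here is only a rough string approximation of `Allele.isSymbolic()` (treating `*` and `.` as non-literal is my own assumption).

```java
// Sketch only: mirrors the idea of the vcffilterjdk expression on plain strings.
// DeletionCheck, isSymbolic and hasDeletion are hypothetical names, not jvarkit/htsjdk API.
class DeletionCheck {

    // Rough approximation: alleles like <DEL>, breakends, '*' or '.' carry no
    // literal sequence, so their length cannot be compared with REF.
    static boolean isSymbolic(String alt) {
        return alt.startsWith("<") || alt.contains("[") || alt.contains("]")
                || alt.equals("*") || alt.equals(".");
    }

    // True if at least one literal ALT allele is shorter than REF (a deletion).
    static boolean hasDeletion(String ref, String[] alts) {
        for (String alt : alts) {
            if (!isSymbolic(alt) && alt.length() < ref.length()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDeletion("TACA", new String[]{"TA"}));              // true
        System.out.println(hasDeletion("A", new String[]{"ATT"}));                // false (insertion)
        System.out.println(hasDeletion("CAAAAA", new String[]{"CAA", "CAAAAAA"})); // true (mixed record)
        System.out.println(hasDeletion("A", new String[]{"<DEL>"}));              // false (symbolic)
    }
}
```

Note that because the test is "any ALT is shorter", a multi-allelic record carrying both an insertion and a deletion is kept as a whole. If you need deletion-only records, you would have to require that every literal ALT is shorter than REF, or split multi-allelic records first.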