How to give multiple input files to GATK?
3.3 years ago
Zahra ▴ 110

Hi all,

I want to use gatk-package-4.2.0.0 to update the sequence dictionary of my VCF files. I use the command below for a single VCF file, and it has worked so far.

 java -jar gatk-package-4.2.0.0-local.jar UpdateVCFSequenceDictionary \
   -V /PATH_TO_INPUT_FILE.vcf --sequence-dictionary /PATH_TO_DICTIONARY.dict -O /PATH_TO_OUTPUT_FILE.vcf

Now I have multiple input files with different names. How should I change my script to update the dictionary of all files?

Thanks for any help

dictionary vcf gatk • 1.1k views

Use a for loop? Something like:

for i in /PATH_TO/*.vcf
  do
    # strip the directory and the .vcf suffix to get the sample name
    name=$(basename "${i}" .vcf)
    java -jar gatk-package-4.2.0.0-local.jar UpdateVCFSequenceDictionary -V "${i}" --sequence-dictionary /PATH_TO_DICTIONARY.dict -O "/PATH_TO/${name}_out.vcf"
  done
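
If you want to check the loop before running anything, here is a minimal dry-run sketch. It only prints the command for each file instead of calling GATK. The function names `out_path` and `print_commands` and the separate output directory are my own assumptions, not part of GATK; writing outputs to a different directory keeps a second run from picking up the `_out.vcf` files as inputs.

```shell
# Hypothetical helper: map an input VCF path ($1) to its output path in
# the output directory ($2), e.g. /in/a.vcf -> /out/a_out.vcf
out_path() {
  printf '%s/%s_out.vcf\n' "$2" "$(basename "$1" .vcf)"
}

# Hypothetical helper: print (do not run) one GATK command per .vcf file.
# $1 = input directory, $2 = dictionary file, $3 = output directory
print_commands() {
  pc_in_dir=$1
  pc_dict=$2
  pc_out_dir=$3
  for pc_vcf in "$pc_in_dir"/*.vcf; do
    # if the glob matched nothing, the literal pattern is left; skip it
    [ -e "$pc_vcf" ] || continue
    printf 'java -jar gatk-package-4.2.0.0-local.jar UpdateVCFSequenceDictionary -V "%s" --sequence-dictionary "%s" -O "%s"\n' \
      "$pc_vcf" "$pc_dict" "$(out_path "$pc_vcf" "$pc_out_dir")"
  done
}
```

For example, `print_commands /PATH_TO /PATH_TO_DICTIONARY.dict /PATH_TO/updated` lists the commands. Once they look right, the same output can be piped to `sh` to run them, as long as the paths contain no quotes or `$` characters and the output directory already exists.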

This would just run the command on each of your VCF files individually. If what you want is to run one command on multiple input files together, then on tools that accept more than one variant input you can either use the -V parameter multiple times (like -V one.vcf -V two.vcf -V three.vcf) or use a text file listing all your inputs (one file path per line). I’m a little fuzzy on the exact syntax for that option, but it may be as simple as -V my_inputs.txt

This should be documented in the -V parameter paragraph of the tool doc. You can also always ask in the GATK forum; they are super helpful (disclaimer: I used to run the GATK forum).
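
For the list-file route, a minimal sketch of building such a file (one path per line) could look like the following. The function name `make_input_list` is hypothetical. The `.list` extension in the example is an assumption on my part: as far as I recall, GATK4 only expands argument files with certain extensions, so check the tool doc. I have also not verified that UpdateVCFSequenceDictionary itself accepts more than one -V input.

```shell
# Hypothetical helper: write the path of every .vcf file in a directory
# ($1) to a list file ($2), one path per line.
make_input_list() {
  mil_dir=$1
  mil_list=$2
  : > "$mil_list"                      # create or truncate the list file
  for mil_vcf in "$mil_dir"/*.vcf; do
    # if the glob matched nothing, the literal pattern is left; skip it
    [ -e "$mil_vcf" ] || continue
    printf '%s\n' "$mil_vcf" >> "$mil_list"
  done
}

# Example (placeholder paths, assumed .list extension):
#   make_input_list /PATH_TO my_inputs.list
#   ... -V my_inputs.list ...
```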
