Dealing with GATK illegal character
4.3 years ago
MAPK ★ 2.1k

I am trying to run GenomicsDBImport in GATK.

/gatk --java-options "-Xms4G -Xmx32G -DGATK_STACKTRACE_ON_USER_EXCEPTION=true" GenomicsDBImport \
       --genomicsdb-workspace-path ${WORKDIR} \
       --batch-size 50 \
       -L chrX \
       --sample-name-map mylist.list \
       --tmp-dir=${tmpPATH} \
       --max-num-intervals-to-import-in-parallel 10 \
       --reader-threads 16

I am supplying the list of GVCFs through a sample map (mylist.list). The GVCF file names follow the pattern sample_name^Barcode^project_name, and with this input GATK reports an illegal character error for the input GVCF files because of the ^. Is there a way to deal with this without renaming the input GVCF files? This is my mylist.list:

CAP_100       /dir/CAP_100^8036056040^201909_SEY.aln.srt.isec-paddedexome.markd.recal.raw.snps.indels.g.vcf.gz       
CAP_28       /dir/CAP_28^8036056033^201909_SEY.aln.srt.isec-paddedexome.markd.recal.raw.snps.indels.g.vcf.gz       
CAP_2       /dir/CAP_2^8036056474^201909_SEY.aln.srt.isec-paddedexome.markd.recal.raw.snps.indels.g.vcf.gz

This is the error message I am getting:

A USER ERROR has occurred: Malformed URI java.net.URISyntaxException: Illegal character in path at index 49: /dir/CAP_100^8036056040^201909_SEY.aln.srt.isec-paddedexome.markd.recal.raw.snps.indels.g.vcf.gz

You can try escaping the ^ characters, like so:

CAP_100       /dir/CAP_100\^8036056040\^201909_SEY.aln.srt.isec-paddedexome.markd.recal.raw.snps.indels.g.vcf.gz       
CAP_28       /dir/CAP_28\^8036056033\^201909_SEY.aln.srt.isec-paddedexome.markd.recal.raw.snps.indels.g.vcf.gz       
CAP_2       /dir/CAP_2\^8036056474\^201909_SEY.aln.srt.isec-paddedexome.markd.recal.raw.snps.indels.g.vcf.gz

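Using a caret in a filename was a bad idea: as your error message shows, GATK parses the path as a URI, and ^ is not a legal character there. You're better off either renaming the files or creating soft links with clean names that point to these files.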

cd /dir/
ln -s CAP_100\^8036056040\^201909_SEY.aln.srt.isec-paddedexome.markd.recal.raw.snps.indels.g.vcf.gz CAP_100__8036056040.softlink.vcf.gz
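
If there are many samples, the same idea can be scripted. The following is only a minimal sketch, not a tested pipeline step. The function name make_clean_links, the links directory and the output map name are all hypothetical. It assumes the map has two whitespace-separated columns (sample, path) and that the paths contain no spaces. It also links a .tbi index when one sits next to the GVCF, since as far as I know GenomicsDBImport expects an index alongside each input file.

```shell
# Sketch: create caret-free symlinks for every GVCF listed in a sample map
# and write a new tab-separated map that points at the links.
# Usage: make_clean_links <map_in> <link_dir> <map_out>
make_clean_links() {
    local map_in=$1 link_dir=$2 map_out=$3
    local sample path src_dir base clean link_abs

    mkdir -p "$link_dir" || return 1
    link_abs=$(cd "$link_dir" && pwd) || return 1
    : > "$map_out"                                   # start with an empty map

    # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
    while read -r sample path || [ -n "$sample" ]; do
        [ -n "$sample" ] || continue                 # skip blank lines
        src_dir=$(cd "$(dirname "$path")" && pwd) || return 1
        base=$(basename "$path")
        clean=$(printf '%s' "$base" | tr '^' '_')    # replace every ^ with _

        ln -sf "$src_dir/$base" "$link_abs/$clean"
        # Link the index too, if one exists next to the GVCF.
        if [ -e "$src_dir/$base.tbi" ]; then
            ln -sf "$src_dir/$base.tbi" "$link_abs/$clean.tbi"
        fi

        printf '%s\t%s\n' "$sample" "$link_abs/$clean" >> "$map_out"
    done < "$map_in"
}
```

You would then call it as make_clean_links mylist.list links mylist.clean.list and pass mylist.clean.list to --sample-name-map. The original files stay untouched; only the links carry the clean names.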

Escaping the ^ characters did not work. I will try soft links instead.
