genomicdbImport issue
3.6 years ago

Hello everyone,

I need some help. I have around 50,000 WES gVCFs that I am trying to merge with the GATK GenomicsDBImport tool so that I can do multi-sample calling later. Even though I am using the biggest node on our cluster, the merge is extremely slow: it takes almost 5-6 days to import a batch of only 5,000 samples. Does anyone have a solution to this problem? I have tried different things, but with no success so far.

gatk --java-options "-Xmx250g -Xms250g" GenomicsDBImport --genomicsdb-workspace-path $SCRATCH/database/UKB_Database --batch-size 5000 -L $SCRATCH/interval.list --sample-name-map $SCRATCH/cohort.sample_map.txt --tmp-dir $SCRATCH/temp/ --reader-threads 6

gatk --java-options "-Xmx250g -Xms250g" GenomicsDBImport --genomicsdb-update-workspace-path $SCRATCH/database/Batch1 --batch-size 5000 --sample-name-map $SCRATCH/batch0.txt --tmp-dir $SCRATCH/temp/ --reader-threads 6

These are the commands I am using. Since I only need some selected regions, in the first command I pass an interval list in .bed format.

Looking forward to hearing from you. Thanks in advance.

Kind Regards, Haider

genomicdbImport gatk • 958 views

Split your $SCRATCH/interval.list per chromosome and run a separate GenomicsDBImport for each chromosome.
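
A minimal sketch of that split, assuming the interval file is tab-separated BED with the contig name in column 1 (as described in the question). The function name `split_intervals_by_chrom`, the output directory, and the per-chromosome workspace names in the commented loop are hypothetical; the GATK flags are the ones already used in the question.

```shell
#!/usr/bin/env bash
# Split a BED-style interval file into one file per chromosome.
# Usage: split_intervals_by_chrom <intervals.bed> <output_dir>
# Writes <output_dir>/<contig>.bed for every contig found in column 1.
split_intervals_by_chrom() {
    local bed="$1" outdir="$2"
    mkdir -p "$outdir"
    awk -v dir="$outdir" '
        # Skip BED header and comment lines.
        /^(#|track|browser)/ { next }
        # Route each interval to the file named after its contig.
        # ">" truncates only the first time awk opens each file in this run.
        NF >= 3 { out = dir "/" $1 ".bed"; print > out }
    ' "$bed"
}

# Hypothetical usage: one import per chromosome, each into its own
# workspace (the workspace names below are an assumption, not from the thread).
#
# split_intervals_by_chrom "$SCRATCH/interval.list" "$SCRATCH/intervals_by_chrom"
# for f in "$SCRATCH"/intervals_by_chrom/*.bed; do
#     chrom=$(basename "$f" .bed)
#     gatk --java-options "-Xmx250g -Xms250g" GenomicsDBImport \
#         --genomicsdb-workspace-path "$SCRATCH/database/UKB_Database_$chrom" \
#         --batch-size 5000 -L "$f" \
#         --sample-name-map "$SCRATCH/cohort.sample_map.txt" \
#         --tmp-dir "$SCRATCH/temp/" --reader-threads 6
# done
```

Because each chromosome gets its own workspace, the imports are independent and can be submitted as separate cluster jobs instead of one long run on a single node. In that case the memory given to each job would probably need to be lowered to fit several jobs on a node.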


Hi, thanks, it worked.
