GATK jointgenotyping file locking and too many open files woes
9.4 years ago
nchuang ▴ 260

Hi guys,

this has been plaguing me for weeks. For some reason it wasn't a major issue when I first ran it, but now it just won't run. Here is my problem:

I split the runs by chromosome to help with parallelism.

  • I first had around 70 human WGS samples to joint-genotype before VQSR. I ran GenotypeGVCFs with Mem=32G and -nt 8.
  • Then I had 130 human WGS samples to joint-genotype before VQSR. I ran it with Mem=64G and -nt 8. I would get a couple of file locking errors, and sometimes a random *.idx file could not be read. I was still able to run 4 chromosomes at a time.
  • Then I had 101 human WGS samples to joint-genotype. I could not get it to run without Mem=512G and -nt 8, and I could only run 2-3 chromosomes at once. Same error messages.

Now I am trying to joint-genotype about 125 human WGS samples and I can't even run one job with Mem=512G and -nt 8. I tried scaling down as low as 64G/4 cores, and many combinations in between. The file locking and unreadable index file errors are now more frequent. I thought about using the CombineVariants tool, but it would take about 4 weeks to combine them all.

All of these were run with GATK 3.4. I asked on their forum and was told to "look into Queue", which looks like a lot of work to learn just to get past the joint genotyping step.

Please help! I've been stuck on this for weeks!

Edit: I used the --disable_auto_index_creation_and_locking_when_reading_rods argument, which resolved the file locking errors, but now I get "Too many open files" errors.

GATK • 4.4k views
9.4 years ago
muwenbo.prc ▴ 10

Try this parameter --disable_auto_index_creation_and_locking_when_reading_rods.

Here is a more detailed discussion.

I literally just found that too! Thanks so much. I hope it will work. I can't believe it took me over a month to discover this, because I was searching for "file lock error" instead of "unable to lock file".

Actually, now I am getting a "Too many open files" error. I checked, and my limit on open files is reported as unlimited, so I don't know why I am getting this error.
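
One thing worth double-checking here, offered as a sketch and not a confirmed diagnosis: in bash, plain `ulimit` with no option reports the file-size limit (-f), which is often "unlimited", while the open-files limit is `ulimit -n`. The snippet below prints the soft and hard open-file limits for the current shell and tries to raise the soft limit to the hard one. Whether this is enough for ~125 gVCFs with -nt 8 is an assumption; on a cluster, the limit inside the job may also differ from the login shell.

```shell
# Show the per-process open-file limits for this shell (not the file-size limit).
soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
echo "open files: soft=$soft hard=$hard"

# Try to raise the soft limit up to the hard limit for this shell and its children.
# This can fail on some systems, so report it instead of aborting.
if ! ulimit -Sn "$hard" 2>/dev/null; then
    echo "could not raise soft limit to $hard"
fi
echo "open files now: $(ulimit -Sn)"
```

Run this in the same shell (or job script) that launches java, since the limit is inherited by child processes.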

Could you share the command line of your GATK jointgenotyping?

This is just an example. I forgot how to pass a list of files, so I just typed out every --variant argument on the command line. This is for chromosome X.

java -jar -Xmx512G /GATK3.4/GenomeAnalysisTK.jar -T GenotypeGVCFs \
    -R /local/projects-t3/1000G/human_g1k_v37_decoy.fasta \
    --variant /local/projects-t3/1000G/HG01789.variants.vcf (--variant x100 more times) \
    --disable_auto_index_creation_and_locking_when_reading_rods \
    -L X -nt 8 -o /local/projects-t3/1000G/GBR.X.vcf
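
On the list question: as far as I know, GATK 3 accepts a text file with one input path per line in place of repeated --variant arguments, as long as the file name ends in .list. A minimal sketch of building such a file is below. The function name `make_gvcf_list` and the list file name are made up for illustration, and the `*.variants.vcf` pattern is assumed from the single file name shown in the command.

```shell
# Hypothetical helper: write one gVCF path per line into a .list file.
# $1 = directory holding the per-sample gVCFs, $2 = output list file.
make_gvcf_list() {
    # Only look in the top level of the directory; sort for a stable order.
    find "$1" -maxdepth 1 -name '*.variants.vcf' | sort > "$2"
}

# Usage, reusing the paths and flags from the command above:
#   make_gvcf_list /local/projects-t3/1000G GBR.gvcfs.list
#   java -jar -Xmx512G /GATK3.4/GenomeAnalysisTK.jar -T GenotypeGVCFs \
#       -R /local/projects-t3/1000G/human_g1k_v37_decoy.fasta \
#       --variant GBR.gvcfs.list \
#       --disable_auto_index_creation_and_locking_when_reading_rods \
#       -L X -nt 8 -o /local/projects-t3/1000G/GBR.X.vcf
```

Checking the list with `wc -l` before launching is a cheap way to confirm that every sample made it in.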