Hi all,
I ran a batch job on a high-performance computing system to sort aligned reads and used GNU parallel to speed up the work, but the job failed with the following error:
parallel: Error: Output is incomplete. Cannot append to buffer file in $TMPDIR. Is the disk full?
parallel: Error: Change $TMPDIR with --tmpdir or use --compress.
My job script looks like this:
#!/bin/bash
module load samtools/1.2
export TMPDIR=/scratch/$SLURM_JOBID
cd /data
ls *sam* | parallel "samtools sort -T /scratch/$SLURM_JOBID/{.} -O bam -o {}.bam {}"
Does anyone know how to solve this problem? Thank you in advance.
Best regards
Have you checked that you're not using up all of the available scratch space?
Yes, I checked. I allocated 500 GB scratch space.
The main problem is that the parallel process "Cannot append to buffer file in $TMPDIR", as described in my post. I need to change the default tmpdir to a new one, but I am not clear on how to set it.

I meant: did you actually check that it wasn't full while you were trying to run these jobs? Typically, scratch space on HPC cluster nodes isn't dedicated to a single user, and anyone using that node can write to it. So you might not have all of the space you need. I don't know that SLURM will check that there is 500 GB of scratch available; I think it only checks that you haven't exceeded that amount.
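One quick way to watch this while the jobs run (a sketch; $TMPDIR stands in for /scratch/$SLURM_JOBID from your job script, falling back to /tmp so the commands work anywhere):

```shell
# Check scratch usage while the job is running.
SCRATCH="${TMPDIR:-/tmp}"

# Free space left on the filesystem backing the scratch area
df -h "$SCRATCH"

# Space the directory itself is currently consuming
# (unreadable subdirectories are skipped)
du -sh "$SCRATCH" 2>/dev/null || true
```

Running this a few times during the job will show whether the scratch area actually fills up before parallel dies.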
$TMPDIR is being set with the line export TMPDIR=/... . You can change TMPDIR by changing /scratch/$SLURM_JOBID to whatever path you want to use. If they're temporary files, however, the proper usage would be to use scratch as you are doing.

GNU parallel can also run on multiple nodes; have you considered trying this?
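In case it helps, a rough sketch of what that could look like (untested here; it assumes passwordless SSH between the allocated nodes, samtools on each node's PATH, and /data and /scratch shared across nodes; guarded so it is a no-op outside a SLURM job):

```shell
# Spread the samtools jobs over every node in the allocation.
if command -v scontrol >/dev/null 2>&1 && [ -n "${SLURM_NODELIST:-}" ]; then
    # Expand the SLURM allocation into one hostname per line
    scontrol show hostnames "$SLURM_NODELIST" > nodefile.txt
    cd /data
    # --sshloginfile: run jobs on the listed hosts
    # --workdir: start remote jobs in /data so {} resolves there
    ls *sam* | parallel --sshloginfile nodefile.txt --workdir /data \
        "samtools sort -T /scratch/$SLURM_JOBID/{.} -O bam -o {}.bam {}"
fi
```

Note this only spreads the CPU load; if every node writes to the same shared scratch, it does nothing for the disk-space problem by itself.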
Thank you for your comments and advice.
I haven't checked how much space my job used, and I set only one node, but I can try multiple nodes.
I have already tried export TMPDIR=/scratch/$SLURM_JOBID, but it doesn't work. Based on the error information parallel gave, I guess I haven't set TMPDIR correctly.
It is always a good idea to check the amount of resources your job used; that way you can make sure you're not hogging resources by over-allocating, and that any crashes are not due to running out of resources. You should also check that files (either temporary or final results) are being written to the correct places. You should never keep retrying jobs and adding resources until something works; you'll waste a ton of time.
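For checking a finished job, SLURM's accounting tool can report what it actually consumed. A sketch (the job ID is a placeholder, and the field list is an example; guarded so it is a no-op on machines without SLURM):

```shell
# Ask SLURM's accounting database what a finished job used.
# 1234567 is a placeholder job ID; MaxDiskWrite is only populated
# if your cluster's accounting is configured to record it.
if command -v sacct >/dev/null 2>&1; then
    sacct -j 1234567 --format=JobID,Elapsed,MaxRSS,MaxDiskWrite,State
fi
```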
The error is probably because parallel can't write to TMPDIR. The question is whether that's because the disk is full, because parallel can't find or access TMPDIR, or because TMPDIR doesn't exist.

The reason to check the usage of your scratch space is that parallel caches result files in a temporary directory. By default that is /var/tmp/, but if $TMPDIR is set, it caches the result files there. So it seems like both samtools and parallel are writing to /scratch/.../, which might mean you're using more space than you think.
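Putting the error message's own two suggestions into the original script, one possible fix (a sketch, not verified on your cluster) is to make sure the temp directory actually exists and to tell parallel explicitly where to buffer, with compression to cut scratch usage:

```shell
#!/bin/bash
module load samtools/1.2
export TMPDIR=/scratch/$SLURM_JOBID
mkdir -p "$TMPDIR"   # parallel cannot buffer into a directory that doesn't exist yet
cd /data
# --tmpdir: where parallel buffers each job's output
# --compress: compress those buffer files to save scratch space
ls *sam* | parallel --tmpdir "$TMPDIR" --compress \
    "samtools sort -T $TMPDIR/{.} -O bam -o {}.bam {}"
```

If the error persists after this, it really is likely that the 500 GB is filling up, since samtools sort writes its temporary chunks to that same scratch directory.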