Hi, I am using bwa mem to map Illumina paired-end reads to the rat genome, and then using samtools sort to sort the BAM file. The BAM file is about 15 GB. However, millions of temp files were generated during the samtools sort step, and the sort could not finish because it hit the per-directory file limit on the school computer. I checked the temp files, and it looks like only one read was recorded in each file. That doesn't look right. Does anyone know what could be the problem? Thanks a lot!
You should start by adding the command that you used.
I would guess that you forgot to set the -m parameter appropriately. It defines the maximum amount of memory to use before spilling the data to a temp file. Check that you specified the unit correctly. Simply typing -m 1 would be 1 byte of memory, I think (which would explain the huge number of temp files, as basically every read gets its own file), but you need something like 1G for reasonable performance.
Actually, $MEM is set to 8G in the script. Assigning the memory via a variable must be causing some strange problem here (as linked by @tonor in one of the threads above).
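A minimal sketch of one workaround, assuming the old samtools build reads -m as a plain byte count and ignores a trailing G (this is not confirmed anywhere in the thread): convert the value to bytes before passing it on. mem_to_bytes is a hypothetical helper, not part of samtools, and in.bam and out_prefix are placeholder names.

```shell
# Convert a size such as 8G, 512M or 64K to a plain byte count.
# Hypothetical helper for illustration; it is not part of samtools.
mem_to_bytes() {
    value=$1
    case "$value" in
        *[Gg]) echo $(( ${value%[Gg]} * 1024 * 1024 * 1024 )) ;;
        *[Mm]) echo $(( ${value%[Mm]} * 1024 * 1024 )) ;;
        *[Kk]) echo $(( ${value%[Kk]} * 1024 )) ;;
        *)     echo "$value" ;;   # no suffix: pass through unchanged
    esac
}

MEM=8G
MEM_BYTES=$(mem_to_bytes "$MEM")
echo "$MEM_BYTES"

# The sort call would then look something like this (placeholder file names):
# samtools sort -m "$MEM_BYTES" in.bam out_prefix
```

If the sort behaves with the byte count but not with 8G, that would point at how the old version parses the suffix rather than at the variable itself.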
Yeah, it is probably the variable. I had the same issue on our CentOS server. Setting -m in a variable always caused trouble there, but the same script on OS X Mavericks worked fine.
Can you provide the command you used to sort with samtools?
samtools sort does produce multiple temp files, but millions is not normal.
Here is the command:
REF, FASTQ1, FASTQ2, and OUTDIR were defined elsewhere:
What version of samtools are you using?
Version 0.1.18, thanks!
Eeek! That is an ancient version of samtools. Please consider upgrading to the latest. Not sure why samtools would put one read per file in the temp files.
Definitely worth upgrading - if you can't - this post seems related:
[Samtools-help] samtools sort creates millions of files https://sourceforge.net/p/samtools/mailman/samtools-help/thread/638D9B69-C6AA-40E9-8E3E-D2F20407471D@bx.psu.edu/
It suggests the problem is to do with the -m option. If you manually run the sort command without any $MEM shortcuts, does it work?
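Before running it by hand, it may help to see exactly what the script hands to samtools. A small sketch: show_args is a hypothetical helper that only prints its arguments (it does not run samtools), and in.bam and out_prefix are placeholder names.

```shell
# Print every argument on its own line, wrapped in brackets, so an empty
# or unexpected value is easy to spot.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

MEM=8G
show_args samtools sort -m "$MEM" in.bam out_prefix
```

With MEM=8G the fourth line should read [8G]. An empty [] there would mean the variable was not set at the point where the sort command runs.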
OK. I will try on a newer version. I just found out we do have samtools/1.3.1, but 0.1.18 is the default setting.... Thanks for the suggestion.
Also - what operating system is your school computer?
Red Hat Enterprise Linux 6.x