2.8 years ago
bio_elle
I'm trying to use the -@ option on samtools sort to speed up the sorting of a file using multiple threads.
Using this line
samtools sort -n -@ 4 file.bam > file_sorted.bam
I get this error
[bam_sort_core] merging from 192 files and 4 in-memory blocks...
samtools sort: failed writing to "-"
Reading online I found that a solution could be using the -m option, giving each thread more memory, so I tried running this
samtools sort -n -@ 4 -m 2G file.bam > file_sorted.bam
Even with this option I get the same error. What could be the problem? Without the -@ option the sort works, but since my files range from 100 to 300 GB I'd like to speed up the process.
That error comes from a failure to write the final output file, in this case writing to stdout. The most likely cause I can think of is running out of disk space. So you could check to see if you have enough room for the tmp files and the end result.
It could be that stdout is being closed somehow, but if you are running from the command line I don't see how that could happen.
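A rough way to check the disk space before starting the sort is sketched below. The helper names free_kb, file_kb and enough_space are made up for illustration, and the 2x factor is only an assumption (temporary files plus the final output, each roughly the size of the input).

```shell
#!/bin/sh
# Sketch: compare free space in a directory with what a sort might need.
# Assumption: tmp files + final output need about 2x the input size.

# Print the free space, in KiB, on the filesystem holding directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Print the size of file $1 in KiB, rounded up.
file_kb() {
    bytes=$(wc -c < "$1")
    echo $(( (bytes + 1023) / 1024 ))
}

# Succeed if directory $2 has at least twice the size of file $1 free.
enough_space() {
    need=$(( $(file_kb "$1") * 2 ))
    have=$(free_kb "$2")
    [ "$have" -ge "$need" ]
}

# Example use, with file.bam being the input from the question:
# enough_space file.bam . && echo "looks OK" || echo "probably too little space"
```

Plain `df -h .` and `ls -lh file.bam` give the same information if you prefer to compare the numbers by eye.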
For the tmp files and the end result??
So if I have a 100GB file do I need to have 200GB of space (100 for the tmp files and 100 for the final product)??
That is double what I accounted for... if that is so then that is probably the problem. I am going to try again using the -@ option but I will delete the other sorted bam before doing so.
Could I use the -l option to compress the file? It will take longer, but if it really is a disk space problem then I shouldn't have trouble using multiple threads, and it might still be enough to make the sort go faster.
Basically, yes. samtools sort will put all the temporary files in the directory it is run from. To put the tmp files in /tmp you need to use the -T option.
The -m 2G option you used will let samtools sort use more memory for sorting, but it will only be 8G total (2G per thread), so you still need at least 92GB of disk storage for the rest of the BAM file in its partially sorted stage.
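The back-of-envelope numbers work out as follows. This is only a sketch of the estimate: the variable names are made up, the 100 GB input size is taken from the question, and it ignores any difference in compression between the data held in memory and the temporary files.

```shell
#!/bin/sh
# Rough estimate of how much data spills to temporary files.
threads=4              # value given to -@
mem_per_thread_gb=2    # value given to -m
input_gb=100           # size of the input BAM, from the question

# Memory available for sorting across all threads.
in_memory_gb=$(( threads * mem_per_thread_gb ))

# What does not fit in memory has to go to disk.
spill_gb=$(( input_gb - in_memory_gb ))

echo "in memory: ${in_memory_gb}G, temporary files: about ${spill_gb}G"
```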
I don't know how much -l will help, but you can give it a try. Another option would be writing the result as a CRAM file, which would give you better compression.
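Putting the -T suggestion together, the command could look something like the sketch below. The function name sort_by_name and the /tmp/file_sort prefix are only illustrative; use a prefix on a filesystem that has enough free space. It also writes the result with -o instead of redirecting stdout.

```shell
#!/bin/sh
# Sketch: name-sort a BAM with 4 threads and 2G per thread,
# keeping the temporary files under a chosen prefix.
sort_by_name() {
    # $1 = input BAM, $2 = output BAM, $3 = prefix for the temporary files
    samtools sort -n -@ 4 -m 2G -T "$3" -o "$2" "$1"
}

# Example use (hypothetical prefix):
# sort_by_name file.bam file_sorted.bam /tmp/file_sort
```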
what is the output of
and what is the output of
in the very same directory ?
The only output of the ls -lah command is the file_sorted.bam file. There are no tmp files, if that is what you're looking for. During the sort there were lots of them (I'm guessing 192, going by the error), but they disappeared after the sort finished with the error I showed. The remove_me_later.txt file seems empty: it looks like a file of empty lines (\n) and nothing else.
OK, I was checking that you had write permissions and that the filesystem supported files > 2G.
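For anyone wanting to run a similar check, here is one possible way to do it. This is a sketch and not necessarily the commands asked for above: check_dir and the probe file name are made up, and it assumes a dd that extends the file when given seek with count=0 (GNU and BSD dd both do).

```shell
#!/bin/sh
# Succeed if directory $1 is writable and can hold a file of $2 MiB
# (default 3072 MiB, i.e. larger than 2G). The probe file is sparse,
# so on most filesystems it takes almost no real disk space.
check_dir() {
    dir=$1
    size_mib=${2:-3072}
    probe="$dir/probe_delete_me.$$"    # hypothetical file name

    # Fails if the directory is missing or not writable.
    [ -w "$dir" ] || return 1

    # Seek to the target size and write nothing, which only sets the file length.
    dd if=/dev/zero of="$probe" bs=1048576 seek="$size_mib" count=0 2>/dev/null
    status=$?

    rm -f "$probe"
    return "$status"
}

# Example use in the directory where the sort is run:
# check_dir . && echo "writable, large files OK" || echo "check failed"
```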