I sorted my RNA-Seq BAM files after mapping with STAR. Here is the key command, which runs inside a for loop in a bash script:
samtools sort $file -o ${filename}.sorted.bam
Afterwards, one of the error log files looks like this:
[bam_sort_core] merging from 6 files and 1 in-memory blocks...
[bam_sort_core] merging from 4 files and 1 in-memory blocks...
[bam_sort_core] merging from 8 files and 1 in-memory blocks...
[bam_sort_core] merging from 9 files and 1 in-memory blocks...
[bam_sort_core] merging from 7 files and 1 in-memory blocks...
[bam_sort_core] merging from 4 files and 1 in-memory blocks...
[bam_sort_core] merging from 5 files and 1 in-memory blocks...
[bam_sort_core] merging from 3 files and 1 in-memory blocks...
[bam_sort_core] merging from 6 files and 1 in-memory blocks...
[bam_sort_core] merging from 13 files and 1 in-memory blocks...
[bam_sort_core] merging from 8 files and 1 in-memory blocks...
[bam_sort_core] merging from 2 files and 1 in-memory blocks...
...
Is something wrong here? I searched but still don't know why these messages appear, so I'm not sure whether these sorted files can be used for the next step.
What's more, I have M (about 63) files, but there are only M-N (about 6, so just a few) such lines. Does that mean not every file runs into this problem? That confuses me even more.
Any ideas will be appreciated!
If you have the memory, you can reduce the number of temporary files by increasing the default memory usage from 768 MB to, say, 2 GB using the -m option, e.g. samtools sort -m 2G -o out.bam in.bam. Be sure never to use something like -m 2 rather than -m 2G, as this would set the memory limit to 2 bytes, resulting in thousands of tmp files and eventually crashing the system.
Got it, many thanks!
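If the -m value comes from a variable in a loop script, a small guard can catch the missing-suffix mistake before samtools ever runs. This is only a sketch: the function name has_mem_suffix and the variable mem are made up for illustration, and the check simply requires a number followed by K, M or G.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of samtools): succeed only if the value
# is a number followed by an explicit K, M or G suffix, e.g. 768M or 2G.
has_mem_suffix() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+[KMGkmg]$'
}

mem="2G"   # value intended for samtools sort -m (illustrative)
if has_mem_suffix "$mem"; then
    echo "ok: -m $mem"
else
    echo "refusing -m $mem: missing K/M/G suffix" >&2
fi
```

With mem="2" the guard takes the else branch, so the sort command is never started with a byte-sized limit.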
So is it really an error? And does anyone know how to set -@ and -m, and which of the two is more useful when dealing with hundreds of BAMs at the same time on a cluster? Thanks a lot.
Please use "Add Comment" for comments. As Istvan explained, these are just status messages, neither errors nor warnings. Set -@ and -m as you like, but these options still deal with one file at a time. If you want things parallelized, have a look at GNU parallel, which can sort all BAM files in your current directory, 8 at a time, with 2 cores and 2 GB of memory per core each.
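A command along those lines could look like the sketch below. It assumes GNU parallel and samtools are on the PATH. The function name sort_bams_parallel is made up for illustration, and the -j, -@ and -m values simply mirror the numbers in the description above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: sort the given BAM files with GNU parallel.
#   -j 8   run 8 samtools jobs at a time
#   -@ 2   2 additional sorting threads per job
#   -m 2G  memory per thread
# {} is the input file, {.} is the input file without its extension,
# so in.bam is written to in.sorted.bam.
sort_bams_parallel() {
    parallel -j 8 samtools sort -@ 2 -m 2G -o '{.}.sorted.bam' '{}' ::: "$@"
}

# Example call (commented out so that nothing runs on sourcing):
# sort_bams_parallel *.bam
```

Note that -m is per thread, so these settings can use up to roughly 8 x 2 x 2 GB = 32 GB in total. If you rerun it in the same directory, *.bam will also match the *.sorted.bam files already written, so pass a narrower file list in that case.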
By the way, will this "error" lead to a wrong result (${filename}.sorted.bam)?
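Since these are only status messages, the output should be fine, but you can check a sorted file yourself if you want reassurance. The sketch below assumes a samtools version that provides the quickcheck subcommand. The function name is_coordinate_sorted is made up for illustration. It only tests that the file is not truncated and that the header declares coordinate sorting, which is not a full validation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the BAM passes samtools quickcheck
# (intact header and end-of-file marker) and its @HD header line
# declares SO:coordinate, as written by samtools sort.
is_coordinate_sorted() {
    samtools quickcheck "$1" &&
        samtools view -H "$1" | grep -q '^@HD.*SO:coordinate'
}

# Example call (commented out so that nothing runs on sourcing):
# for f in *.sorted.bam; do
#     is_coordinate_sorted "$f" || echo "check failed: $f" >&2
# done
```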