Hello, everyone!
While studying the STAR manual, I learned that multiple samples can be mapped in a single run by passing comma-separated file lists to --readFilesIn.
For paired-end reads,
--readFilesIn sample1read1.fq,sample2read1.fq sample1read2.fq,sample2read2.fq
But until now I have mapped multiple samples with a "for" loop. That is, the samples have been mapped one by one.
Mapping multiple samples at once using the parameter
vs
Mapping multiple samples one by one using a "for" loop
If I use the same number of threads, which way is more efficient?
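For reference, the single-run form could be assembled roughly like this. It is only a sketch: the sample names follow the example above, while genome_index and the thread count of 8 are placeholders I made up. Note that the lists must be comma-separated with no spaces, and the command is printed rather than executed.

```shell
# Hypothetical sample names, matching the file names used in the question.
samples="sample1 sample2"

read1_list=""
read2_list=""
for s in $samples; do
  # Append each file name, adding a comma only between entries (no spaces).
  read1_list="${read1_list:+$read1_list,}${s}read1.fq"
  read2_list="${read2_list:+$read2_list,}${s}read2.fq"
done

# Dry run: print the command. genome_index and 8 threads are placeholders.
echo STAR --genomeDir genome_index --runThreadN 8 \
  --readFilesIn "$read1_list" "$read2_list"
```

One thing to keep in mind: a single run like this writes all samples into one set of output files, so if the samples need to be told apart afterwards, the manual's --outSAMattrRGline option (one read group per input file) is probably worth a look.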
I think the STAR developer is best positioned to answer this, as he has most probably run tests. In any case, I think any gain in the former case would come from the genome being loaded only once, although the same can be achieved in the latter case too by keeping the genome in shared memory.
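A minimal sketch of the loop variant with the genome kept in shared memory, assuming the --genomeLoad option behaves as described in the STAR manual (LoadAndKeep to load once and reuse, Remove to unload at the end). The genome directory, thread count, and output prefixes are placeholders, and the commands are only printed here, not run.

```shell
genome_dir="genome_index"   # placeholder path to the STAR index
threads=8                   # placeholder thread count

# Print the STAR command for one paired-end sample (dry run).
star_cmd() {
  echo STAR --genomeDir "$genome_dir" --genomeLoad LoadAndKeep \
    --runThreadN "$threads" \
    --readFilesIn "${1}read1.fq" "${1}read2.fq" \
    --outFileNamePrefix "${1}_"
}

for s in sample1 sample2; do
  star_cmd "$s"
done

# After the last sample, the shared genome would be unloaded.
echo STAR --genomeDir "$genome_dir" --genomeLoad Remove
```

With this, the first run pays the cost of loading the index and the later runs attach to the copy already in memory. I have not benchmarked it against the single-run form.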
The latter case, when modified to run one sample per node, allows for better parallelization. A sequential loop is the least efficient way to do things, IMO.
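As an illustration of one sample per node, here is a sketch that assumes a SLURM cluster and a job array (submitted with sbatch). The scheduler, the resource values, genome_index, and the pick_sample helper are all my assumptions, not something from the question. Each array task picks its own sample and prints the STAR command it would run.

```shell
#!/bin/sh
#SBATCH --array=1-2          # one task per sample (assumed: 2 samples)
#SBATCH --cpus-per-task=8    # placeholder thread count

# Hypothetical helper: print the sample at 1-based index $1
# from the remaining arguments.
pick_sample() {
  idx=$1
  shift
  shift $((idx - 1))
  echo "$1"
}

# SLURM sets SLURM_ARRAY_TASK_ID per task; default to 1 outside SLURM.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
sample=$(pick_sample "$task_id" sample1 sample2)

# Dry run: print the command. genome_index is a placeholder path.
echo STAR --genomeDir genome_index --runThreadN 8 \
  --readFilesIn "${sample}read1.fq" "${sample}read2.fq" \
  --outFileNamePrefix "${sample}_"
```

Since every task may land on a different node, each one loads the genome itself, so this trades memory and load time for wall-clock time.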