BLAST on a multiprocessor computer
9.8 years ago

A short question on the local BLAST algorithm. In multi-threaded mode, is it better/faster to run the program on a cluster of multiple nodes (let's say 4 nodes, each with a 16-thread processor) or on a single multiprocessor motherboard (also with 4 processors of the same type, for example)? I understand from various discussions that tools like GNU parallel fully parallelize the processes in the latter case (a single multiprocessor board), including the parts where BLAST does not run in multi-threaded mode, and can thus speed up the process substantially.

All your opinions are very welcome.

sequencing sequence blast • 3.1k views
9.8 years ago
lelle ▴ 830

I seem to remember that I read somewhere that blast itself does not speed up significantly above 8 processors.

What I have seen most people do is split up the input file (assuming you have many query sequences) and start multiple BLAST jobs with a low (<=8) number of processors per job. These jobs can also easily be distributed over multiple nodes.
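A minimal sketch of that splitting step, assuming the queries are in a plain FASTA file and that BLAST+ `blastn` is the program being run. The function names (`split_fasta`, `blast_command`) and the chunk file names are hypothetical, chosen only for illustration; the `-query`, `-db`, `-out` and `-num_threads` flags are standard BLAST+ options.

```python
def read_fasta_records(text):
    """Yield each FASTA record (header plus sequence lines) as one string."""
    record = []
    for line in text.splitlines():
        # A new header closes the record collected so far
        if line.startswith(">") and record:
            yield "\n".join(record) + "\n"
            record = []
        if line.strip():
            record.append(line)
    if record:
        yield "\n".join(record) + "\n"


def split_fasta(text, n_chunks):
    """Distribute records round-robin into at most n_chunks FASTA strings."""
    chunks = [[] for _ in range(n_chunks)]
    for i, rec in enumerate(read_fasta_records(text)):
        chunks[i % n_chunks].append(rec)
    # Drop empty chunks when there are fewer records than chunks
    return ["".join(c) for c in chunks if c]


def blast_command(query_path, db, threads=2):
    """Build one blastn command line for a chunk (hypothetical helper)."""
    return [
        "blastn",
        "-query", str(query_path),
        "-db", db,
        "-out", f"{query_path}.out",
        "-num_threads", str(threads),
    ]
```

Each chunk can then be written to its own file and the resulting command submitted as a separate job, on the same node or on different ones. Round-robin assignment is used here only because it tends to balance chunk sizes when sequence lengths vary along the file.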


Agreed, but change that 8 down to one or two. BLAST doesn't really parallelize well because it's bottlenecked at the memory bus and I/O. You can fire up multiple sub-jobs manually and see how much speedup your system actually delivers.
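One rough way to run that experiment is to time the same list of sub-jobs at several concurrency levels. This is only a sketch: `time_concurrent_jobs` and `sweep` are hypothetical names, the commands are assumed to be argument lists such as blastn invocations on pre-split query files, and the default runner does not check exit codes.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def time_concurrent_jobs(commands, n_parallel, run=subprocess.run):
    """Run all commands, at most n_parallel at once; return wall-clock seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_parallel) as pool:
        # list() waits for every job and re-raises any exception from a worker
        list(pool.map(run, commands))
    return time.perf_counter() - start


def sweep(commands, levels=(1, 2, 4, 8), run=subprocess.run):
    """Time the same job list at each concurrency level in `levels`."""
    return {n: time_concurrent_jobs(commands, n, run) for n in levels}
```

If the wall-clock time stops dropping (or rises) beyond a certain level, that level is roughly where the node saturates. Results depend on disk caching, so it is worth repeating each measurement rather than trusting a single run.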


Thank you. So a multiprocessor motherboard, with each processor having its own RAM, does not necessarily outperform a small cluster of a few nodes for BLAST. In theory, I had pictured a motherboard with 4 processors (each having, let's say, 8-16 GB RAM) as something like a "cluster" of 4 nodes (each having 1 CPU and 8-16 GB RAM) to which jobs could be submitted in parallel (using the right software), with all communication being faster since it happens on one board. Was I completely off?


You're right that communication could be faster, but these BLAST threads are not communicating. Whether this setup helps is problem specific: if the subtasks needed to communicate, you could get a bonus from being local. These BLAST threads are entirely separate and instead interfere with each other's memory access requests, so each node will support only a few jobs. I've seen the best results using 2 cores on each of 20 nodes, versus poor results using 2 nodes of 20 cores each. This might also be related to reading and writing large files.
