Hi Everyone,
I am new to this field and am looking for a good, stable parallel implementation of BLAST. I am trying this one: http://salsahpc.indiana.edu/tutorial/hadoopblast.html
However, I found that this program simply assigns one task per query input file, which makes me wonder what the point of using Hadoop is.
There is also http://archimedes.cheme.cmu.edu/?q=gpublast, but it only works with blastp, and there is a scathing rebuttal of the GPU approach here: https://larsjuhljensen.wordpress.com/2011/01/28/commentary-the-gpu-computing-fallacy/.
So, is there a good and stable parallel implementation of BLAST? My employer is going to set up a computer cluster, and we might want to run BLAST there, but by its nature a cluster is only beneficial if the program can actually run in parallel across its nodes.
Thank you very much!
Pedantic comment: you need to distinguish between "parallel" and "distributed". Basic NCBI BLAST is already parallel on shared-memory systems and has been for as long as I can remember. (The GPU implementation will also only work on shared memory systems.) If you want parallelization across a cluster, splitting up the input by query sequence is almost certainly going to be the most efficient method (and the most common one).
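For the shared-memory case, here is a minimal sketch of what I mean, assuming the BLAST+ tools; the query file, database name, and thread count are just placeholders:

```python
# Minimal sketch: a single BLAST+ process using several cores on one node.
# "queries.fasta" and "mydb" are hypothetical; adjust to your own data.
import subprocess

subprocess.run(
    ["blastp",
     "-query", "queries.fasta",   # hypothetical query file
     "-db", "mydb",               # hypothetical BLAST database
     "-num_threads", "8",         # threads on one shared-memory node
     "-outfmt", "6",              # tabular output
     "-out", "hits.tsv"],
    check=True,
)
```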
I would like to add that while BLAST itself has a parallelization option, I find that it rarely uses the full resources available; it often runs on only one CPU instead of the number of CPUs specified. If I remember correctly, this is due to the way the search is performed.
I strongly recommend splitting the input data, as this allows the best parallelization (either shared-memory or distributed). Incidentally, this is exactly the kind of workload the Hadoop MapReduce framework was developed for...
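As a concrete illustration of splitting the input, here is a minimal sketch in Python, without Hadoop: split the query FASTA into chunks and run one independent BLAST process per chunk. The file names, database name, chunk count, and the choice of blastp are assumptions for the example, and it assumes the BLAST+ binaries are on your PATH.

```python
# Minimal sketch: split a FASTA query file into chunks and BLAST each chunk
# in its own process, then concatenate the tabular results.
import subprocess
from multiprocessing import Pool

def split_fasta(path, n_chunks):
    """Split the records of a FASTA file across n_chunks smaller files."""
    records, current = [], []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">") and current:
                records.append("".join(current))
                current = []
            current.append(line)
        if current:
            records.append("".join(current))
    chunks = [records[i::n_chunks] for i in range(n_chunks)]
    chunk_paths = []
    for i, chunk in enumerate(c for c in chunks if c):   # skip empty chunks
        chunk_path = f"chunk_{i}.fasta"
        with open(chunk_path, "w") as out:
            out.writelines(chunk)
        chunk_paths.append(chunk_path)
    return chunk_paths

def run_blast(chunk_path):
    """Run blastp on one chunk; every worker is an independent BLAST process."""
    out_path = chunk_path.replace(".fasta", ".tsv")
    subprocess.run(
        ["blastp", "-query", chunk_path, "-db", "mydb",   # "mydb" is a placeholder
         "-outfmt", "6", "-out", out_path],
        check=True,
    )
    return out_path

if __name__ == "__main__":
    paths = split_fasta("queries.fasta", n_chunks=8)      # placeholder input file
    with Pool(processes=8) as pool:
        result_paths = pool.map(run_blast, paths)
    with open("all_hits.tsv", "w") as merged:             # concatenate per-chunk hits
        for path in result_paths:
            with open(path) as part:
                merged.write(part.read())
```

The same splitting strategy carries over to a cluster: instead of a local Pool, submit one BLAST job per chunk to your scheduler (or a MapReduce framework) and merge the outputs afterwards.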
And there is this: http://www.abokia.com/Products.htm
If you have the budget to purchase a commercial product, it may be an option.