Does anybody have experience running blastn (nucleotide sequence search) on a GPU? The freely and easily obtainable tools only provide blastp, which is for proteins. I need nucleotide search. Since the tool must at least search both the direct and the reverse-complement sequence, just feeding nucleotide sequences to blastp may not be the optimal choice.
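To illustrate why a protein tool cannot simply be fed nucleotides: a nucleotide search has to look at the query and at its reverse complement. Below is a minimal Python sketch of that idea using exact matching only (not BLAST's seeded alignment). The function names are made up for illustration.

```python
# Complement table for A/C/G/T/N in both cases; other IUPAC codes are left unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_both_strands(query: str, subject: str):
    """Return (strand, 0-based start) for every exact occurrence of the query.

    "+" hits match the query as given, "-" hits match its reverse complement.
    A query that equals its own reverse complement is reported on both strands.
    """
    hits = []
    for strand, pattern in (("+", query), ("-", reverse_complement(query))):
        start = subject.find(pattern)
        while start != -1:
            hits.append((strand, start))
            start = subject.find(pattern, start + 1)
    return hits
```

For example, `find_both_strands("ATG", "CCATGGG")` finds the query itself at offset 2 and its reverse complement `CAT` at offset 1.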
Apologies if this is not relevant, as I am still very new to this field and trying to get my head around it, but have you seen this paper, which provides a GPU-optimised BLASTX alternative:
Yongchao Liu, Douglas L Maskell and Bertil Schmidt, "CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units", BMC Research Notes 2009, 2:73
It builds from source and works with nucleotide sequences, and it is relatively easy to start using (it takes FASTA sequences without indexing). However, even the new 2.0.8 version seems to choke on sequences longer than roughly 64 Kb in the input database. The total length of the input database also appears to be limited.
I have implemented a loop over the sequence database so as not to feed all 10 Gb at once, and added a filter to drop all sequences over 64 Kb (this is just an evaluation anyway). With these alterations, on the mid-range GTX 560 Ti GPU it runs more or less like a single i7 3960 thread, but that CPU has six cores. To be precise, a single CPU thread needs 1 min 54 s to search for sixteen 25 bp sequences in a 1.8 Gbp database. The GPU requires 1 min 23 s for the same search. The time is more or less the same if I place the database on a RAM drive, so this is not just a limitation of the hard drive.
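For anyone wanting to try the same workaround, here is a rough Python sketch of the kind of loop and filter described above. It is not the code I actually ran. The 64,000 cut-off and the per-batch limit are placeholder assumptions, since the exact limits of the tool are not known to me, and the function names are made up.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def batches(records: Iterable[Record],
            max_seq_len: int = 64_000,          # assumed per-sequence limit
            max_batch_len: int = 100_000_000    # assumed per-run limit
            ) -> Iterator[List[Record]]:
    """Drop over-long sequences and group the rest into size-limited batches."""
    batch, total = [], 0
    for header, seq in records:
        if len(seq) > max_seq_len:
            continue  # filtered out, as in the evaluation described above
        if batch and total + len(seq) > max_batch_len:
            yield batch
            batch, total = [], 0
        batch.append((header, seq))
        total += len(seq)
    if batch:
        yield batch
```

Each batch can then be written to a temporary FASTA file and passed to the search tool in turn, for example with `for batch in batches(read_fasta(open("Rice.fasta"))): ...`.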
A high-end card (the GTX 690 has about six times more cores) would probably perform about the same as a whole i7 3960 CPU, which is also high end. The only obvious benefit seems to be that I could add one or even two GPU cards to my desktop, whereas I cannot add another CPU.
This paper describes GPU-BLAST, which unfortunately is designed for aligning protein sequences, not nucleotides.
Panagiotis D. Vouzis and Nikolaos V. Sahinidis, "GPU-BLAST: using graphics processors to accelerate protein sequence alignment", Bioinformatics 2011, 27(2):182-188 (Open Access). Available at: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3018811/
Maybe it can be used to align nucleotide sequences as well; it should not be much different. You could contact the authors and ask whether it can be used for nucleotides.
There are 4 main steps in blastn.
1. Prepare the hash table with the mask data.
2. Scan the database for hits. The -num_threads option is only useful in this step.
3. Trace back the results in the database.
4. Print the results.
The -num_threads option (multi-threading in step 2) is better than running multiple processes. With multiple processes, each process loads the database and the mask database into RAM separately.
Our G-BLASTN speeds up the scan step on the GPU and the trace-back step with SSE, and changes the framework into a pipeline so that the steps can overlap.
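As a rough illustration of what "pipeline with overlapping steps" means, here is a generic Python sketch in which every stage runs in its own thread and hands results to the next stage through a queue. This is not G-BLASTN code. The stage functions are placeholders supplied by the caller, and because of Python's global interpreter lock it only shows the structure, not a real speed-up for CPU-bound work.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def run_pipeline(items, stages):
    """Push items through the stages; stage k+1 starts before stage k finishes.

    Each stage is a one-argument function. Output order matches input order
    because every stage is a single thread reading from a FIFO queue.
    A stage that raises would stall this sketch; real code needs error handling.
    """
    queues = [queue.Queue(maxsize=4) for _ in range(len(stages) + 1)]

    def feeder():
        for item in items:
            queues[0].put(item)
        queues[0].put(_DONE)

    def worker(stage, q_in, q_out):
        while True:
            item = q_in.get()
            if item is _DONE:
                q_out.put(_DONE)
                return
            q_out.put(stage(item))

    threads = [threading.Thread(target=feeder)]
    threads += [threading.Thread(target=worker, args=(s, queues[i], queues[i + 1]))
                for i, s in enumerate(stages)]
    for t in threads:
        t.start()

    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

With hypothetical stage functions this would be called as `run_pipeline(query_batches, [scan, trace_back, format_output])`, so that one batch can be traced back while the next one is still being scanned.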
Legacy NCBI BLAST and NCBI BLAST+ do not support GPUs. The threading options ('-a' for legacy NCBI BLAST and '-num_threads' for NCBI BLAST+) refer to CPU-based threads, and must be used in order for multi-threaded searches to be performed.
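For reference, a small Python sketch of how a multi-threaded BLAST+ run could be assembled. It only builds the argument list and does not run anything. The file names and the helper name are assumptions for illustration, and the database is assumed to have been built beforehand with makeblastdb.

```python
import os


def blastn_command(query, db, threads=None, out="hits.tsv"):
    """Build the argument list for a multi-threaded NCBI BLAST+ blastn search.

    -num_threads sets the number of CPU threads; -outfmt 6 selects tabular output.
    If threads is None, the number of CPUs reported by the OS is used.
    """
    if threads is None:
        threads = os.cpu_count() or 1
    return ["blastn",
            "-query", query,
            "-db", db,
            "-num_threads", str(threads),
            "-outfmt", "6",
            "-out", out]
```

If BLAST+ is installed, the list can be passed to `subprocess.run(blastn_command("q.fasta", "Rice", threads=6), check=True)`.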
It builds, but does not seem to run for me. So:
ghostm db -i Rice.fasta -o Rice
ghostm qry -i q.fasta -t d -o query
ghostm aln -i query -d Rice -o /dev/stdout -v
(Rice.fasta and q.fasta are the nucleotide database and query in FASTA format, and q has direct matches in Rice.fasta, verified with grep.) No output is ever produced. What is more, if I also specify -D 0 (GPU device ID), it always fails with "error: Out of GPU memory" (the card has 1 GB, so it should not run out that soon). It also fails with exactly the same message if I specify the ID of a non-existent GPU device.
If the authors/supporters could post any comments, that would be great.