How To Blast A Sequence Against Multiple Databases
6
7
13.2 years ago
Manju ▴ 50

Hello,

I have downloaded all the chromosomes of Bos taurus and converted them to BLAST format using makeblastdb, and now I want to BLAST my sequence locally against all of these chromosomes. I now have 29 databases. Is there any way to BLAST my sequence against all 29 databases from my program?

What should I put for 'database'?

@params = ('database' => '????????', 'outfile' => 'blast2.out', '_READMETHOD' => 'Blast', 'prog'=> 'blastn');

Thanks,
Manju Rawat

blast • 18k views
0

Hi,

This is a fairly old topic, but my question is closely related to it. As you may know, NCBI provides the taxdb files along with its nr, nt, 16S, ITS and 18S databases. These are not text files (extensions: .bti and .btd), so I can't just cat them and merge them together. Is there any way to use several of these databases together with their associated taxdb files? I couldn't find a way to combine them. Any help would be appreciated.

Thank you

-Berkay

edit: SOLVED

0

The taxdb files are common to all databases. Does that address your question?
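For anyone arriving with the same question, here is a minimal sketch of what that means in practice. It assumes the taxdb archive has been unpacked into the same directory as the databases and that BLAST+ finds that directory through the BLASTDB environment variable. The directory path, the database names, query.fa and the helper has_taxdb are placeholders for this example, not part of the original posts.

```shell
# Assumption: all databases and the unpacked taxdb files live in one
# directory (hypothetical default below), which BLAST+ finds via $BLASTDB.
BLASTDB="${BLASTDB:-$HOME/blastdb}"
export BLASTDB

# Check that both taxdb files are present in a directory.
has_taxdb() {
    [ -f "$1/taxdb.bti" ] && [ -f "$1/taxdb.btd" ]
}

# A single copy of taxdb serves every database named in -db.
if has_taxdb "$BLASTDB" && command -v blastn >/dev/null 2>&1 && [ -f query.fa ]; then
    blastn -db "nt 16S_ribosomal_RNA" -query query.fa \
        -outfmt "6 qseqid sseqid pident evalue staxids sscinames" \
        -out hits_with_taxonomy.tsv
fi
```

The search is skipped unless the taxdb files, blastn and the query file are all present, so the snippet is safe to paste and adapt.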

0

Thank you for the answer. I actually found this out right after writing the question here. I couldn't see where the "remove the comment" option was; at the very least I should have edited the comment to mark it as solved. Again, thank you.

11
13.0 years ago
Torst ▴ 980

There is no need to create custom single databases (real or alias).

I'm pretty sure the BLAST "-d" and BLAST+ "-db" options accept multiple SPACE-SEPARATED databases on one command line. You just have to make sure you quote them in your shell, or within a Perl string, e.g.

blastall -p blastp -d "nr sprot trembl" -i q.fa
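For the 29 chromosome databases in the question, the same idea with BLAST+ might look like the sketch below. It assumes the databases are named chr1 to chr29 and sit in the current directory, and that query.fa is the query file; those names are guesses for illustration, since the original post does not give them.

```shell
# Assumption: databases named chr1 ... chr29 were built with makeblastdb
# in the current directory; query.fa is a hypothetical query file.
dbs=""
for i in $(seq 1 29); do
    dbs="$dbs chr$i"
done
dbs="${dbs# }"   # drop the leading space

# Run the search only if BLAST+ is installed and the query file exists.
if command -v blastn >/dev/null 2>&1 && [ -f query.fa ]; then
    blastn -db "$dbs" -query query.fa -out blast2.out
fi
```

The quotes around "$dbs" matter: they pass the whole space-separated list to -db as a single argument.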
9
13.2 years ago
Digiomics ▴ 170

You could also use the blastdb_aliastool included in the NCBI blast package to aggregate your BLAST databases to a single virtual database.
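A minimal sketch of that approach for the chromosomes in the question, assuming the nucleotide databases are named chr1 to chr29 and live in the current directory. The alias name bos_taurus_all and the title are made up for this example.

```shell
# Assumption: nucleotide databases chr1 ... chr29 already exist in the
# current directory; the alias name bos_taurus_all is hypothetical.
dbs=$(printf 'chr%s ' $(seq 1 29))
dbs="${dbs% }"   # drop the trailing space

if command -v blastdb_aliastool >/dev/null 2>&1 && [ -f chr1.nsq ]; then
    # Writes bos_taurus_all.nal, which points at the 29 existing databases.
    blastdb_aliastool -dblist "$dbs" -dbtype nucl \
        -out bos_taurus_all -title "Bos taurus, all chromosomes"
    # The alias is then used like any single database:
    # blastn -db bos_taurus_all -query query.fa -out blast2.out
fi
```

No sequence data is copied; the alias only references the databases that are already on disk.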

8
13.2 years ago
Chris ★ 1.6k

Why not create a single database by merging the chromosomes into one big FASTA file and then formatting it?

Not sure what you mean by this: @params = ('database' => '????????', 'outfile' => 'blast2.out', '_READMETHOD' => 'Blast', 'prog'=> 'blastn');

2

This answer is bad. Why is everyone upvoting it? It is much more work and may not even be possible in some cases.

1

I don't see how this approach is either bad or much more work. The objective of the OP is to have a single database. Chris proposes to merge the fasta files and create one database while you suggest to combine existing databases. In this case, the OP had access to the original fasta files, which a simple 'cat' command would have joined and then he could have used a command he already knew about and had used to create his databases. Clearly, both solutions seem workable. Your solution also happens to be the second most upvoted solution, which has been there for over two years. It seems to me that there is no need to call any answer bad and to question the judgement of multiple users over the course of 2.3 years. Even less on the first day you joined this forum ;)

1

The answer is bad. The user asked how to blast against 29 databases. "Is there any method by which I am able to blast my sequence against all 29 databases in my program?"

The answer "go compile a new database" is an indirect work around to the problem. It may even be good advice in this particular instance - but it did not answer the question asked.

Why don't I want to do that? Because it's moronic to create multiple permutations of databases when I already have them compiled. This means my usage of disk space balloons every time I do a search. Not to mention it is annoying and time consuming.

I came here with the exact same question. Thankfully there are several good options provided by others below.

0

If you have limited RAM (4 GB), it is sometimes not possible to concatenate all the chromosome FASTA files and create the database in one go. I just tried this on a laptop with 4 GB of RAM and the wheat genome (17 Gb), and it stalled the laptop. Creating databases for the individual chromosomes and then combining them with blastdb_aliastool is a much better option in these cases.
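A rough sketch of that workflow, for illustration only. It assumes one FASTA file per chromosome and BLAST+ on the PATH; the helper names db_name and build_per_chromosome, the alias name genome_all and the file layout are all invented for this example.

```shell
# Derive a database name from a FASTA path: data/chr1.fa -> chr1
# (helper name and file layout are assumptions for this sketch).
db_name() {
    name=$(basename "$1")
    printf '%s\n' "${name%.*}"
}

# Format each chromosome on its own, one file at a time, then tie the
# resulting databases together with an alias.
build_per_chromosome() {
    dblist=""
    for fasta in "$@"; do
        db=$(db_name "$fasta")
        makeblastdb -in "$fasta" -dbtype nucl -out "$db" || return 1
        dblist="$dblist $db"
    done
    blastdb_aliastool -dblist "${dblist# }" -dbtype nucl \
        -out genome_all -title "All chromosomes"
}

# Usage (requires BLAST+): build_per_chromosome chr*.fa
```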

0

It looks suspiciously like he's trying to run blast via a perl script.

0

Ah, right, now I remember. The '@'-construct is a hint... Well, since I'm in Python I do not come across such things very often. ;)

4
13.0 years ago

You can easily create the alias file yourself:

example:

cd myblastdbs/
gedit myOwnblastDB.nal

then follow this template (the GenBank nt.nal file is a real-world example):

#
# myOwnblastDB.nal is an alias file for all my chromosomes
#
#
TITLE all my chromosomes
#
DBLIST chr1 chr2 chr3 chr4 chri chr24
#
#GILIST
#
#OIDLIST
#

It is very simple and works very well!
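If you would rather not type the list by hand, the same file can be generated from the shell. This is only a sketch: it assumes the databases are called chr1 to chr29 and sit in the directory where the alias file is written, and the function name write_alias is made up for the example.

```shell
# Assumption: databases chr1 ... chr29 sit in the directory where the
# alias file is written; the function name is hypothetical.
write_alias() {
    out="$1"; shift
    {
        printf '#\n# Alias file created by hand\n#\n'
        printf 'TITLE all my chromosomes\n'
        printf 'DBLIST %s\n' "$*"
    } > "$out"
}

write_alias myOwnblastDB.nal $(printf 'chr%s ' $(seq 1 29))
# Search it by its name without the extension, e.g.:
# blastn -db myOwnblastDB -query query.fa -out blast2.out
```

The .nal extension is for nucleotide databases; a protein alias would use .pal instead.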

1
10.9 years ago
axa9070 ▴ 30

http://www.ncbi.nlm.nih.gov/books/NBK1763/#CmdLineAppsManual.Aggregate_existing_BLA

To combine the two nematode nucleotide databases, named "nematode_mrna" and "nematode_genomic", we use the following command line:

$ blastdb_aliastool -dblist "nematode_mrna nematode_genomic" -dbtype nucl \
  -out nematode_all -title "Nematode RefSeq mRNA + Genomic"

0
13.2 years ago

If you place your 29 chromosomes in the same fasta file, you should be able to create your single database using the makeblastdb program. Just use blastn then to blast on the database containing all your chromosomes.
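As a sketch, assuming one FASTA file per chromosome (the file names, the output name bos_taurus_all and the function name build_merged_db are invented for this example):

```shell
# Concatenate FASTA files into one file and, if BLAST+ is installed,
# format it as a single nucleotide database.
# (Function name and output names are assumptions for this sketch.)
build_merged_db() {
    out="$1"; shift
    cat "$@" > "$out.fa" || return 1
    if command -v makeblastdb >/dev/null 2>&1; then
        makeblastdb -in "$out.fa" -dbtype nucl -out "$out"
    fi
}

# Usage: build_merged_db bos_taurus_all chr*.fa
#        blastn -db bos_taurus_all -query query.fa -out blast2.out
```

Unlike the alias approaches, this writes a second copy of the sequence data to disk.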
