Hello!
I recently started learning to use Unix for bioinformatics, and I am having trouble running blastp against local databases I created.
My question is, is there a way to run blastp over multiple databases in the same run? My database is too large to be contained in one fasta file and the command only works for me if I specify a single fasta file as the database. If I write the following command I get an error:
blastp -db nr.*.fa -query At.fsa -out blastresults.out
Error: Too many positional arguments (1), the offending value: nr.AAXJ.fa
Error: (CArgException::eSynopsis) Too many positional arguments (1), the offending value: nr.AAXJ.fa
Thank you, any help would be appreciated, Lisa.
EDIT: I should also mention I have over 1500 databases.
You need to create BLAST indexes from the FASTA sequence files before you can run blastp. You can concatenate the 1500 files (I assume that is what you mean by "databases") into a single multi-FASTA file with
cat nr*.fa > concatenated.fa
Create the BLAST indexes with the command
makeblastdb -in concatenated.fa -parse_seqids -dbtype prot -out mydata
Then run the search with
blastp -db mydata -query At.fsa -out blastresults.out
You will find the command line manual for BLAST+ a useful reference for the whole process. Add further options and directory paths to the commands above as needed.
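For anyone who wants to try the steps above end to end, here is a rough sketch on toy data. The three nr.*.fa files, their sequences, and the scratch directory are made-up stand-ins, not your real data. The script also shows why the original command failed: the shell expands nr.*.fa into several file names before blastp starts, so -db only gets the first one and the rest arrive as stray positional arguments. The makeblastdb and blastp steps are only attempted if BLAST+ is on your PATH.

```shell
set -eu

# Scratch directory with three tiny stand-in FASTA files
# (hypothetical names and sequences, for illustration only).
workdir=$(mktemp -d)
cd "$workdir"
printf '>seqA\nMKTAYIAKQR\n' > nr.AAAA.fa
printf '>seqB\nMSDNELLQKA\n' > nr.AAAB.fa
printf '>seqC\nMAHHHHHHVG\n' > nr.AAAC.fa
printf '>query1\nMKTAYIAKQR\n' > At.fsa

# The shell expands the glob before the program runs, so "-db nr.*.fa"
# turns into "-db" followed by three separate file names.
set -- -db nr.*.fa -query At.fsa
n_args=$#
echo "blastp would receive $n_args arguments: $*"

# Step 1: concatenate all pieces into one multi-FASTA file.
cat nr.*.fa > concatenated.fa

# Sanity check: the number of header lines should be the same
# in the inputs and in the concatenated file.
n_in=$(cat nr.*.fa | grep -c '^>')
n_out=$(grep -c '^>' concatenated.fa)
echo "input records: $n_in, concatenated records: $n_out"

# Steps 2 and 3: build the database and search it, if BLAST+ is installed.
if command -v makeblastdb >/dev/null 2>&1 && command -v blastp >/dev/null 2>&1; then
    makeblastdb -in concatenated.fa -parse_seqids -dbtype prot -out mydata
    blastp -db mydata -query At.fsa -out blastresults.out
else
    echo "BLAST+ not found; skipping makeblastdb and blastp"
fi
```

On the real data, the record-count check is worth keeping: if the two numbers differ, one of the input files was probably missed by the glob or is missing a final newline.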