I have a folder with multiple blast databases. I want to run blastn over all databases and produce one output for each database.
I'm trying something like this:
for i in `find . -name 'name_of_database'`; do
time blastn -db "$i" -query Sondas_100.fasta -out "$i".out -outfmt 7 -num_threads 16 -dust yes -ungapped
done
But this searches for a file name, and blast database names are aliases rather than files.
By "aliases" I mean that the name of a blast database is not a file, but a name that stands for a set of files.
Example: makeblastdb produces 3 files named T1P1T0.nhr, T1P1T0.nsq and T1P1T0.nal, but the database name to pass to blastn is just T1P1T0, without any extension.
Because my databases follow the naming structure T"x"P"x"_T"x", where x is a number from 1 to 4,
I generated all the names and passed them to the blastn command:
#!/bin/bash
for T in $(seq 1 4); do
    for P in $(seq 1 4); do
        for t in $(seq 0 3); do
            # ${T}, ${P} and ${t} are expanded inside one quoted string, e.g. T1P1_T0_R1
            time blastn -db "/vault2/homehpc/jmalagont/dllopezr/Shotgun_Seq/Trimmed_Seqs/FastaSeqs12/T${T}P${P}_T${t}_R1" -query Sondas_100.fasta -out "T${T}P${P}_T${t}_R2.out" -outfmt 7 -num_threads 16 -dust yes -ungapped
        done
    done
done
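The same set of names can also be generated without nested loops, using bash brace expansion. This is only a sketch: `db_names` and `run_named` are hypothetical helper names, `db_dir` stands for whatever directory holds the databases, and it assumes every database ends in `_R1` as in the command above. Unlike the loop above, it names each output after the database itself.

```shell
#!/bin/bash
# Print every database name of the form T<x>P<x>_T<x>_R1, one per line.
# Brace expansion varies the rightmost range fastest:
# T1P1_T0_R1, T1P1_T1_R1, ...
db_names() {
    printf '%s\n' T{1..4}P{1..4}_T{0..3}_R1
}

# Run blastn once per generated name, one output file per database.
# db_dir (hypothetical argument) is the directory holding the databases.
run_named() {
    local db_dir="$1" name
    db_names | while IFS= read -r name; do
        blastn -db "$db_dir/$name" -query Sondas_100.fasta -out "$name.out" \
               -outfmt 7 -num_threads 16 -dust yes -ungapped
    done
}
```

Calling `db_names` on its own prints the 64 names, which is a quick way to check them before running `run_named /path/to/databases`.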
All you need to do is strip the extension off the result of your find command:
e.g.
for i in $(find . -name 'name_of_database.nhr') ; do
    # Strip the extension to get the database name that blastn expects
    database="${i%.*}"
    time blastn -db "$database" -query Sondas_100.fasta -out "$database".out -outfmt 7 -num_threads 16 -dust yes -ungapped
done
Including the extension in the find pattern ensures that each database is found only once, rather than once per related file. Stripping the extension then gives the path to pass to blastn, which should correspond to the base name of the database.
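If the folder holds many databases, the same idea can be wrapped so that it picks up every database and copes with spaces in paths. This is a sketch under assumptions, not a tested pipeline: `list_blast_dbs` and `run_all` are hypothetical helper names, and it assumes each database has exactly one `.nhr` file. For multi-volume databases, the `.nal` alias file is probably the better marker, so the extension is left as a parameter.

```shell
#!/bin/bash
# List blast database base names under a directory by finding one marker
# file per database (default extension: nhr) and stripping that extension.
list_blast_dbs() {
    local dir="${1:-.}" ext="${2:-nhr}"
    find "$dir" -type f -name "*.${ext}" -print0 |
        while IFS= read -r -d '' f; do
            printf '%s\n' "${f%."$ext"}"
        done
}

# Run blastn once per database found, writing <database>.out next to it.
# Assumes paths do not contain newlines.
run_all() {
    local dir="$1" query="$2" db
    list_blast_dbs "$dir" | while IFS= read -r db; do
        blastn -db "$db" -query "$query" -out "${db}.out" \
               -outfmt 7 -num_threads 16 -dust yes -ungapped
    done
}
```

For the case in the question, this would be called as `run_all . Sondas_100.fasta`. Running `list_blast_dbs .` first shows which database names would be passed to blastn.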
Do you need one output per DB, or would a single output over all DBs do as well?
What exactly do you mean by 'aliases'?
Hi Lieven
By "aliases" I mean that the name of a blast database is not a file, but a name that stands for a set of files.
Example: makeblastdb produces 3 files named T1P1T0.nhr, T1P1T0.nsq and T1P1T0.nal, but the database name to pass to blastn is just T1P1T0, without any extension.
And yes, I want one output for each database.