Hello,
I attempted to conduct a BLASTn analysis on multiple FASTA files using a for loop, but instead of generating individual output files for each queried FASTA file, it produced a single output file. Can you please help me with what I must change in my code to get the desired outcome?
Here's my code:
#!/bin/bash
#SBATCH -J rep.euk.genomes
#SBATCH -N 1
#SBATCH --ntasks-per-node=64
#SBATCH -o %x.%j.out
#SBATCH -e %x.%j.err
#SBATCH -p nocona
#SBATCH --export=ALL
export BLASTDB=$BLASTDB:/lustre/research/phillips/rep_euk_genomes/db/
for f in *.fasta
do
outfile=blastn.${f}
/lustre/work/sneha/software/ncbi-blast-2.11.0+/bin/blastn \
    -query ${f} \
    -db /lustre/research/phillips/rep_euk_genomes/db/ref_euk_rep_genomes \
    -max_target_seqs 5 \
    -max_hsps 1 \
    -out ${outfile} \
    -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore sscinames scomnames"
done
I used the code you suggested but I am still getting the same output file (only for the first FASTA file in the query). I also tried the script below:
I get this error while running this script:
Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.
No, you did not. Neither of the changes suggested by Michael is in your code.
Also, why on earth does your SLURM batch file use command line parameters? Are you sure you're using them properly?
What is $B in $B/bin/blastn?
I assume the command line argument is for running sbatch in a loop to submit multiple jobs like so:
for f in *.fasta ; do sbatch run_it.sh $f; done
Depending on the scheduling policy of the cluster this may be much faster than the original code.
That's not an option with slurm. One will need to export a BASH env variable or use a HEREDOC within a script as mentioned here: https://stackoverflow.com/a/44168719/1394178
EDIT: I was wrong. SLURM scripts work fine with command line arguments.
I have done this several times on Saga and LUMI, they both use Slurm and it worked for me. In my understanding parameters can be passed to job scripts just fine.
Yes, you are correct. I dug deeper after I wrote my comment and discovered I was wrong.
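For anyone landing here later, a minimal sketch of the one-job-per-file pattern discussed in the comments above. The script name run_it.sh is taken from the comment above; the function name blast_one, the DB placeholder path and the DRY_RUN switch are hypothetical additions for illustration only, and blastn is assumed to be on PATH.

```shell
#!/bin/bash
#SBATCH -J blastn.single
#SBATCH -N 1
#SBATCH -o %x.%j.out
#SBATCH -e %x.%j.err

# Hypothetical database location (an assumption); replace with your own.
DB="${DB:-/path/to/db/ref_euk_rep_genomes}"

# Run blastn for one query file, or only print the command when DRY_RUN=1.
blast_one() {
    local infile="$1"
    local resultfile="${infile}.blastn"
    # Build the command as the function's positional parameters so the
    # same argument list can be either printed or executed.
    set -- blastn -query "$infile" -db "$DB" \
        -max_target_seqs 5 -max_hsps 1 -outfmt 6 -out "$resultfile"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Arguments given after the script name on the sbatch command line
# arrive here as $1, $2, ... just like in any other shell script.
if [ "$#" -ge 1 ]; then
    blast_one "$1"
fi
```

Submitted as in the comment above (for f in *.fasta ; do sbatch run_it.sh "$f"; done), each job would get its own query file and write its own result file. Setting DRY_RUN=1 on a single file first is one way to check the command before spending cluster time.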
should be
This is a shell command and not a SLURM parameter. Using it with # just comments it out. Try the following code on one FASTA file first and see how that works by looking at the .out and .err files generated.
Not sure why a > is being used here instead of using -out blastn_out_rep_genomes/${resultfile}.
When I run this code I get this error:
How are you running it? The error message is pretty clear BTW
You have a superfluous -outfmt that causes the error. This looks like an attempt to run a single file per sbatch job in a for loop. You should still change resultfile=blastn.${infile} to resultfile=${infile}.blastn to avoid problems in the future.
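One concrete problem the suffix naming avoids: a result called blastn.sample.fasta still ends in .fasta, so rerunning a for f in *.fasta loop in the same directory would pick up old results as queries. A small sketch of the difference; the helper name result_name is hypothetical and not from the thread.

```shell
# Hypothetical helper: derive the result file name from a query file name.
# basename drops any leading directory, so the name is safe to place
# inside a separate output directory.
result_name() {
    printf '%s.blastn\n' "$(basename "$1")"
}

# Check which naming scheme would be matched again by the *.fasta glob.
for name in blastn.sample.fasta "$(result_name data/sample.fasta)"; do
    case "$name" in
        *.fasta) echo "$name: matched by *.fasta" ;;
        *)       echo "$name: not matched by *.fasta" ;;
    esac
done
```

The first name is matched by the glob and the second is not, which is why the suffix form is the safer default when queries and results share a directory.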