Hello everyone,
I have two different folders,
- in folder1 I have 4000 FASTA sequences.
- in folder2 I have a BLAST database for each sequence, which means 3 files per sequence.
I want to run blastp for each FASTA file in folder1 against each database in folder2.
I tried a for loop and an if statement, but neither worked for me.
The script I tried is given below:
#!/bin/bash
blastp -query "path to folder1/$1.fasta" -db "path to folder2/${1}_db" -out test
This gives me the result for fasta 1 with database 1.
I want the results for fasta 1 against all databases, then fasta 2 against all databases, and so on.
What would be a better way?
Have you thought of using two nested for loops? The outer loop goes through the samples one at a time. The inner one does the same for the BLAST databases.
Hi, no, I have not tried that. Can you please give me an example?
Thank you
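A minimal sketch of the nested loops suggested above, assuming the layout described in the question: each query is folder1/<name>.fasta, and each protein database in folder2 is called <name>_db and has a <name>_db.psq file (one of the three files makeblastdb writes for a protein database). The directory names, the blast_out output folder and the function name run_all_vs_all are placeholders for illustration, not taken from the original post.

```shell
#!/bin/bash
# Hypothetical locations; replace with your real paths.
query_dir="folder1"
db_dir="folder2"
out_dir="blast_out"

run_all_vs_all() {
    mkdir -p "$out_dir"
    # Outer loop: one query FASTA at a time.
    for query in "$query_dir"/*.fasta; do
        [ -e "$query" ] || continue            # skip if the glob matched nothing
        q=$(basename "$query" .fasta)
        # Inner loop: every database, found via its .psq file (assumed naming).
        for psq in "$db_dir"/*_db.psq; do
            [ -e "$psq" ] || continue
            db="${psq%.psq}"                   # blastp wants the name without extension
            d=$(basename "$db")
            # One output file per query/database pair, so nothing is overwritten.
            blastp -query "$query" -db "$db" -out "$out_dir/${q}_vs_${d}.txt"
        done
    done
}

# Only run when the query folder actually exists.
if [ -d "$query_dir" ]; then
    run_all_vs_all
fi
```

Note that 4000 queries against 4000 databases means 16 million blastp runs, so it is worth trying this on a handful of files first.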
Is this still your previous question, framed in a different way again? Did you read and understand what was commented there? Amino Acid identity matrix help
No, it's not the same question. It is a different query.
Well, it looks like it is about the same 4000 proteins. Anyway, why don't you simply concatenate all the sequences into one big BLAST database and then blast it against itself? Then you get the complete output in one run. No need for loops here.
Yes, because I am working on the same dataset, but my question here is different. Here I am asking how to loop the command that I want to run.
Ok, no need for a loop...
If I run the entire protein FASTA against the entire database, it will give me only the top 500 hits. That is a limitation: I want all the hits, not just the top 500.
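On the 500-hit limit: blastp keeps at most 500 target sequences per query by default, and the -max_target_seqs option raises that cap. Below is a rough sketch of the single-database approach suggested above, assuming the queries sit in folder1 as *.fasta files. The file names, the function name self_blast and the cap of 5000 are placeholders chosen for illustration. The cap only needs to be at least the number of sequences in the database.

```shell
#!/bin/bash
# Hypothetical names; adjust to your own layout.
query_dir="folder1"
combined="all_proteins.fasta"
db_name="all_proteins_db"
out_file="all_vs_all.tsv"
max_hits=5000        # assumed cap, larger than the 4000 sequences in the database

self_blast() {
    # Concatenate every query FASTA into one file.
    cat "$query_dir"/*.fasta > "$combined"
    # Build a single protein database from the combined file.
    makeblastdb -in "$combined" -dbtype prot -out "$db_name"
    # Blast all sequences against the combined database; tabular output (-outfmt 6),
    # with the per-query hit cap raised from the default of 500.
    blastp -query "$combined" -db "$db_name" -outfmt 6 \
           -max_target_seqs "$max_hits" -out "$out_file"
}

# Only run when the query folder actually exists.
if [ -d "$query_dir" ]; then
    self_blast
fi
```

The e-value threshold (-evalue, default 10) still filters hits, so "all hits" here means all hits that pass that threshold.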
What does this mean?
I guess OP means the three files you get when you make a BLAST database from a FASTA file:
Yes, these three for each.