To do blastp iteratively using terminal script
7.7 years ago
sinumolg ▴ 10

I have one input file containing 200 FASTA query sequences. I want to run standalone BLAST on each sequence separately, and I expect each BLAST output file to be named after the FASTA header of its query. How can I do this with a Linux terminal script? Can anyone help?

blast sequence next-gen R • 1.4k views
7.7 years ago

Split the file with seqkit split (download, usage), then run BLAST on each resulting file.

seqkit split -i --id-regexp '(.+)' -O output_dir input_seq.fa

The file names will have the form input_seq.id_cel-let-7 MI0000001.fa (the whole header is used as the ID), so strip the prefix input_seq.id_:

cd output_dir; 
rename "input_seq.id_"  ""  *.fa
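
The commands above only split and rename the files; the BLAST step is left to the reader. A minimal sketch of that step follows, under stated assumptions: the function name blast_each is hypothetical, blastp is chosen because the thread title asks for it, and my.db stands in for whatever database you built with makeblastdb. Neither the program nor the database name comes from this answer.

```shell
# Sketch: run one BLAST search per split FASTA file.
# Assumptions (not from the answer): program name, database name, .blast suffix.
BLAST_BIN="${BLAST_BIN:-blastp}"   # assumed program; override for blastn etc.
BLAST_DB="${BLAST_DB:-my.db}"      # hypothetical database name

blast_each() {
    # $1: directory holding the per-sequence *.fa files
    local dir="$1" fa
    for fa in "$dir"/*.fa; do
        # If nothing matched, the pattern is left as-is; skip it.
        [ -e "$fa" ] || continue
        # Quoting keeps file names with spaces intact.
        # The result is written next to the input: NAME.fa -> NAME.blast
        "$BLAST_BIN" -db "$BLAST_DB" -query "$fa" -out "${fa%.fa}.blast"
    done
}
```

After loading the function, something like `blast_each output_dir` would leave one .blast file per query, named after the header that seqkit used for the file name.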
7.7 years ago

(not tested)

Linearize the FASTA, then loop over the records:

awk '/^>/ {printf("%s%s\n",(N>0?"\n":""),substr($0,2));N++;next;} {printf("%s",$0);} END {printf("\n");}' inputfasta | while read -r F; do echo ">$F" > tmp.fa ; read -r S ; echo "$S" | fold -w 60 >> tmp.fa ; blastn -db my.db -query tmp.fa > "$F.blast" ; rm tmp.fa; done
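
The same linearize-and-loop idea may be easier to adapt when written as a function. The sketch below is one possible reading of it, not the author's tested method: the function name blast_per_record, the choice to name outputs after the first word of the header, and the replacement of unusual characters with underscores are all assumptions added here. BLAST_BIN defaults to blastn as in the one-liner; set it to blastp for protein queries. The sequence is written on a single line rather than folded to 60 columns.

```shell
# Sketch: one BLAST run per FASTA record, output named after the record ID.
# Assumptions (not from the answer): function name, file-name sanitising rule.
BLAST_BIN="${BLAST_BIN:-blastn}"   # program used in the one-liner above
BLAST_DB="${BLAST_DB:-my.db}"      # database name used in the one-liner above

blast_per_record() {
    # $1: multi-FASTA input, $2: output directory
    local fasta="$1" outdir="$2" hdr seq id name tmp
    mkdir -p "$outdir"
    tmp=$(mktemp)
    # Linearize: emit one "header<TAB>sequence" line per record.
    # Tabs inside a header are turned into spaces so the split stays unambiguous.
    awk '/^>/ { if (n++) print hdr "\t" seq
                hdr = substr($0, 2); gsub(/\t/, " ", hdr); seq = ""; next }
         { seq = seq $0 }
         END { if (n) print hdr "\t" seq }' "$fasta" |
    while IFS=$'\t' read -r hdr seq; do
        id=${hdr%%[[:space:]]*}              # first word of the header
        name=${id//[^A-Za-z0-9._-]/_}        # make it safe as a file name
        printf '>%s\n%s\n' "$hdr" "$seq" > "$tmp"
        # stdin is redirected so the BLAST program cannot consume the loop input
        "$BLAST_BIN" -db "$BLAST_DB" -query "$tmp" -out "$outdir/$name.blast" < /dev/null
    done
    rm -f "$tmp"
}
```

A call such as `blast_per_record inputfasta results` would then write results/ID.blast for each record. Two headers that share the same first word would overwrite each other, so check that the IDs are unique first.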