A Tool For Flagging Blast Searches?
10.8 years ago
ishengomae ▴ 110

Is there a BLAST option, or a BLAST-like tool, that lets a query search a database but return only predetermined hits, selected by flags such as FASTA sequence IDs? To be more specific, I want to run tblastn with command-line BLAST against my own FASTA-formatted nucleotide database. My query file contains many RefSeq protein sequences in FASTA format, and there is a known relationship between my queries and my database sequences: for each protein query, there are four orthologous nucleotide sequences in the database. Is it possible to return only the hits that correspond exactly to each query?

Thanks.

command-line blast+ • 2.1k views
10.8 years ago
Daniel ★ 4.0k

Do you mean returning only the BLAST hits that are in your RefSeq list and match at 100% identity? In that case, after running BLAST with tabular output (-m 8 in legacy blastall, -outfmt 6 in BLAST+), you could pull out the lines that match your RefSeq IDs and then keep only the 100% identity lines with something like this:

grep -F -w -f refseq_list.txt my_blast_result_m8.txt | awk '$3 == 100' > match_results.txt

(or as two steps)

grep -F -w -f refseq_list.txt my_blast_result_m8.txt > tmp.txt
awk '$3 == 100' tmp.txt > match_results.txt

Alternatively, you could have BLAST return only 100% matches and skip that second step, if the program you are running offers an identity cutoff (blastn has -perc_identity, for example).
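If the aim is instead to keep only the hits where each query lands on its own known orthologs, the same tabular output can be filtered against a list of expected query/subject pairs. The following is only a rough sketch, not something from the original answer: the function name filter_expected_hits and the two-column, tab-separated mapping file (query ID, then subject ID) are assumptions made for illustration. It also assumes the IDs in the mapping file are written exactly as they appear in columns 1 and 2 of the BLAST output.

```shell
#!/bin/sh
# Hypothetical helper (name and file layout are assumptions, not BLAST features).
# Keeps only tabular BLAST hits whose (query, subject) pair is listed
# in a two-column, tab-separated mapping file.
#   $1 = mapping file: query_id <TAB> subject_id, one expected pair per line
#   $2 = tabular BLAST output: column 1 = query ID, column 2 = subject ID
filter_expected_hits() {
    awk -F '\t' '
        # First file (mapping): remember every expected query/subject pair.
        NR == FNR { expected[$1 SUBSEP $2] = 1; next }
        # Second file (BLAST hits): print the line only if its pair is expected.
        ($1 SUBSEP $2) in expected
    ' "$1" "$2"
}
```

With four orthologs per protein, the mapping file would hold four lines per query. A call would then look like: filter_expected_hits mapping.tsv my_blast_result_m8.txt > match_results.txt, where mapping.tsv is again a placeholder name. The identity filter from above can be appended with a pipe to awk '$3 == 100' if only perfect matches are wanted.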
