5.7 years ago
kilikun
Hey,
even though I've used BLAST for various things in the past, I've run into a problem that I could not solve myself.
I have a list of taxids or organism names with complete genomes (n=400) and a gene/protein sequence (roughly 450 aa / 1500 nt) from my model organism. I want to retrieve the most similar hit to this sequence in each of the 400 genomes. How can I do that online or with a Biopython script?
I know that I can restrict a BLAST search to one or up to 20 genomes on the NCBI website, but not more than that.
Best, kilikun
Make a database from all your genomes with 'makeblastdb'.
Then run blastx against this database.
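A minimal Python sketch of these two steps, in case it helps. It only builds the BLAST+ command lines and picks the best hit per genome from tabular output; it does not run BLAST itself. The file names (`all_genomes.fasta`, `query.faa`, `hits.tsv`) and the `seq_to_genome` mapping from sequence ID to genome are hypothetical placeholders. Note that the program has to match the database type: tblastn searches a protein query against a nucleotide database, while blastx searches a nucleotide query against a protein database.

```python
import csv
import io


def makeblastdb_cmd(fasta, dbtype, out):
    """Build the makeblastdb command; dbtype is 'nucl' or 'prot'."""
    return ["makeblastdb", "-in", fasta, "-dbtype", dbtype, "-out", out]


def blast_cmd(program, query, db, out, evalue="1e-5"):
    """Build a BLAST+ search command writing default tabular output (-outfmt 6)."""
    return [program, "-query", query, "-db", db,
            "-outfmt", "6", "-evalue", evalue, "-out", out]


def best_hit_per_genome(tabular_text, seq_to_genome):
    """Return {genome: (sseqid, bitscore)} with the highest-bitscore hit per genome.

    Expects the 12 default -outfmt 6 columns; sseqid is column 2 and
    bitscore is column 12. seq_to_genome is an assumed, user-supplied
    mapping from subject sequence ID to genome name.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        sseqid, bitscore = row[1], float(row[11])
        genome = seq_to_genome.get(sseqid, sseqid)
        if genome not in best or bitscore > best[genome][1]:
            best[genome] = (sseqid, bitscore)
    return best
```

With BLAST+ installed, each command list can be passed to `subprocess.run(cmd, check=True)`, for example `makeblastdb_cmd("all_genomes.fasta", "nucl", "genomes_db")` followed by `blast_cmd("tblastn", "query.faa", "genomes_db", "hits.tsv")`, and the contents of `hits.tsv` can then be given to `best_hit_per_genome`.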
See my old post with some details:
A: trouble in blasting the seq file
To select between blastp and blastx, see the following posts:
A: which one is better to do, blastx or blastp?
If the website is the only way, then 20 rounds of online BLAST (20 genomes per round) would cover the 400 genomes.
Otherwise, you could BLAST locally against nr and restrict the search to the taxIDs you are interested in.
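A rough Python sketch of the local, taxID-restricted search. It assumes a local copy of nr in the version 5 database format and a BLAST+ release that supports the `-taxidlist` option (2.8.1 or later, as far as I know). The taxIDs and file names are illustrative placeholders only, and the sketch builds the command without running it.

```python
from pathlib import Path


def format_taxid_list(taxids):
    """Return the taxIDs as text, one per line, de-duplicated and sorted."""
    unique = sorted({int(t) for t in taxids})
    return "\n".join(str(t) for t in unique) + "\n"


def write_taxid_list(taxids, path):
    """Write the taxID list file that -taxidlist expects (one taxID per line)."""
    Path(path).write_text(format_taxid_list(taxids))
    return path


def blastp_taxid_cmd(query, db, taxid_file, out):
    """Build a blastp command restricted to the taxIDs in taxid_file.

    '6 std staxids' requests the standard tabular columns plus the
    subject taxID(s), so hits can be grouped by organism afterwards.
    """
    return ["blastp", "-query", query, "-db", db,
            "-taxidlist", str(taxid_file),
            "-outfmt", "6 std staxids", "-out", out]
```

For example, `write_taxid_list([562, 9606, 4932], "taxids.txt")` followed by `subprocess.run(blastp_taxid_cmd("query.faa", "nr", "taxids.txt", "hits.tsv"), check=True)` would run the restricted search. Depending on the BLAST+ version, a species-level taxID may not match sequences annotated at strain level, so the list may need to include the descendant taxIDs as well.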