9.5 years ago
dago
I saw that many people have had problems retrieving sequences from a BLAST db.
I could not find a way around it, so maybe someone has a good link/suggestion/reference I could use.
I want to extract sequences from nr db.
I have a list of identifiers, obtained from a previous BLAST search:
gi|740719731|ref|WP_038505017.1|
gi|740813732|ref|WP_038599015.1|
gi|740864652|ref|WP_038649903.1|
gi|740899195|ref|WP_038684443.1|
gi|740906294|ref|WP_038691542.1|
Now I try to query using only the
GIs:
740864652
740899195
740906294
or ref:
WP_038649903.1
WP_038684443.1
WP_038691542.1
blastdbcmd -db ~/Documents/nr_blastdb/nr -entry_batch Ids
But I always get:
Error: XXXXX: OID not found
What am I missing here?
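For anyone reproducing this, the GI and ref lists above can be pulled out of the full pipe-delimited identifiers with `cut`. This is only a minimal sketch: the file names `hits.txt`, `gi_ids.txt` and `ref_ids.txt` are made up for illustration and are not part of the original setup.

```shell
# Full identifiers from the earlier BLAST search (hits.txt is a hypothetical file name)
cat > hits.txt <<'EOF'
gi|740864652|ref|WP_038649903.1|
gi|740899195|ref|WP_038684443.1|
gi|740906294|ref|WP_038691542.1|
EOF

# Field 2 of the pipe-delimited identifier is the GI number
cut -d'|' -f2 hits.txt > gi_ids.txt

# Field 4 is the RefSeq accession with its version suffix
cut -d'|' -f4 hits.txt > ref_ids.txt

# Show both lists, one identifier per line
cat gi_ids.txt ref_ids.txt
```

Either file can then be passed to `-entry_batch` in place of `Ids`.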
Not quite sure. I got it from a colleague. Could it be a problem related to the index of the entries?
What does
cat ~/Documents/nr_blastdb/nr.pal
show? If it's a pre-formatted db, the title line is something like:
Perfect, it was a manually created db. I will try to follow your first suggestion.
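For a manually created db, a common cause of `OID not found` is that `makeblastdb` was run without `-parse_seqids`, so the sequence IDs are never indexed and `blastdbcmd` cannot look entries up by name. Below is a small sketch on a toy FASTA file, not the actual nr setup: `toy.fasta`, `toydb` and the IDs `seqA`/`seqB` are invented for illustration, and the BLAST+ steps are skipped if the tools are not installed.

```shell
# Toy FASTA so the sketch is self-contained (file name and IDs are hypothetical)
cat > toy.fasta <<'EOF'
>seqA toy protein A
MKTAYIAKQRQISFVKSHFSRQ
>seqB toy protein B
MSDNELLKKAIELAKEAG
EOF

if command -v makeblastdb >/dev/null 2>&1; then
    # -parse_seqids indexes the sequence IDs so entries can be fetched by ID later
    makeblastdb -in toy.fasta -dbtype prot -parse_seqids -out toydb
    # Retrieve a single entry by its ID from the freshly built db
    blastdbcmd -db toydb -entry seqA
else
    echo "BLAST+ not installed; skipping the database build" >&2
fi
```

The same rebuild on the real data would use the original FASTA file as `-in`, assuming it is still available.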
You are right, it works fine now. However, some sequences now have really crazy IDs, for example:
It looks like two Ids one after the other. Any idea where the problem is?
It's a non-redundant database and those two accessions encode an identical protein. You can avoid this behavior with the
-target_only
flag.
I learned many things today! Thanks very much!!!!