NCBI BLAST results not compatible with OrthoMCL program
7.3 years ago
xiachongjing ▴ 10

I am using BLAST 2.4.0+ blastp with tabular output (-outfmt 6). Some of my results are:

PSTr|PSTG_00001T0 PSTr|PSTG_00001T0 100.000 138 0 0 1 138 1 138 3.41e-101 286
MLi|212373 gnl|MLi|212373 100.000 110 0 0 1 110 1 110 8.97e-78 226
PSTr|PSTG_00001T0 gnl|PST|PstP_06241T0 98.182 110 2 0 29 138 1 110 1.75e-77 226
PSTr|PSTG_00001T0 PSTr|PSTG_14461T0 72.993 137 36 1 1 137 1 136 4.63e-67 200
PSTr|PSTG_00001T0 gnl|PST|PstP_16337T0 72.593 135 36 1 3 137 1 134 2.45e-65 196
PSTr|PSTG_00001T0 gnl|PST|PstP_17038T0 67.669 133 41 2 6 137 46 177 3.64e-55 172

My first question is: why do some of my subject IDs have gnl| in the second column while others do not?

I am running OrthoMCL and want to use the BLAST results above in its next step. For example, I run: $ orthomclBlastParser my.blast myadjust.directory > similarSequences.txt

Then I get this error: couldn't find taxon for gene 'gnl|MLi|212373' at /path/to/orthomclBlastParser line 105, <F> line 1.

So my second question is: can I just delete the string gnl| from my BLAST results and then continue with OrthoMCL?

Thanks in advance.

blast OrthoMCL • 1.7k views

I don't know the answer to your first question, but for the second, yes, I believe you can remove the "gnl|" (making a backup of the original file):

sed -i.bak 's/gnl|//g' my.blast

edit: for the first question, maybe it is related to how you created the BLAST databases. Did you concatenate the sequences and create the database all at once? What commands did you use?
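
If you would rather not run a global substitution over the whole file, here is a minimal Python sketch that strips a leading gnl| only from the query and subject ID columns. It assumes the file is tab-separated, as -outfmt 6 normally is. The function names strip_gnl_prefix and clean_blast_tabular are made up for this example. It also returns the set of prefixes found before the first | in each ID, on the assumption (not checked against the OrthoMCL source) that this is what the parser treats as the taxon code.

```python
import io


def strip_gnl_prefix(seq_id):
    """Drop a leading 'gnl|' from one sequence ID; other IDs are returned unchanged."""
    prefix = "gnl|"
    return seq_id[len(prefix):] if seq_id.startswith(prefix) else seq_id


def clean_blast_tabular(in_handle, out_handle):
    """Copy tab-separated BLAST lines, stripping 'gnl|' from columns 1 and 2 only.

    Returns the set of strings found before the first '|' in the cleaned IDs
    (assumed here to be the taxon codes).
    """
    taxa = set()
    for line in in_handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            # Blank or malformed line: pass it through untouched.
            out_handle.write(line)
            continue
        for col in (0, 1):
            fields[col] = strip_gnl_prefix(fields[col])
            taxa.add(fields[col].split("|", 1)[0])
        out_handle.write("\t".join(fields) + "\n")
    return taxa


if __name__ == "__main__":
    # One line from the question, rebuilt with tabs for the demo.
    demo = io.StringIO(
        "MLi|212373\tgnl|MLi|212373\t100.000\t110\t0\t0\t1\t110\t1\t110\t8.97e-78\t226\n"
    )
    cleaned = io.StringIO()
    print(sorted(clean_blast_tabular(demo, cleaned)))
    print(cleaned.getvalue(), end="")
```

For a real file you would open my.blast for reading and a new output file for writing, then pass both handles to clean_blast_tabular. If the returned set contains anything that does not correspond to one of your taxa, that is probably worth fixing before running orthomclBlastParser again.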
