4.3 years ago
Begonia_pavonina
Hello everyone,
I have an issue with the mgkit tool blast2gff. I am trying to produce a GFF file from a BLAST result (generated with -outfmt 6). Here is the command, followed by the error message:
blast2gff blastdb output_blast.out output_blast.gff
INFO - mgkit.workflow.blast2gff: Writing to file (output_blast.gff)
INFO - mgkit.io.blast: Reading BLAST results from file (output_blast.out)
Traceback (most recent call last):
File "/cluster/home/usr/.conda/envs/mgkit/bin/blast2gff", line 11, in <module>
sys.exit(main())
File "/cluster/home/usr/.conda/envs/mgkit/lib/python3.7/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/cluster/home/usr/.conda/envs/mgkit/lib/python3.7/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/cluster/home/usr/.conda/envs/mgkit/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/cluster/home/usr/.conda/envs/mgkit/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/cluster/home/usr/.conda/envs/mgkit/lib/python3.7/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/cluster/home/usr/.conda/envs/mgkit/lib/python3.7/site-packages/mgkit/workflow/blast2gff.py", line 207, in convert_from_blastdb
for annotation in iterator:
File "/cluster/home/usr/.conda/envs/mgkit/lib/python3.7/site-packages/mgkit/io/blast.py", line 200, in parse_uniprot_blast
value_funcs=value_funcs):
File "/cluster/home/usr/.conda/envs/mgkit/lib/python3.7/site-packages/mgkit/io/blast.py", line 136, in parse_blast_tab
for index, func in zip(ret_col, value_funcs)
File "/cluster/home/usr/.conda/envs/mgkit/lib/python3.7/site-packages/mgkit/io/blast.py", line 136, in <genexpr>
for index, func in zip(ret_col, value_funcs)
File "/cluster/home/usr/.conda/envs/mgkit/lib/python3.7/site-packages/mgkit/workflow/blast2gff.py", line 192, in name_func
return x.split(header_sep)[gene_index]
IndexError: list index out of range
Has anyone encountered the same issue?
Thanks for your answer zorbax! I am using output format 6, and the default output is as follows:
Following your advice, I tried the following two formats:
Both give the same error message. Is this the format you were showing me, or did I miss something?
The issue is likely with the Scaffolds_2465_pilon field. As shown below, NCBI's FASTA headers have identifiers separated by a | character, e.g. sp|O14830|PPE2_HUMAN. @zorbax is referring only to the second field (hit) in the BLAST output.
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.
It does work! Thanks genomax. Apologies for the incorrect use of the Answer button.
Great! You can accept @zorbax's answer (green check mark) to provide closure to the thread.
Note: there is actually an option to specify a different separator for the header: https://mgkit.readthedocs.io/en/0.4.1/scripts/blast2gff.html#cmdoption-blast2gff-blastdb-s
But using it with a tab as the separator does not work, for some reason...
I am having the same issue. I even removed the lines with '|' in the names, but the error is still the same.
A few lines from my BLAST file:
Please help!!
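One way to narrow this down is to list the hits whose identifier has too few separator-delimited fields, since those are the ones that would make the split in the traceback fail. The sketch below is not part of mgkit: find_short_headers is a hypothetical helper, the "|" separator and gene index 1 are assumed values, and the two sample rows are invented for illustration.

```python
def find_short_headers(lines, header_sep="|", gene_index=1, hit_column=1):
    """Yield (line_number, hit_id) for hits with too few header fields.

    header_sep, gene_index and hit_column are assumed values for this sketch.
    """
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) <= hit_column:
            continue
        hit_id = columns[hit_column]
        # x.split(sep)[gene_index] fails when the split yields too few parts.
        if len(hit_id.split(header_sep)) <= gene_index:
            yield line_number, hit_id


# Invented tab-separated rows in the 12-column -outfmt 6 layout.
sample = [
    "query1\tsp|O14830|PPE2_HUMAN\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t380",
    "query2\tScaffolds_2465_pilon\t97.0\t150\t4\t0\t1\t150\t1\t150\t1e-70\t270",
]

print(list(find_short_headers(sample)))  # [(2, 'Scaffolds_2465_pilon')]
```

To check a real file, pass an open file handle instead of the sample list, e.g. find_short_headers(open("output_blast.out")).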
The IndexError is raised when attempting to retrieve an index from a sequence (e.g. list, tuple), and the index isn’t found in the sequence. The Python documentation defines when this exception is raised:
Raised when a sequence subscript is out of range. (Source)
Here’s a Python split() example that raises the IndexError:
data = "one%two%three%four%five"
numbers = data.split('%')
print(numbers[5])  # raises IndexError: list index out of range
The list numbers has 5 elements, and the indexing starts with 0, so, the last element will have index 4. If you try to subscript with an index higher than 4, the Python Interpreter will raise an IndexError since there is no element at such index.
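The same thing appears to happen in the traceback above, where the failing line is x.split(header_sep)[gene_index]. The sketch below reproduces that expression in isolation; the default values "|" and 1 are assumptions made for illustration, not values confirmed from the mgkit source.

```python
def name_func(x, header_sep="|", gene_index=1):
    # Same expression as the failing line in the traceback;
    # the default values here are assumed for this sketch.
    return x.split(header_sep)[gene_index]


uniprot_style = "sp|O14830|PPE2_HUMAN"   # three fields when split on "|"
plain_style = "Scaffolds_2465_pilon"     # no "|" at all, so a single field

print(name_func(uniprot_style))  # O14830

try:
    name_func(plain_style)
except IndexError as error:
    # Splitting gives ['Scaffolds_2465_pilon'], so index 1 does not exist.
    print("IndexError:", error)  # IndexError: list index out of range
```

An identifier without the separator splits into a one-element list, so any index above 0 is out of range. This matches the suggestion earlier in the thread that the hit field needs to be in the sp|O14830|PPE2_HUMAN style.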