Problems with Biopython when running the NCBIStandalone.py program
12.1 years ago
saima ▴ 10

Hi, I am having a problem running the NCBIStandalone module of Biopython. I want to retrieve sequence titles from BLAST output, but iterating over the results raises an error.

Following is the code to retrieve these sequence titles.

from Bio.Blast import NCBIStandalone

result_handle = open("foo.txt")
blast_parser = NCBIStandalone.BlastParser()
blast_iterator = NCBIStandalone.Iterator(result_handle, blast_parser)
print blast_iterator
for blast_record in blast_iterator:
    print blast_record
    E_VALUE_THRESH = 0.0
    for alignment in blast_record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect < E_VALUE_THRESH:
                print 'sequence:', alignment.title

But it gives the following error. It creates the iterator properly but fails as soon as I loop over it.

<Bio.Blast.NCBIStandalone.Iterator object at 0x1aa6450>
Traceback (most recent call last):
  File "blast2.py", line 45, in <module>
    for blast_record in blast_iterator:
  File "/usr/lib/pymodules/python2.7/Bio/Blast/NCBIStandalone.py", line 1659, in next
    return self._parser.parse(File.StringHandle(data))
  File "/usr/lib/pymodules/python2.7/Bio/Blast/NCBIStandalone.py", line 818, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/pymodules/python2.7/Bio/Blast/NCBIStandalone.py", line 112, in feed
    read_and_call_until(uhandle, consumer.noevent, contains='BLAST')
  File "/usr/lib/pymodules/python2.7/Bio/ParserSupport.py", line 337, in read_and_call_until
    line = safe_readline(uhandle)
  File "/usr/lib/pymodules/python2.7/Bio/ParserSupport.py", line 413, in safe_readline
    raise ValueError("Unexpected end of stream.")
ValueError: Unexpected end of stream.

Any help would be highly appreciated. Thanks, Saima

biopython blast • 4.1k views

The message "Unexpected end of stream." means the parser reached the end of the file before it expected to. Either your file is truncated, or the format has changed slightly (again). I hope you've read the tutorial, which warns that the plain text parser is fragile and that this is (almost) to be expected if you try the latest BLAST release.
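A quick way to tell those two cases apart is to check whether the file is empty and whether any line contains the text 'BLAST', which is what the traceback shows the scanner searching for (`contains='BLAST'`). This is only a minimal sketch using the standard library; the function name `diagnose_blast_text` is hypothetical and not part of Biopython, and it is written for Python 3.

```python
import os


def diagnose_blast_text(path, marker="BLAST"):
    """Return a short diagnosis string for a plain-text BLAST report.

    Hypothetical helper: it only checks for an empty file and for the
    first line containing `marker`; it does not validate the report.
    """
    if os.path.getsize(path) == 0:
        return "empty file"
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            if marker in line:
                return "marker found on line %d" % line_number
    return "no line contains the marker"
```

Calling `diagnose_blast_text("foo.txt")` on the file from the question would show whether the scanner ever had a header line to find. If the marker is present, the problem is more likely a format change further into the report.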

12.1 years ago

The NCBIStandalone.Iterator object can't be looped over with the "for x in list" syntax. You need to call the .next() method to step through the file. So something like this:

while True:
    record = blast_iterator.next()
    if record is None:
        break

    # do stuff with your record

Yeah, the syntax is not very pythonic and kinda ugly.
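For what it's worth, a reader whose next-style method returns None at the end of input (which is what the loop above assumes) can be wrapped so that a plain for loop works, using the two-argument form of the built-in `iter(callable, sentinel)`. This is just a sketch: `OldStyleReader` and `records` are hypothetical stand-ins for illustration, not Biopython classes.

```python
class OldStyleReader(object):
    """Hypothetical stand-in for a parser with an old-style next() method."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def next(self):
        # Return None at end of input instead of raising StopIteration.
        if self._pos >= len(self._items):
            return None
        item = self._items[self._pos]
        self._pos += 1
        return item


def records(reader):
    """Yield records by calling reader.next() until it returns None."""
    return iter(reader.next, None)


reader = OldStyleReader(["rec1", "rec2"])
collected = [record for record in records(reader)]
```

Note that this only changes how the loop is written. If the underlying parse step raises an error, it will raise it either way.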


Thanks for your help, but record = blast_iterator.next() gives the same error.

Traceback (most recent call last):
  File "blast2.py", line 46, in <module>
    record = blast_iterator.next()
  File "/usr/lib/pymodules/python2.7/Bio/Blast/NCBIStandalone.py", line 1659, in next
    return self._parser.parse(File.StringHandle(data))
  File "/usr/lib/pymodules/python2.7/Bio/Blast/NCBIStandalone.py", line 818, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/pymodules/python2.7/Bio/Blast/NCBIStandalone.py", line 112, in feed
    read_and_call_until(uhandle, consumer.noevent, contains='BLAST')
  File "/usr/lib/pymodules/python2.7/Bio/ParserSupport.py", line 337, in read_and_call_until
    line = safe_readline(uhandle)
  File "/usr/lib/pymodules/python2.7/Bio/ParserSupport.py", line 413, in safe_readline
    raise ValueError("Unexpected end of stream.")
ValueError: Unexpected end of stream.

I don't know what the reason is.


Ugly yes. Actually it is just old-fashioned Python code before iterators were made easier to use. This module will probably be formally deprecated, but see also the forthcoming Biopython SearchIO module which will offer a more consistent API, http://biopython.org/wiki/SearchIO

12.1 years ago
bow ▴ 790

Looks like a problem with the BLAST output file you're trying to parse. I tried your snippet with my own sample file and it works ok. Can you provide a sample output or attach the file? What BLAST version did you use to generate the file?

Also, your snippet sets a threshold limit of '< 0.0'. You will have to increase this limit if you want to see any sequence alignment titles, since you can't have negative E-values.
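To illustrate the threshold point with a small sketch: the hit titles and E-values below are made up for the example, and `titles_below` is a hypothetical helper, not a Biopython function. It applies the same strict `<` comparison as the snippet in the question.

```python
def titles_below(hits, threshold):
    """Return titles whose E-value is strictly below the threshold."""
    return [title for title, evalue in hits if evalue < threshold]


# Made-up (title, E-value) pairs, purely for illustration.
hits = [("hit A", 0.0), ("hit B", 3e-20), ("hit C", 0.5)]

nothing = titles_below(hits, 0.0)       # no E-value can be below zero
significant = titles_below(hits, 1e-5)  # a small positive cutoff
```

With a cutoff of 0.0 even a hit reported with an E-value of exactly 0.0 is dropped, because the comparison is strict. Which positive cutoff is appropriate depends on your search.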


The BLAST version is BLASTN 2.2.27+. The output file is quite large and I don't know how to upload it, sorry about that. Can you please help me with that?


The simplest answer is the one recommended in the Biopython tutorial: don't use the plain text BLAST output. The XML is very detailed, but for many tasks the simple BLAST tabular output is smaller and easier to work with.

However, if there is a new problem with plain text from BLASTN 2.2.27+ we can try to fix the parser.
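As a rough sketch of the tabular route, here is one way to read BLAST+ tabular output (`-outfmt 6`, or `-outfmt 7` which adds `#` comment lines) with only the Python 3 standard library. It assumes the default 12 columns; the function name `subject_ids_below` and the sample rows are made up for illustration. Note that the default tabular columns carry subject IDs rather than full sequence titles.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def subject_ids_below(handle, threshold):
    """Return subject IDs with at least one hit below the E-value threshold."""
    ids = []
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and the comment lines written by -outfmt 7.
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(FIELDS, row))
        if float(hit["evalue"]) < threshold and hit["sseqid"] not in ids:
            ids.append(hit["sseqid"])
    return ids


# Made-up sample rows standing in for a real results file.
sample = io.StringIO(
    "q1\tsubjA\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-50\t370\n"
    "q1\tsubjB\t80.0\t50\t10\t0\t1\t50\t9\t58\t0.2\t30.1\n"
)
found = subject_ids_below(sample, 1e-5)
```

For a real file you would pass `open("results.tsv")` (a hypothetical filename) instead of the in-memory sample.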
