How to use seek() with Biopython SeqIO object
10.2 years ago
Renesh ★ 2.2k

I would like to iterate over a FASTA file again and again using a Biopython SeqIO object. Python's seek(0, 0) can rewind a file handle, but I am not getting the expected output when I use it with a Biopython SeqIO object. How can I iterate over a FASTA file from the beginning using a Biopython SeqIO object?

python • 3.0k views

Is the performance hit of closing and reopening the file that bad?

10.2 years ago
David W 4.9k

As far as I know, you can't use seek on a file object once you've made a SeqIO generator from it (or rather, you can, but it won't reset the generator). You have plenty of other options if you want to iterate over the records many times:

  • Read the whole thing into memory as a list
  • Use SeqIO.index and the itervalues() method of the resulting object
  • Create a new generator every time you iterate:
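from Bio import SeqIO

handle = open("my.fasta", "r")
ids = [rec.id for rec in SeqIO.parse(handle, "fasta")]
handle.seek(0)  # rewind the file, then build a fresh parser on it
lengths = [len(rec) for rec in SeqIO.parse(handle, "fasta")]
handle.close()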
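
The first two options in the list can be sketched roughly as follows. This is only an illustration: the temporary file and its two short sequences are made up so the snippet is self-contained, and it loops over the index keys rather than calling itervalues(), which I believe is only available under Python 2.

```python
import os
import shutil
import tempfile

from Bio import SeqIO

# Write a tiny two-record FASTA file so the example is self-contained.
# The file name and sequences are invented for illustration.
tmpdir = tempfile.mkdtemp()
fasta_path = os.path.join(tmpdir, "my.fasta")
with open(fasta_path, "w") as out:
    out.write(">seq1\nACGT\n>seq2\nACGTACGT\n")

# Option 1: read every record into a list, then loop over it as often as needed.
records = list(SeqIO.parse(fasta_path, "fasta"))
ids = [rec.id for rec in records]
lengths = [len(rec) for rec in records]

# Option 2: index the file; each record is parsed on demand when looked up.
index = SeqIO.index(fasta_path, "fasta")
indexed_ids = sorted(index.keys())
indexed_lengths = {key: len(index[key]) for key in index}
index.close()

# Clean up the temporary directory.
shutil.rmtree(tmpdir)
```

The list is the simplest choice when the file fits comfortably in memory. The index is probably the better fit for large files, since it keeps only the record offsets around.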

As Devon suggests, I doubt that using seek rather than just re-opening the handle each time will make much difference performance-wise.
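
To see why the example above builds a new generator after seek(0), here is a small sketch that uses only the standard library. parse_fasta is a hypothetical, minimal stand-in for SeqIO.parse (it is not Biopython's implementation), and the in-memory FASTA text is invented. The point is just that an exhausted generator stays exhausted even after the handle is rewound, while a freshly created one starts from the top.

```python
import io


def parse_fasta(handle):
    """Yield (identifier, sequence) pairs from a FASTA handle.

    Hypothetical minimal stand-in for SeqIO.parse, for illustration only.
    """
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line:
            chunks.append(line)
    # Emit the final record once the handle runs out of lines.
    if name is not None:
        yield name, "".join(chunks)


handle = io.StringIO(">seq1\nACGT\n>seq2\nACGTACGT\n")

parser = parse_fasta(handle)
first_pass = [name for name, _ in parser]

handle.seek(0)
# The old generator has already finished, so rewinding does not revive it.
stale_pass = [name for name, _ in parser]
# A new generator on the rewound handle reads from the beginning again.
second_pass = [name for name, _ in parse_fasta(handle)]
```

Here stale_pass comes out empty, while first_pass and second_pass both contain every identifier.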
