Biopython Embl Parser Only Reads One Entry
11.6 years ago
sinanugur ▴ 10

Hello,

I am trying to parse genome records in EMBL format. Everything seems OK and no exception is raised, but the parser only reads the first record. Here is my code to parse the EMBL file:

from Bio import SeqIO

# Iterate over every record in the EMBL file and print its identifier
for record in SeqIO.parse("AE000657.1.embl", "embl"):
    print(record.id)

This script only returns:

AE000657.1.

That is all; the other genomic regions are not printed. The file is available here: http://www.ebi.ac.uk/ena/data/view/AE000657&display=text

The EMBL file itself is fine: it opens in Artemis, so it is not corrupted. So what is the problem here? Thanks

python biopython • 3.6k views

Could you edit your question to include a URL to the test file? Without that this isn't going to be easy to assist you with.


Yep, I edited my question.


Not sure what the issue is. Your file contains one sequence record and the code prints its ID, as expected. Maybe you want FT lines as suggested in Peter's answer?

11.6 years ago
Peter 6.0k

In EMBL format, each record starts with an "ID" line and ends with a "//" line, and your EMBL file as shown here really does contain only one record. The Biopython parser is therefore working as designed.
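To make the record structure concrete, here is a minimal standard-library sketch that splits EMBL text on those "ID" and "//" lines and lists the feature keys found on the FT lines. It is not how Biopython parses the format, only an illustration; the sample text, the accession X00001, and the helper names split_embl_records and feature_keys are made up for this example.

```python
def split_embl_records(text):
    """Split EMBL flat-file text into records (lists of lines).

    Simplified assumption: a record runs from an "ID" line up to the
    next "//" terminator line.
    """
    records = []
    current = None
    for line in text.splitlines():
        if line.startswith("ID "):
            current = [line]
        elif current is not None:
            if line.strip() == "//":
                records.append(current)
                current = None
            else:
                current.append(line)
    return records


def feature_keys(record_lines):
    """Return the feature keys (e.g. 'source', 'CDS') named on FT lines."""
    keys = []
    for line in record_lines:
        # A line that starts a feature has its key right after "FT   ";
        # qualifier and continuation lines leave that column blank.
        if line.startswith("FT") and len(line) > 5 and line[5] != " ":
            keys.append(line[5:].split()[0])
    return keys


# Made-up miniature record, not taken from AE000657
SAMPLE = """\
ID   X00001; SV 1; linear; genomic DNA; STD; PRO; 12 BP.
XX
FT   source          1..12
FT                   /organism="Example organism"
FT   CDS             1..9
FT                   /gene="demo"
SQ   Sequence 12 BP;
     atggccaaat aa                                                12
//
"""

records = split_embl_records(SAMPLE)
print(len(records), feature_keys(records[0]))
```

The sample yields a single record carrying two features, which mirrors the situation in the question: one "ID" ... "//" block, with many FT entries inside it.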

I would guess that what you are looking for is the features, i.e. the information on the FT (Feature Table) lines. Biopython parses these into SeqFeature objects and stores them as a list in the features attribute of the SeqRecord object. Note that for single-sequence files, you may find it simpler to use the read function:

from Bio import SeqIO

# SeqIO.read expects exactly one record in the file and returns it directly
record = SeqIO.read("AE000657.1.embl", "embl")
print("Record %s has %i features" % (record.id, len(record.features)))
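If the goal is to walk through the individual features, a loop over record.features is probably what you want. The following is a rough sketch rather than a definitive recipe: summarize_features is a hypothetical helper, and the small in-memory record (id "demo.1") stands in for the real file so the example runs on its own.

```python
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord


def summarize_features(record):
    """Return (type, start, end, strand) for each feature on a SeqRecord."""
    summary = []
    for feature in record.features:
        location = feature.location
        # Biopython positions are 0-based and end-exclusive
        summary.append(
            (feature.type, int(location.start), int(location.end), location.strand)
        )
    return summary


# Hypothetical in-memory record so the sketch runs without the EMBL file
demo = SeqRecord(Seq("ATGGCCAAATAA"), id="demo.1")
demo.features.append(SeqFeature(FeatureLocation(0, 12, strand=1), type="source"))
demo.features.append(SeqFeature(FeatureLocation(0, 9, strand=1), type="CDS"))

for feature_type, start, end, strand in summarize_features(demo):
    print(feature_type, start, end, strand)

# With the real file, the same helper would be applied like this:
# record = SeqIO.read("AE000657.1.embl", "embl")
# summarize_features(record)
```

Each feature also has a qualifiers mapping (for example feature.qualifiers.get("gene")), which holds the /key="value" entries from the FT lines.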

Thanks, I wanted to parse features. I thought I could iterate through those features via SeqIO.parse, but now I get it. Cheers.


Great.

P.S. On BioStars (like StackExchange) you are expected to mark an answer as accepted if it solves your problem - this is used for the user profile ratings etc.
