Question

How to detect reverse-complement sequences in an EMBL/GenBank file with Biopython?

9.7 years ago

uguy • 0

Hi,

I'm trying to parse a GenBank file and extract the gene sequences, but I couldn't find how to detect features on the complement strand. I tried a regex search for "complement" in "feature.location", but that didn't work.

If you have any ideas, they're welcome :)

genome • 2.4k views

updated 2.5 years ago by Ram 44k • written 9.7 years ago by uguy • 0

Ram · Accepted Answer · 2015-03-26

Each feature object has a .strand attribute (actually an alias for the strand of the feature's location, .location.strand), e.g.

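from Bio import SeqIO

filename = "sequence.embl"  # placeholder path; pass "genbank" below for GenBank files
for record in SeqIO.parse(filename, "embl"):
    print(record.id)
    for feature in record.features:
        # 1 = forward strand, -1 = complement strand, None = unspecified
        print(feature.location.strand)
        # extract() reverse-complements the slice for strand -1 features
        print(feature.extract(record.seq))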

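The last line of the example shows how to get the sequence associated with the feature. For a feature on the complement strand, extract() returns the reverse complement of the region, so no manual handling is needed.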
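To make concrete what extracting a complement-strand feature amounts to, here is a minimal plain-Python sketch for a simple (non-joined) location. The helper name extract_region and the toy sequence are hypothetical and not part of Biopython; in practice feature.extract() does this for you.

```python
# Hypothetical helper, not part of Biopython: mimics extraction of a
# simple (non-joined) feature, given 0-based, end-exclusive coordinates.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_region(sequence, start, end, strand):
    """Slice sequence[start:end]; reverse-complement it when strand is -1."""
    region = sequence[start:end]
    if strand == -1:
        # Complement each base, then reverse the result.
        region = region.translate(COMPLEMENT)[::-1]
    return region


genome = "AAACCCGGGTTT"  # toy sequence, for illustration only
forward = extract_region(genome, 2, 8, 1)
reverse = extract_region(genome, 2, 8, -1)
print(forward)  # ACCCGG
print(reverse)  # CCGGGT
```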

See also the SeqFeature built-in documentation, which is also available at http://biopython.org/DIST/docs/api/Bio.SeqFeature-module.html

As explained there and in the Biopython Tutorial (http://biopython.org/DIST/docs/tutorial/Tutorial.html), the .strand will be -1 for features on the complement strand and 1 for features on the forward strand.
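As a rough sketch of how the strand value can be used to pick out complement-strand genes, the snippet below builds two features in memory instead of reading a file, so the sequence and coordinates are made up for illustration. It also hints at why searching the location for "complement" is unlikely to work: once parsed, the location is an object with a numeric strand rather than the original GenBank text.

```python
from Bio.Seq import Seq
from Bio.SeqFeature import SeqFeature, FeatureLocation

# Toy sequence and features, assumed for illustration only.
seq = Seq("AAACCCGGGTTT")
features = [
    SeqFeature(FeatureLocation(0, 6, strand=1), type="gene"),
    SeqFeature(FeatureLocation(2, 8, strand=-1), type="gene"),
]

# Keep only gene features annotated on the complement strand.
complement_genes = [
    f for f in features
    if f.type == "gene" and f.location.strand == -1
]

# extract() already returns the reverse complement for these features.
extracted = [str(f.extract(seq)) for f in complement_genes]
print(extracted)  # ['CCGGGT']
```

With a real file, the same filter can be applied to record.features inside the SeqIO.parse loop.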