Parsing a .gbk file to get a .fna file
5.4 years ago

I'm using this script to get .fna files from .gbk files:

import sys
from Bio import SeqIO

lista = open('list_gbk.txt')  
for line in lista:
        line = line.rstrip()  
        fasta = line+".fna" 
        sys.stdout=open(fasta,"w")
        for rec in SeqIO.parse(line, "genbank"):
                if rec.features:
                        for feature in rec.features:
                                if feature.type == "CDS":
                                        print ">", feature.location, feature.qualifiers['product'],"\n",feature.location.extract(rec).seq
        sys.stdout.close()

But I'm getting the following error message:

Traceback (most recent call last):
  File "parser_gbk_2.py", line 14, in <module>
    print ">", feature.location, feature.qualifiers['product'],"\n",feature.location.extract(rec).seq
KeyError: 'product'

Side note: It looks like you're using Python 2.7. Time to abandon that and switch to Python 3.6. Python 2.7 will be retired in under 6 months: https://pythonclock.org/

5.4 years ago

Hello,

One of your CDS features has no "product" qualifier. That is why Python can't find that key and raises the KeyError.
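A minimal sketch of one way to guard against the missing key, using plain dicts as hypothetical stand-ins for feature.qualifiers (Biopython stores each qualifier value as a list of strings). The helper name product_of and the default text are made up for illustration:

```python
# Plain dicts standing in for feature.qualifiers (hypothetical data).
# Each qualifier value is a list of strings, as in Biopython.
qualifiers_list = [
    {"locus_tag": ["abc_0001"], "product": ["DNA polymerase III"]},
    {"locus_tag": ["abc_0002"], "pseudo": [""]},  # no "product" key
]


def product_of(qualifiers, default="unknown product"):
    """Return the first product string, or a default if the key is absent."""
    return qualifiers.get("product", [default])[0]


products = [product_of(q) for q in qualifiers_list]
print(products)
```

With dict.get the lookup never raises KeyError, so a CDS without a product still gets a header line.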

As RamRS said, it looks like you are using Python 2, and it's time to move to Python 3. Here are some more hints on your code, if you like:

line = line.rstrip()  
fasta = line+".fna"

In Python 3.6 and later you can use f-strings to shorten this to:

fasta = f"{line.strip()}.fna"

sys.stdout=open(fasta,"w")

I don't see any reason to redirect sys.stdout here just to write to a file. It's better to use the with statement, so that Python takes care of closing the file for you.
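As a small illustration of the with statement, here is a sketch that writes a made-up FASTA record to a temporary file (the path and the record are hypothetical) and reads it back:

```python
import os
import tempfile

# Hypothetical output path inside a fresh temporary directory
out_path = os.path.join(tempfile.mkdtemp(), "example.fna")

with open(out_path, "w") as outfile:
    outfile.write(">demo_record\n")
    outfile.write("ATGAAATAG\n")

# The file was closed automatically when the with block ended,
# so it is safe to reopen it for reading.
with open(out_path) as infile:
    content = infile.read()
```

The file is closed even if an exception is raised inside the block, and sys.stdout stays untouched for normal messages.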

print ">", feature.location, feature.qualifiers['product'],"\n",feature.location.extract(rec).seq

In Python 3, print is a function, so you need to call it as print(...).
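If you prefer to keep using print, it accepts a file argument in Python 3. A short sketch, with an in-memory io.StringIO buffer standing in for the open output file and made-up header values:

```python
import io

buffer = io.StringIO()  # in-memory stand-in for an open output file

# print() joins its arguments with a space and appends a newline by default
print(">", "[0:9](+)", "demo product", file=buffer)
print("ATGAAATAG", file=buffer)

text = buffer.getvalue()
```

Passing file=outfile inside a with block works the same way with a real file.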

So altogether your code could look like this:

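from Bio import SeqIO

with open("list_gbk.txt") as lista:
    for line in lista:
        path = line.strip()  # drop the trailing newline before using the path
        if not path:
            continue  # skip empty lines in the list
        with open(f"{path}.fna", "w") as outfile:
            for rec in SeqIO.parse(path, "genbank"):
                for feature in rec.features:
                    if feature.type == "CDS":
                        # qualifier values are lists; use "" if "product" is missing
                        product = feature.qualifiers.get("product", [""])[0]
                        seq = feature.location.extract(rec).seq
                        # write the header and the sequence on separate lines
                        outfile.write(f">{feature.location} {product}\n{seq}\n")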