10.4 years ago
Hello,
I am trying to convert a GenBank file to a FASTA file. I have a problem: it seems like Python can't recognize the GenBank codes, and I don't know how to solve it.
To test this, I looked for an example on a website; the script and the terminal output are below. Thanks for your answers.
SCRIPT
from Bio import SeqIO  # SeqIO provides the parse() function used below

# Input GenBank file and output protein FASTA file
gbk_filename = "NC_005213.gbk"
faa_filename = "NC_005213_converted.faa"
input_handle = open(gbk_filename, "r")
output_handle = open(faa_filename, "w")

for seq_record in SeqIO.parse(input_handle, "genbank"):
    print("Dealing with GenBank record %s" % seq_record.id)
    for seq_feature in seq_record.features:
        # Only coding sequences carry a protein translation
        if seq_feature.type == "CDS":
            assert len(seq_feature.qualifiers['translation']) == 1
            output_handle.write(">%s from %s\n%s\n" % (
                seq_feature.qualifiers['locus_tag'][0],
                seq_record.name,
                seq_feature.qualifiers['translation'][0]))

output_handle.close()
input_handle.close()
print("Done")
TERMINAL
albam@albam-TravelMate-P253:~/Desktop$ python gbk_to_faa.py
Traceback (most recent call last):
  File "gbk_to_faa.py", line 6, in <module>
    input_handle = open(gbk_filename, "r")
IOError: [Errno 2] No such file or directory: 'NC_005213.gbk'
A few suggestions for your script that make use of some better practices:
- Using sys, you can specify command line arguments to your scripts!
- Instead of file.open and file.close, you should use the with open()... syntax. It makes file operations a bit safer since the file will be closed automatically after completion of the with open statement.
I am trying to work with a cDNA sequence from GenBank, but the problem is that I don't know which function is the correct one to open the sequence directly from Python. The sequence is not saved on my computer, which is why I need to open it directly, without downloading it.
I don't know if you understand me...
Thank you
You can't open a file that you do not have. You're telling Python to find a file under ~/Desktop, but there is no file there. You have to have the file and tell the script where to find it; you can't just pull files out of nowhere.
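To make that concrete, here is a minimal sketch of a check you could run before parsing. The helper name find_input is made up for illustration and is not part of Biopython; it only uses the standard library and reports where Python actually looked for the file.

```python
import os


def find_input(path):
    """Return the absolute path of an existing file.

    find_input is a hypothetical helper, not a Biopython function.
    A relative path is resolved against the current working directory,
    which is the directory the script is run from (here ~/Desktop).
    """
    full_path = os.path.abspath(os.path.expanduser(path))
    if not os.path.isfile(full_path):
        raise IOError("No such file: %s (current directory is %s)"
                      % (full_path, os.getcwd()))
    return full_path
```

Calling find_input("NC_005213.gbk") from ~/Desktop would raise the error unless the file has been saved there first; passing a full path such as "~/Downloads/NC_005213.gbk" (an example location, adjust to wherever your file is) works from any directory.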
Can you explain this a little more? I am trying to get the same script to work but don't like the hard coding of NC_005213.gbk. I want to be able to read many files and have each of them parsed to a new file. Can you explain how this can be done?
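One possible way to do this, as a rough sketch rather than a definitive answer: take the file names from the command line with sys.argv and derive each output name from its input name. The function names output_name, convert and main are made up for this example, and the "_converted.faa" suffix simply mirrors the naming in the original script. Unlike the original, this version skips CDS features that have no translation qualifier instead of asserting on them; it still assumes every CDS has a locus_tag.

```python
import os
import sys


def output_name(gbk_path):
    """Derive an output name, e.g. 'NC_005213.gbk' -> 'NC_005213_converted.faa'."""
    base, _ext = os.path.splitext(gbk_path)
    return base + "_converted.faa"


def convert(gbk_path, faa_path):
    """Write the translation of each CDS feature in gbk_path to faa_path."""
    # Imported here so output_name() can be used without Biopython installed.
    from Bio import SeqIO

    # Both files are closed automatically when the with block ends.
    with open(gbk_path, "r") as input_handle, open(faa_path, "w") as output_handle:
        for seq_record in SeqIO.parse(input_handle, "genbank"):
            for seq_feature in seq_record.features:
                # Skip CDS features without a translation (an assumption of
                # this sketch; the original script asserts instead).
                if seq_feature.type == "CDS" and "translation" in seq_feature.qualifiers:
                    output_handle.write(">%s from %s\n%s\n" % (
                        seq_feature.qualifiers["locus_tag"][0],
                        seq_record.name,
                        seq_feature.qualifiers["translation"][0]))


def main(paths):
    """Convert every GenBank file given on the command line."""
    for gbk_path in paths:
        faa_path = output_name(gbk_path)
        convert(gbk_path, faa_path)
        print("Wrote %s" % faa_path)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved as gbk_to_faa.py, it could be run as python gbk_to_faa.py NC_005213.gbk another_genome.gbk (the second file name is just a placeholder), or with a shell wildcard such as *.gbk, and each input would get its own .faa file next to it.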