Hello!
I am trying to translate an RNA sequence into a protein using a dictionary. If a codon is not found in the dictionary, I use get() to return a fallback value instead. However, I get a KeyError when I run my code like this:
python3 translate_rna.py "AUGUNCGGU"
Could you tell me what is wrong? Thank you for your help.
from sys import argv

def translate_rna(mRNA):
    """Return a translated sequence from an mRNA sequence

    mRNA -- str, mRNA sequence
    """
    dict_amino_codons = {"UUU":"F", "UUC":"F", "UUA":"L", "UUG":"L",
                         "UCU":"S", "UCC":"S", "UCA":"S", "UCG":"S",
                         "UAU":"Y", "UAC":"Y", "UAA":"STOP", "UAG":"STOP",
                         "UGU":"C", "UGC":"C", "UGA":"STOP", "UGG":"W",
                         "CUU":"L", "CUC":"L", "CUA":"L", "CUG":"L",
                         "CCU":"P", "CCC":"P", "CCA":"P", "CCG":"P",
                         "CAU":"H", "CAC":"H", "CAA":"Q", "CAG":"Q",
                         "CGU":"R", "CGC":"R", "CGA":"R", "CGG":"R",
                         "AUU":"I", "AUC":"I", "AUA":"I", "AUG":"M",
                         "ACU":"T", "ACC":"T", "ACA":"T", "ACG":"T",
                         "AAU":"N", "AAC":"N", "AAA":"K", "AAG":"K",
                         "AGU":"S", "AGC":"S", "AGA":"R", "AGG":"R",
                         "GUU":"V", "GUC":"V", "GUA":"V", "GUG":"V",
                         "GCU":"A", "GCC":"A", "GCA":"A", "GCG":"A",
                         "GAU":"D", "GAC":"D", "GAA":"E", "GAG":"E",
                         "GGU":"G", "GGC":"G", "GGA":"G", "GGG":"G",}
    complete_protein_seq = "" #String of the complete protein sequence
    last_codon_start = len(mRNA) - 2 #Any sequence length is analyzed
    for letter in range(0, last_codon_start, 3):
        mrna_codon = mRNA[letter:letter+3]
        if dict_amino_codons[mrna_codon] == "STOP":
            break
        else:
            complete_protein_seq += dict_amino_codons.get(mrna_codon, 'X') #complete_protein_seq += dict_amino_codons[mrna_codon]
    print(complete_protein_seq)

if __name__ == '__main__':
    input_file = argv[1]
    translate_rna(input_file)

This is the error given:
Traceback (most recent call last):
  File "translate_rna.py", line 46, in <module>
    translate_rna(input_file)
  File "translate_rna.py", line 38, in translate_rna
    if dict_amino_codons[mrna_codon] == "STOP":
KeyError: 'UNC'

The problem is that you aren't accounting for the case where an unknown codon reaches your STOP check. The get() line isn't actually the line throwing the error.

When your code looks up UNC in the dictionary with dict_amino_codons[mrna_codon], it finds neither an amino acid nor a stop codon for that key. Unlike your .get() call, square-bracket indexing has no default value, so it raises a KeyError.

Basically, you just need to wrap if dict_amino_codons[mrna_codon] == "STOP" with some logic for handling ambiguous codons. This is probably a good use for a try/except block.

Thank you for your help. On the other hand, I don't understand why the default of get() is not working, since (copied from a website) get() "returns a value for the given key. If key is not available then returns default value".
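
To make the difference between the two kinds of lookup discussed above concrete, here is a minimal sketch. The two-entry codon_table is a hypothetical stand-in for the full dictionary in the question, used only to keep the example short.

```python
# Hypothetical two-entry stand-in for the full 64-codon table.
codon_table = {"AUG": "M", "UAA": "STOP"}

# .get() returns the supplied default when the key is absent.
print(codon_table.get("UNC", "X"))  # prints X

# Square-bracket indexing has no default and raises KeyError instead.
try:
    codon_table["UNC"]
except KeyError as err:
    print("KeyError:", err)  # prints KeyError: 'UNC'
```

So get() itself behaves as documented; the KeyError comes from whichever line uses square brackets on a missing key.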
How do you know get() isn't working? Your code is failing before it even gets to that line. As far as I can tell, the get(key, default) approach should work fine, but you have to address the upstream error.

Thank you for your help. I will correct it.
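
One possible way to address the upstream error, sketched under a few assumptions: the lookup is done once with get(), and the STOP check is then made on the returned value, so an unknown codon never reaches a square-bracket lookup. The codon_table and unknown parameters and the abbreviated demo_table are hypothetical, introduced only for this illustration; the original function builds the full table internally and prints instead of returning.

```python
def translate_rna(mrna, codon_table, unknown="X"):
    """Translate mrna codon by codon, stopping at the first stop codon."""
    protein = []
    # Visit complete codons only; one or two trailing bases are ignored.
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3]
        # Single lookup with a default, so unknown codons cannot raise KeyError.
        amino_acid = codon_table.get(codon, unknown)
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Abbreviated table for illustration; the real one has all 64 codons.
demo_table = {"AUG": "M", "GGU": "G", "UAA": "STOP"}
print(translate_rna("AUGUNCGGU", demo_table))  # prints MXG
```

A try/except KeyError around the original STOP check would work as well; the single get() lookup is just the shorter of the two options.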
Thank you for your help. However, I faced another problem described below. Hope you could help me out. Thanks