Thanks Pierre. That helps some. There is still something I must be missing though. You are right that ExtendedIUPACProtein uses J. So in that case, based on my example, WTA would be the corresponding codon. I still don't see where that gets mapped to J.
ExtendedIUPACDNA lists W as wyosine (I don't even know what that is... googling):
http://biopython.org/DIST/docs/api/Bio.Alphabet.IUPAC.ExtendedIUPACDNA-class.html
B = 5-bromouridine
D = 5,6-dihydrouridine
S = thiouridine
W = wyosine
But the "normal" DNA ambiguity codes are here in IUPACAmbiguousDNA:
letters = 'GATCRYWSMKHBVDN'
W traditionally codes for T or A.
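To make that concrete for myself, here is a small standalone sketch in plain Python. It does not use Biopython's API; the table is just the standard IUPAC nucleotide ambiguity codes typed in by hand, and `AMBIGUOUS_DNA` and `expand_codon` are names I made up. It only spells out which concrete codons an ambiguous codon such as WTA stands for:

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (hand-typed, not read from Biopython).
AMBIGUOUS_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "M": "AC", "K": "GT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT",
    "N": "ACGT",
}


def expand_codon(codon):
    """Return every unambiguous codon that an ambiguous codon can represent."""
    choices = [AMBIGUOUS_DNA[base] for base in codon.upper()]
    return ["".join(bases) for bases in product(*choices)]


print(expand_codon("WTA"))
```

So under these codes WTA is shorthand for the two codons ATA and TTA.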
If I get more specific in my example and specify an alphabet:
>>> from Bio.Seq import Seq
>>> from Bio.Alphabet.IUPAC import *
>>> c = Seq('ARAWTAGKAMTA', IUPACAmbiguousDNA)
>>> c.translate().tostring()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mark/Downloads/biopython-1.54/build/lib.macosx-10.6-universal-2.6/Bio/Seq.py", line 930, in translate
  File "/Users/mark/Downloads/biopython-1.54/build/lib.macosx-10.6-universal-2.6/Bio/Alphabet/__init__.py", line 213, in _get_base_alphabet
AssertionError: Invalid alphabet found, <class Bio.Alphabet.IUPAC.IUPACAmbiguousDNA at 0x10057c230>
Bad things happen. So I am still not quite understanding the ins and outs of this. The translate call goes to CodonTable, where 1) I still don't see a J and 2) I don't understand this new error.
Thanks!
Mark
Thanks Pierre, please see my followup -Mark