I'm trying to find motif occurrences in the upstream sequence:
upstream = Seq(mysequence, IUPAC.unambiguous_dna)
pssm.calculate(upstream)
I get this error:
ValueError: Wrong alphabet! Use only with DNA motifs
If I check the alphabets:
>>> pssm.alphabet
IUPACUnambiguousDNA()
>>> upstream.alphabet
IUPACUnambiguousDNA()
The source code where it fails:
if self.alphabet!=IUPAC.unambiguous_dna:
    raise ValueError("Wrong alphabet! Use only with DNA motifs")
What could I be doing wrong?
What's really the difference between IUPACUnambiguousDNA() and IUPAC.unambiguous_dna?
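To illustrate the question above, here is a minimal sketch of why two objects that both print as IUPACUnambiguousDNA() can still compare as unequal. It does not import Biopython: the class and the module-level instance below are hypothetical stand-ins that only mimic the names, on the assumption that the real alphabet class defines no __eq__ of its own, so != falls back to object identity.

```python
class IUPACUnambiguousDNA:
    """Hypothetical stand-in for the Biopython alphabet class.

    Assumption: like the real class at the time, it defines no __eq__,
    so == and != compare object identity.
    """
    letters = "GATC"

    def __repr__(self):
        return "IUPACUnambiguousDNA()"


# Stand-in for the shared module-level instance IUPAC.unambiguous_dna.
unambiguous_dna = IUPACUnambiguousDNA()

# A second, separately created instance of the same class.
other = IUPACUnambiguousDNA()

# Both print identically, which is why the interactive check looks fine.
print(repr(other), repr(unambiguous_dna))

# The shared instance never differs from itself.
print(unambiguous_dna != unambiguous_dna)

# A different instance of the same class does differ under identity comparison.
print(other != unambiguous_dna)

# A class-based check treats both as the same kind of alphabet.
print(isinstance(other, IUPACUnambiguousDNA))
```

So IUPACUnambiguousDNA is the class, while IUPAC.unambiguous_dna is one particular instance of it, and a check written with != only passes for that exact instance.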
I agree with your diagnosis and the suggested workaround, Istvan. We'll try to fix this: http://lists.open-bio.org/pipermail/biopython-dev/2013-June/010639.html - Thanks!
Indeed
I'm not sure I'm using the alphabet in the constructor properly:
I also tried this, and it didn't work:
Previously it was:
The likely reason is that I create pssm in one run, cPickle it, and then unpickle it when I need to apply it. It works fine if I create the pssm in the same run as I apply it, but constructing the matrix takes time, so I'd like to figure out how to pickle it, if that's possible at all.
Pickling and unpickling will result in a new instance of the alphabet class, and you'll hit the bug again. You'll have to wait for the next Biopython release where this is fixed, or apply the fix locally. See the mailing list discussion linked to above.
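The pickling effect can be reproduced without Biopython. The sketch below uses hypothetical stand-in classes (IUPACUnambiguousDNA, Matrix) and Python 3's pickle module rather than cPickle. It assumes the alphabet class defines no __eq__, and the last step (pointing the restored object back at the shared instance) is only one possible local workaround, not the fix discussed on the mailing list.

```python
import pickle


class IUPACUnambiguousDNA:
    """Hypothetical stand-in for the alphabet class (no __eq__ defined)."""
    letters = "GATC"


# Stand-in for the shared module-level instance IUPAC.unambiguous_dna.
unambiguous_dna = IUPACUnambiguousDNA()


class Matrix:
    """Hypothetical stand-in for the pssm object; it only stores an alphabet."""

    def __init__(self, alphabet):
        self.alphabet = alphabet


pssm = Matrix(unambiguous_dna)

# Before pickling, the matrix holds the shared instance itself.
print(pssm.alphabet is unambiguous_dna)

# Round trip through pickle, as when saving in one run and loading in another.
restored = pickle.loads(pickle.dumps(pssm))

# Unpickling built a fresh alphabet object, so the identity-based check fails.
print(restored.alphabet != unambiguous_dna)

# The restored alphabet is still an instance of the right class.
print(isinstance(restored.alphabet, IUPACUnambiguousDNA))

# Possible local workaround (an assumption): reattach the shared instance.
restored.alphabet = unambiguous_dna
print(restored.alphabet != unambiguous_dna)
```

Whether the real pssm object lets you reassign its alphabet like this depends on the Biopython version, so treat the last step as an idea to try rather than a guaranteed fix.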