I have a data set with the following:
Accession Sequence
CC123456 XNFGYXLAKKK
CC123457 XNFGYXKAKKK
I want to search this data set for the motif XLAK and output a list of [Accession(String), Present(Boolean)] pairs. However, the documentation for the Bio.motifs module doesn't give me enough information to write this.
Sample code that I have thus far looks as follows:
import Bio.motifs as motifs
from Bio.Seq import Seq
motif = motifs.Motif()
motif.add_instance(Seq('XLAK'))
After that line, I get an error saying "Motif object has no attribute 'add_instance'".
The plan is to loop over a list of [Accession(String), Sequence(Seq)] pairs, search each Sequence for that motif, and then output a separate list as mentioned above. Can anybody provide a starting point for doing this, please?
Maybe this will work:
from Bio.Motif import Motif
m = Motif()
You have to add the sequence as a Seq object, i.e.
from Bio.Seq import Seq
You may also have to specify the alphabet:
from Bio.Alphabet import IUPAC
As per the manual.
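If the motif is a single fixed string, a plain substring test may be enough as a starting point, without building a Motif object at all. The following is only a minimal sketch under that assumption: it treats X as a literal character rather than a wildcard, and the names `find_motif` and `records` are made up for illustration (they are not part of Biopython).

```python
def find_motif(records, motif):
    """Return [accession, present] pairs for [accession, sequence] pairs."""
    results = []
    for accession, sequence in records:
        # str() lets this accept plain strings as well as Bio.Seq.Seq objects.
        # Assumption: the motif is matched literally, so 'X' is not a wildcard.
        results.append([accession, motif in str(sequence)])
    return results


# The two example records from the question.
records = [
    ["CC123456", "XNFGYXLAKKK"],
    ["CC123457", "XNFGYXKAKKK"],
]

print(find_motif(records, "XLAK"))
# prints [['CC123456', True], ['CC123457', False]]
```

If X is meant to stand for any residue, a literal substring test will not do what you want, and you would need a pattern-based search instead.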