I have a few hundred yeast sequences (20-80 bp long) and I want to find common motifs (conserved bases at certain indices) in them. I am using a Mac.
Like others, I would recommend MEME; however, Weeder is also very easy to install and run on a Mac. It is word-based and runs a lot faster than MEME, with far fewer options.
If you would like something to run specifically on your Mac you could try iMotifs, which incorporates the NestedMICA algorithm for discovering over-represented motifs.
For a small set of sequences, such as yours, you could also use MEME, Weeder or others online without installing them locally. Finally, the Regulatory Sequence Analysis Tools (RSAT) (http://rsat.ulb.ac.be/rsat/ or http://rsat.ccb.sickkids.ca/) are a great place to start for pattern matching and discovery.
In case it is of any use I have put up a short presentation about DNA motif finding from a course I gave earlier this year. You can find it here.
You can also use MEME: http://meme.sdsc.edu/.
Some time ago I used SOMBRERO with a good degree of success to find motifs in a very diverse set of sequences. They have a Mac version for download as well as parallel versions for Irix and Linux.
The first step when looking for conservation of single bases or motifs is often a multiple sequence alignment, which aligns the sequences so that conserved regions are most visible. This can be a first step before using explicit motif finders like MEME. A good way of visualizing multiple alignments is the sequence logo, which gives a graphical representation of base conservation at each position.
Here is the wikipedia list of multiple-sequence alignment tools.
I recommend starting with the EBI web server for ClustalW; if that is not enough, you can also try MAFFT or T-Coffee.
WebLogo can generate sequence-logo graphics from the alignment output and also directly from FASTA input.
The advantage of these tools is that you don't need to install them, so they are good for a first attempt whether or not you are using a Mac.
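To make "base conservation" concrete, here is a minimal Python sketch of the per-column information content that a sequence logo plots. It assumes the sequences are already aligned to equal length, uses a plain four-letter DNA alphabet, and skips the small-sample correction that logo tools usually apply. The function name and the example sequences are illustrative only.

```python
import math
from collections import Counter


def column_information(seqs, alphabet="ACGT"):
    """Information content (bits) of each column of equal-length sequences.

    Assumption: sequences are already aligned; characters outside
    `alphabet` (e.g. gaps) are ignored when computing frequencies.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must all have the same (aligned) length")

    max_bits = math.log2(len(alphabet))  # 2 bits for DNA
    info = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts[b] for b in alphabet)
        if total == 0:
            # Column contains only gaps or unknown characters.
            info.append(0.0)
            continue
        entropy = 0.0
        for b in alphabet:
            if counts[b]:
                p = counts[b] / total
                entropy -= p * math.log2(p)
        info.append(max_bits - entropy)
    return info


# Illustrative, already-aligned toy input.
aligned = [
    "ACGGGCCCGACGATGCGTCGTA",
    "ACGTACGTCGAACCGTCGTCGT",
    "ACGTGCGTCGAAACGTCAGTCG",
    "ACGGGTTCGATCGTCGTCGTCG",
]
for pos, bits in enumerate(column_information(aligned)):
    print(pos, round(bits, 2))
```

Columns close to 2 bits are fully conserved, while columns near 0 bits are close to uniform. With only a handful of sequences these values are noisy, so treat them as a quick look rather than a statistical test.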
MEME was one of the earliest programs published for doing that.
As an alternative, you can try one of the EMBOSS tools; if you are scared of the terminal and want a web-based interface, you can use the EMBOSS tools from Galaxy.
You may check out these pages:
These are about 2 years old (links may no longer work, etc.), but they should be OK as a starting point. Also, in the unlikely case that you have not found it yet: an extensive motif search study in yeast was done by Kellis, with an insane number of citations:
Try this out? http://fraenkel.mit.edu/webmotifs/form.html
ACGGGCCCGACGATGCGTCGTA
ACGTACGTCGAACCGTCGTCGT
ACGTGCGTCGAAACGTCAGTCG
ACGGGTTCGATCGTCGTCGTCG
Maybe in Python: I would break the first sequence into sliding windows of the required motif length, and then search for each of those candidate motifs in the rest of the sequences with a regular expression, using the re.search() method.
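A rough sketch of that sliding-window idea, using the four sequences above. The function name `shared_kmers` and the choice of `k` are hypothetical. It only reports exact words from the first sequence that occur in every other sequence, so it will miss degenerate motifs that tools like MEME or Weeder are designed to find.

```python
import re


def shared_kmers(seqs, k):
    """Find k-mers of the first sequence that occur in all other sequences.

    Returns a dict mapping each shared k-mer to the position of its first
    match in each of the remaining sequences (in input order).
    """
    first, rest = seqs[0], seqs[1:]
    hits = {}
    # Slide a window of width k along the first sequence.
    for start in range(len(first) - k + 1):
        kmer = first[start:start + k]
        if kmer in hits:
            continue  # this word was already checked
        pattern = re.compile(re.escape(kmer))
        positions = []
        for seq in rest:
            match = pattern.search(seq)
            if match is None:
                break  # missing from one sequence, so not shared
            positions.append(match.start())
        else:
            # Loop finished without break: the k-mer is in every sequence.
            hits[kmer] = positions
    return hits


seqs = [
    "ACGGGCCCGACGATGCGTCGTA",
    "ACGTACGTCGAACCGTCGTCGT",
    "ACGTGCGTCGAAACGTCAGTCG",
    "ACGGGTTCGATCGTCGTCGTCG",
]
for kmer, positions in shared_kmers(seqs, 5).items():
    print(kmer, positions)
```

Because each pattern is a literal string, `kmer in seq` or `seq.find(kmer)` would work just as well here; `re` becomes useful once you want degenerate positions such as `CG[AT]CG`.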
I would recommend using MEME.
Post the Python code as well (put it in pre tags so that it is shown nicely formatted; see the help on the right).