Question

DNA Motif Discovery

0

12.1 years ago

k.nirmalraman ★ 1.1k

Dear Forum Members,

I am trying to discover potential motifs in 73k sequences, each 50 base pairs long, and I expect at least 10 motifs. However, I am not certain which motifs to expect to be enriched in my sequences.

I am afraid this would impact the motif discovery in MEME. Any suggestions on how to address this?

Thanks in advance!

meme motif • 3.7k views

ADD COMMENT • link updated 12.0 years ago by Ian 6.1k • written 12.1 years ago by k.nirmalraman ★ 1.1k

0

Why do you think that using the MEME suite is not appropriate?

ADD REPLY • link 12.1 years ago by Istvan Albert 102k

0

I am not saying it is not appropriate. I am wondering whether MEME would be okay with such a large data set, in which I expect 10 different motifs.

In other words, is it okay to do such a MEME run, and how can I improve the results?

ADD REPLY • link 12.1 years ago by k.nirmalraman ★ 1.1k

score 1 · Answer 1 · 2012-11-06

1

12.1 years ago

Asaf 10k

MEME has an option for discriminative motif finding: you can generate a set of equally sized background sequences and find motifs that are present in the test set but not in the background. This should improve the specificity of the MEME run.

Another issue is the search mode you use. MEME has 3 modes: zero or one occurrence per sequence (zoops), exactly one occurrence per sequence (oops), and any number of repetitions (anr). If you think the motif is not represented in every sequence, choose zoops mode.

You should also take the run size into account: the web MEME interface is limited to 60000 bp, so you should probably install MEME locally to run this job.
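As a rough illustration of the mode choice for a local run, here is a minimal Python sketch that only assembles a `meme` command line and does not execute it. The file names, the motif count, and the width limits are placeholder assumptions, and the flags (`-dna`, `-oc`, `-mod`, `-nmotifs`, `-minw`, `-maxw`, `-bfile`) should be checked against the documentation of the MEME version you installed.

```python
import shlex


def build_meme_command(fasta_path, out_dir, nmotifs=10, mode="zoops",
                       min_width=6, max_width=15, background_file=None):
    """Return the argument list for a local MEME run (nothing is executed)."""
    if mode not in {"oops", "zoops", "anr"}:
        raise ValueError(f"unknown MEME mode: {mode}")
    cmd = [
        "meme", fasta_path,
        "-dna",                      # nucleotide alphabet
        "-oc", out_dir,              # output directory, overwritten if present
        "-mod", mode,                # oops, zoops or anr
        "-nmotifs", str(nmotifs),    # how many motifs to report
        "-minw", str(min_width),     # assumed width limits, adjust as needed
        "-maxw", str(max_width),
    ]
    if background_file is not None:
        # Optional background model file
        cmd += ["-bfile", background_file]
    return cmd


# Hypothetical file names, for illustration only
cmd = build_meme_command("test_sequences.fasta", "meme_out", nmotifs=10)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` once the paths point at real files.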

ADD COMMENT • link 12.1 years ago by Asaf 10k

0

Hi, thanks very much for your suggestion.

I made a local installation of MEME and was able to run a sample test. Now I have some concerns on how to run the test.

I have about 7500 sequences, each about 50 bp long, for motif discovery. I am interested in motifs that are centered around the 15th and 40th positions of the sequence. I am expecting around 6 to 7 motifs.

Can you help me understand how to generate the background sequences and perform a comparative MEME run, so that I can find motifs in the test set with better accuracy?

Is there any significance to parameters like -bfile and -psp? How can I use them?

ADD REPLY • link 12.1 years ago by k.nirmalraman ★ 1.1k

0

The psp and bfile parameters allow you to direct MEME to the right motif (you should read about MEME to understand how it works and how these parameters influence the search). The background sequences should be as close as possible to the test sequences. If you used a script to generate the test sequences, try using the same script, but this time choose random starting points or random genes.
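To make the "random starting points" idea concrete, here is a minimal Python sketch. It assumes the test sequences were cut as fixed-length windows from a longer reference sequence; the toy reference, the window count, and the function names are made up for illustration, and a real script would read the reference from a FASTA file instead.

```python
import random


def sample_background_windows(reference, n_windows, window_length=50, seed=0):
    """Cut fixed-length windows at random start positions of a reference."""
    if len(reference) < window_length:
        raise ValueError("reference is shorter than the requested window length")
    rng = random.Random(seed)  # seeded so the background set is reproducible
    last_start = len(reference) - window_length
    windows = []
    for _ in range(n_windows):
        start = rng.randint(0, last_start)  # inclusive on both ends
        windows.append(reference[start:start + window_length])
    return windows


def to_fasta(sequences, prefix="background"):
    """Format sequences as FASTA text with numbered headers."""
    records = [f">{prefix}_{i}\n{seq}" for i, seq in enumerate(sequences, 1)]
    return "\n".join(records) + "\n"


# Toy random reference standing in for a real genome or gene set
toy_rng = random.Random(1)
reference = "".join(toy_rng.choice("ACGT") for _ in range(1000))

background = sample_background_windows(reference, n_windows=5)
print(to_fasta(background))
```

Matching the window length and the number of windows to the test set keeps the two sets comparable in size, which is what the discriminative comparison relies on.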

ADD REPLY • link 12.1 years ago by Asaf 10k

score 0 · Answer 2 · 2012-11-07

0

12.1 years ago

Ian 6.1k

You could try Weeder (it currently appears to be offline). The stand-alone version can take a much larger number of sequences and has a reasonable model for genome background.

EDIT: The Weeder website was updated without the author's knowledge. It is now operational and accessible.

ADD COMMENT • link 12.0 years ago by Ian 6.1k

0

Weeder has been offline indefinitely since then :(

ADD REPLY • link 12.0 years ago by k.nirmalraman ★ 1.1k

0

Weeder is now accessible :)

ADD REPLY • link 12.0 years ago by Ian 6.1k

0

Oh great!! Thank you!! Will check that out! :)

ADD REPLY • link 12.0 years ago by k.nirmalraman ★ 1.1k