Dear all,
I ran MEME using the command meme fasta_file -dna -w 20 -o output. In the output folder, the meme.html file shows that no motifs were discovered. However, the meme.txt file clearly shows many motifs with very low p-values, and the output folder also contains a figure illustrating the top motif. What's the problem with my command?
Thank you very much! I appreciate any of your comments.
Hi eromasko, thank you very much for your detailed answer! Your description is really helpful. I missed the -nmotifs option. Now I think I can handle MEME. I just have one more question: you say "significant", so how do I evaluate significance (e.g. what is the p-value cutoff)? Thanks again.
Actually, I should have said E-value earlier instead of p-value, as that is what MEME returns. The significance cutoff is still a value less than 0.05. Here is a short description of the E-value in the HTML output of MEME (you can also find more information in the manual at the link I posted earlier):
Hi eromasko, thank you very much for your guidance. I understand that the E-value is important. I am trying MEME now with different settings. Although I am still having problems (for example, if the dataset is large, I need to set the -maxsize option), let me give it a try. THANKS!
I have some experience with semi-large datasets and have used -maxsize up to 500000. It starts to get really computationally intensive and time-consuming. For example, I had datasets of almost 500 sequences that were each 1000 nt long, and it starts to drag on, especially when you start considering multiple options like -nmotifs, -w, -minw, -maxw and -mod anr. To make my life easier, I started writing simple bash shell scripts to run the many iterations of commands and options overnight and during long stretches of time, so I wouldn't have to be there to start the next command when the previous one finished. Hopefully that helps if you aren't already doing something similar. Good luck!
Yes, I agree with testing different options using a shell script to see which combinations are desirable. The options you provided are helpful to me. Thanks a lot! --Biolab
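For anyone finding this thread later, the batch approach described above might look roughly like the sketch below. It is only an illustration, not the script actually used in this thread: the input file name (sequences.fasta), the option values being looped over, the -minw/-maxw range and the output directory naming are all assumptions. The flags themselves are the ones mentioned in the posts above; please check with your installed MEME version that each one is accepted. By default the script only prints the commands so you can inspect them first.

```shell
#!/bin/sh
# Sketch: run MEME once per combination of -nmotifs and -mod, one after another.
# Assumed for illustration (not from the thread): file name, option values,
# width range, and output directory names.

fasta="${FASTA:-sequences.fasta}"   # hypothetical input file
dry_run="${DRY_RUN:-1}"             # 1 = only print the commands, 0 = run them

run_meme_grid() {
    for nmotifs in 3 5 10; do
        for mod in zoops anr; do
            outdir="meme_n${nmotifs}_${mod}"
            # Store the command in the positional parameters so that
            # each argument stays a separate, correctly quoted word.
            set -- meme "$fasta" -dna -nmotifs "$nmotifs" -mod "$mod" \
                   -minw 6 -maxw 20 -maxsize 500000 -o "$outdir"
            if [ "$dry_run" = "1" ]; then
                echo "$*"
            else
                # The next run starts only after this one has finished.
                "$@" > "${outdir}.log" 2>&1
            fi
        done
    done
}

run_meme_grid
```

Saved as, say, run_meme.sh, it could be started for an overnight run with something like DRY_RUN=0 FASTA=my.fasta nohup sh run_meme.sh & so that it keeps going after you log out. Each run writes to its own output directory and log file, which makes it easier to compare the E-values of the different settings afterwards.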