7.9 years ago
arunprasanna83
Hello,
I am trying to run motif discovery on ~22,000 promoter sequences, each 1000 bp long. I have a local installation of MEME. I tried setting -maxw to 23000000, but MEME exits without any warning. Can you suggest how to run MEME on these sequences? Also, can anyone explain how to choose the -nmotifs value?
Thanks in advance,
AP
That length is very large and wouldn't really give you any usable information anyway. I suggest you cut the sequences down to about -50/+50 around the TSS, or -100/+100, or something much smaller.
You may also give DREME a shot since it's supposed to be used on larger datasets. HOMER has a function to find motifs at promoters as well that is generally pretty good.
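For the trimming suggested above, here is a minimal Python sketch. It assumes the sequences are in FASTA format and that the TSS sits at a known index in every sequence. The function names and the `tss_index` value are my own placeholders, not anything from MEME, so adjust them to however your promoters were extracted.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def trim_around_tss(seq, tss_index, upstream=50, downstream=50):
    """Return the window [tss_index - upstream, tss_index + downstream).

    tss_index is an ASSUMED 0-based position of the TSS within seq.
    The window is clipped at the sequence ends.
    """
    start = max(0, tss_index - upstream)
    end = min(len(seq), tss_index + downstream)
    return seq[start:end]


def write_fasta(records, width=60):
    """Format (header, sequence) tuples back into FASTA text."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"
```

You would read your file, apply `trim_around_tss` to each sequence, and write the result to a new FASTA file to give to MEME. If your 1000 bp are all upstream of the TSS, the TSS index would be at or near the end of each sequence and the downstream part of the window will simply be clipped.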
Hi, thanks for your reply. What if I divide the 22k sequences into 22 batches of 1000 sequences each? Will the predictions differ between batches because of differences in background sequence composition? Also, how do people decide on -nmotifs, i.e. the number of motifs to find in the sequences? In the literature people use values of 3 or 5, but I can't find the logic behind them.
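To make the batching idea concrete, here is a rough Python sketch that splits the sequences into batches and compares the base composition of each batch with that of the whole set. The function names are hypothetical, and `records` is assumed to be a list of (header, sequence) tuples already loaded from the FASTA file. This only shows how much composition varies between batches; it does not tell you how MEME's output would change.

```python
from collections import Counter


def split_into_batches(records, batch_size=1000):
    """Split a list of records into consecutive batches of batch_size."""
    return [records[i:i + batch_size]
            for i in range(0, len(records), batch_size)]


def base_frequencies(records):
    """Return the fraction of A, C, G and T over all sequences in records.

    Characters other than A/C/G/T (e.g. N) are ignored.
    """
    counts = Counter()
    for _, seq in records:
        counts.update(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return {base: 0.0 for base in "ACGT"}
    return {base: counts[base] / total for base in "ACGT"}


def composition_shift(records, batch_size=1000):
    """For each batch, return the largest absolute difference between its
    base frequencies and those of the full set."""
    overall = base_frequencies(records)
    shifts = []
    for batch in split_into_batches(records, batch_size):
        freqs = base_frequencies(batch)
        shifts.append(max(abs(freqs[b] - overall[b]) for b in "ACGT"))
    return shifts
```

If the shifts are not negligible, one option is to build a single background model from all 22k sequences and give the same one to every batch. MEME has a -bfile option for supplying a background model file, so each run would at least share the same background.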