Hello,
this is potentially a very broad (and maybe somewhat subjective) question; if you prefer, I am happy for it to be treated as a community wiki. I am searching for transcription factor binding sites (TFBSs) in a list of putative regulatory sequences from the Xenopus genome, using FIMO from the MEME Suite. The problem is that, despite strong support for the presence of the TFBSs from our wet-lab experience, I am not able to obtain FIMO q-values below the desired threshold (0.05).
In general, what are good strategies to strengthen the validity of the data used in such experiments?
What comes to my mind is:
a) use data with a robust biological background;
b) use sequences that are as short as possible;
c) carefully mask undesired signals;
d) use as robust a PSWM (position-specific weight matrix) as possible;
e) set the algorithm's parameters properly.
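To illustrate why point (b) matters for q-values, here is a minimal sketch. It assumes the q-values are derived from the per-position p-values with a Benjamini-Hochberg style correction; the function name `bh_qvalues`, the `n_tests` parameter and the numbers are made up for illustration and are not FIMO's own code or output. The idea is that the same p-value gives a very different q-value depending on how many positions were scanned.

```python
def bh_qvalues(pvalues, n_tests=None):
    """Benjamini-Hochberg q-values for a list of p-values.

    n_tests is the total number of positions scanned (hypothetical
    parameter). It defaults to len(pvalues); when it is larger, the
    unlisted tests are assumed to have larger p-values than the listed ones.
    """
    m = n_tests if n_tests is not None else len(pvalues)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    qvals = [0.0] * len(pvalues)
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(len(order), 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvals[i] = running_min
    return qvals


# Same best hit (p = 1e-5), two illustrative search-space sizes.
best_p = [1e-5]
for n_positions in (4_000, 400_000):
    q = bh_qvalues(best_p, n_tests=n_positions)[0]
    print(f"positions scanned: {n_positions:>7}  q-value: {q:.3g}")
```

Under this assumption, trimming the sequences to the region where the site is expected, or scanning fewer motifs, reduces the number of tests and so lowers the q-value for the same match.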
Are you doing a de novo search for the motifs (i.e. are you trying to define the motifs using the sequences) or are you scanning for binding sites using a pre-defined set of motifs?
FIMO scans for binding sites using a pre-defined set of motifs.
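For readers unfamiliar with this kind of scan, the general idea can be sketched as follows. This is a toy log-odds scan on one strand, not FIMO's implementation; the function names, the pseudocount, the uniform background and the count matrix are all assumptions chosen for illustration.

```python
import math


def log_odds_matrix(counts, background, pseudocount=1.0):
    """Turn per-position base counts into log2-odds scores.

    counts: list of dicts mapping base -> count, one dict per motif column.
    background: dict mapping base -> background frequency.
    """
    pwm = []
    for col in counts:
        total = sum(col.values()) + pseudocount
        pwm.append({
            base: math.log2(
                ((col.get(base, 0) + pseudocount * background[base]) / total)
                / background[base]
            )
            for base in background
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence (forward strand only)."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits


# Toy motif (TATA) and a uniform background, both hypothetical.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]
pwm = log_odds_matrix(counts, background)
for start, window, score in scan("GGTATAGG", pwm):
    print(start, window, round(score, 2))
```

In a real scan each score is then converted to a p-value under the background model, which is why the choice of background and the quality of the matrix both affect the final q-values.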
I suspect the specific software used to perform the search and/or derive the motifs would also have an effect here.
I'm not sure whether any of the JASPAR or TransFac folks are on BioStar, but I suspect they would have suggestions relating to this. You may want to consider contacting them to ask for advice (and mention this question, of course).
Thank you, I will try to get in contact with some of them and mention this question. If I find any useful information, I will report it here.