I have performed an enrichment analysis over the promoters of genes that are differentially expressed between two conditions, and there were highly significant results. However, I have seen that random promoters also give significant enrichment results.
Hence, I want to double-check that the results I get differ from the enrichment signature of random promoters.
To get a random list of promoters:
I downloaded the list of all human genes (GRCh38.p14) from BioMart. From this list, I took a subset of 2000 genes (simply the first 2000 genes in alphabetical order).
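One caveat on this step: the first 2000 genes in alphabetical order are not a uniform random draw, since members of a gene family often share a name prefix. A minimal sketch of drawing the subset at random instead, assuming the BioMart export has been read into a plain list of gene names (the file name in the usage comment is hypothetical):

```python
import random


def sample_genes(genes, k=2000, seed=0):
    """Return up to k distinct genes drawn uniformly without replacement."""
    # De-duplicate, then sort so that a given seed always yields the same subset.
    unique = sorted(set(genes))
    rng = random.Random(seed)
    return rng.sample(unique, min(k, len(unique)))


# Usage (hypothetical file with one gene name per line):
# with open("biomart_genes.txt") as fh:
#     genes = [line.strip() for line in fh if line.strip()]
# subset = sample_genes(genes)
```

Repeating the draw with a few different seeds would also show how much the "random" enrichment signature varies from one subset to the next.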
I have selected the promoters for these 2000 genes via EPD, with default settings (no options selected).
https://epd.expasy.org/epd/EPDnew_select.php
3264 promoters were selected, and I exported a FASTA file covering -1000 to +100 relative to the TSS.
This was uploaded to SEA (Simple Enrichment Analysis) from the MEME Suite.
The results are here https://meme-suite.org/meme//opal-jobs/appSEA_5.5.517041990574241307765695/sea.html
Upon visual inspection, there is high concordance between the motifs found for my list and those found for the random list.
** If you know the tools, are the default parameter selections reasonable?
** How do I reasonably decide which motifs enriched in my own data are valid, and which are no better than the enrichment results for random genes?
As others have commented, this is an impressive example of why backgrounds are critical in enrichment analysis. You're currently testing promoter vs genome, which of course primarily returns bona fide promoter motifs. Here, as background, you could use promoters of genes with good evidence of not being differential.
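A minimal sketch of how you could pick such a background from your differential expression table. The keys `gene`, `padj` and `log2fc` and all three thresholds are assumptions for illustration, not values taken from your analysis; adapt them to whatever your DE tool actually outputs.

```python
import math


def split_genes(rows, padj_de=0.05, padj_bg=0.5, lfc_bg=0.25):
    """Split DE results into differential genes and a non-differential background.

    rows: iterable of dicts with the hypothetical keys 'gene', 'padj', 'log2fc'.
    Thresholds are illustrative assumptions only.
    """
    de, background = [], []
    for row in rows:
        try:
            padj = float(row["padj"])
            lfc = float(row["log2fc"])
        except (KeyError, TypeError, ValueError):
            continue  # no usable test result (e.g. 'NA'): use in neither set
        if math.isnan(padj) or math.isnan(lfc):
            continue
        if padj < padj_de:
            de.append(row["gene"])
        elif padj > padj_bg and abs(lfc) < lfc_bg:
            # Tested, clearly not significant, and with a small effect size.
            background.append(row["gene"])
    return de, background
```

Genes that fall between the two thresholds are deliberately left out of both sets. You would then extract the same -1000 to +100 window for both gene lists and give the background FASTA to SEA as the control sequences; on the command line I believe that is the `--n` option (`sea --p de_promoters.fa --n background_promoters.fa --m motifs.meme`, file names hypothetical), but do check the SEA documentation for your version.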