I finally contacted the author directly; here is his answer (reproduced with his permission):
Peak-motifs combines various approaches to detect exceptional motifs.
For oligo-analysis, dyad-analysis, and local-word-occurrences, the significance indeed results from a binomial test. The details of the method are described in the first article presenting oligo-analysis:
van Helden et al., J Mol Biol. 1998 Sep 4;281(5):827-42. http://www.ncbi.nlm.nih.gov/pubmed/9719638
For position-analysis, the significance is computed with a chi-squared test, as described in
van Helden et al. Nucleic Acids Res. 2000 Feb 15;28(4):1000-10. http://www.ncbi.nlm.nih.gov/pubmed/10648794
I summarize below the principle for oligo-analysis, but most concepts are similar in the other programs.
1) We estimate, for each word, a prior probability, i.e. the probability of finding this particular word at a given position of the sequence.
In the 1998 publication, we called this prior probability the "expected frequency", and it was estimated by measuring the frequency of the same word in a reference sequence set (e.g. all upstream sequences of the organism of interest). For peak-motifs, it is not easy to define a suitable reference set, so the prior probability is instead computed using a Markov model whose transition probabilities are estimated from the peak sequences themselves.
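As a rough illustration of this background-model idea (not the actual peak-motifs implementation, which supports higher Markov orders), here is a minimal sketch of an order-1 Markov model estimated from the input sequences, used to compute a word's prior probability. Function names and the restriction to a DNA alphabet are assumptions for the example:

```python
from collections import defaultdict

def markov1_model(sequences):
    """Estimate order-1 Markov transition probabilities and
    single-nucleotide frequencies from a set of DNA sequences."""
    pair_counts = defaultdict(int)
    base_counts = defaultdict(int)
    for seq in sequences:
        for i in range(len(seq) - 1):
            pair_counts[seq[i], seq[i + 1]] += 1
            base_counts[seq[i]] += 1
        base_counts[seq[-1]] += 1  # count the final residue too
    total = sum(base_counts.values())
    prior = {b: n / total for b, n in base_counts.items()}
    trans = {}
    for (a, b), n in pair_counts.items():
        denom = sum(pair_counts[a, c] for c in "ACGT" if (a, c) in pair_counts)
        trans[a, b] = n / denom
    return prior, trans

def word_prior(word, prior, trans):
    """Prior probability of observing `word` at a given position,
    under the order-1 Markov background model."""
    p = prior[word[0]]
    for a, b in zip(word, word[1:]):
        p *= trans[a, b]
    return p
```

For example, with the toy input `["AACC"]`, `word_prior("AC", ...)` is P(A) * P(C|A) = 0.5 * 0.5 = 0.25.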
2) We use the binomial distribution to compute the p-value of a word, i.e. the probability of observing at least x occurrences of this word.
This is the so-called "nominal p-value", i.e. the p-value associated with this particular test. It estimates the false positive risk (FPR), i.e. the risk of considering a word significant when it is not.
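The binomial upper tail described above can be sketched in a few lines. This is a plain restatement of the formula, not the peak-motifs code; here n stands for the number of positions at which the word could occur and p for the word's prior probability (both assumptions about how the counts are set up):

```python
from math import comb

def binomial_pvalue(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p): the probability of observing
    at least x occurrences of a word with prior probability p
    across n candidate positions."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))
```

For instance, `binomial_pvalue(0, n, p)` is always 1 (we always see at least zero occurrences), and the p-value shrinks as the observed count x grows.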
3) However, we have to take into account that each analysis consists of several thousand tests. If we set too permissive a threshold on the p-value, the false positive risk is incurred many times over. For example, if we analyze heptanucleotides, a single run of oligo-analysis tests the over-representation of T = 4^7 = 16,384 words. With a p-value threshold of 0.01, we expect 16,384 * 0.01 ≈ 164 words to pass by chance (this can be checked empirically with random sequences).
We thus have to apply a correction for multiple testing.
For this, we compute an e-value, i.e. the expected number of false positives corresponding to this p-value:
e-value = p-value * nb.words
4) The significance index is simply a log-transformation of the e-value:
sig = -log10(e-value)
The default threshold on significance is 0, i.e. we accept, on average, 1 word selected as a false positive per analysis.
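Putting steps 3 and 4 together, the correction is a one-liner; the example below reuses the heptanucleotide numbers from the text (p-value threshold 0.01, 4^7 tested words). The function name is made up for illustration:

```python
from math import log10

def evalue_and_sig(pvalue, nb_words):
    """E-value: expected number of false positives among nb_words tests.
    Sig: its -log10, so sig >= 0 means an e-value <= 1."""
    evalue = pvalue * nb_words
    sig = -log10(evalue)
    return evalue, sig

# Heptanucleotide example from the text: 0.01 * 4^7 = ~164 expected FPs,
# so a nominal p-value of 0.01 is far from significant here (sig < 0).
evalue, sig = evalue_and_sig(0.01, 4**7)
```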
The interpretation of the significance is relatively intuitive:
with a sig >= 0, we expect 1 FP per analysis
with a sig >= 1, we expect 1 FP per 10 analyses
with a sig >= 5, we expect 1 FP every 10^5 analyses
...
In the peak-motifs report, we display only the significance score, but if you click the "Discovered words [text]" links in the section "Discovered motifs (per algorithms)", you get the full details (p-value, e-value, sig) for each word.