You bring up a nice question about the -cpg option (in HOMER).
I think there are some common GC-rich motifs that HOMER tries to place less emphasis on. However, you specifically want to know about the CpG sites within the region (and sequence outside CpG motifs would not be as relevant).
If you are going to try and do some motif analysis, I think you would need to be able to find differential methylation across CpG-overlapping motifs at multiple genes.
So, you probably need to get site-centric transcription factor binding information. I think ENCODE would be a popular choice, but I can't precisely vouch for its use in this application.
If you have that, I can think of at least two possibilities (methods-wise):
1) I believe methylSig performs such a test with ENCODE Motifs (section 6.3 in the linked PDF)
2) If you use an annotation-based strategy that doesn't take proximity into consideration, you could also do something similar (with your own set of annotations). Again, I don't have a success story about defining the "region" as CpG sites within motifs in different genes, but you can use the default COHCAP region analysis (without re-defining boundaries) to test this sort of thing (with the extra effort of defining your own "custom" annotation with platform="custom" in COHCAP.annotate()).
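To make the idea of treating "CpG sites within a motif, across different genes" as one region a bit more concrete, here is a minimal Python sketch of just the grouping step. It is not part of COHCAP or methylSig: the function name, the tuple layouts, and the toy coordinates are all made up for illustration, and you would still need to reshape the result into whatever custom annotation format your tool expects.

```python
from collections import defaultdict


def cpgs_in_motifs(cpg_sites, motif_hits):
    """Group CpG positions by the motif whose hit interval contains them.

    cpg_sites:  iterable of (chrom, pos) tuples (hypothetical layout)
    motif_hits: iterable of (chrom, start, end, motif_name) tuples,
                with half-open intervals [start, end) (hypothetical layout)
    """
    # Index CpG positions by chromosome so each hit only scans its own chromosome.
    by_chrom = defaultdict(list)
    for chrom, pos in cpg_sites:
        by_chrom[chrom].append(pos)

    # Collect every CpG that falls inside any hit of a given motif,
    # regardless of which gene or locus the hit is in.
    grouped = defaultdict(list)
    for chrom, start, end, name in motif_hits:
        for pos in by_chrom.get(chrom, []):
            if start <= pos < end:
                grouped[name].append((chrom, pos))
    return dict(grouped)


# Toy data (invented coordinates, not real ENCODE motif calls).
cpgs = [("chr1", 105), ("chr1", 500), ("chr2", 42), ("chr2", 900)]
hits = [
    ("chr1", 100, 112, "MotifA"),
    ("chr2", 40, 50, "MotifA"),
    ("chr2", 890, 905, "MotifB"),
]
groups = cpgs_in_motifs(cpgs, hits)
```

Each key in `groups` then plays the role of one "region" made up of CpG sites from multiple loci. The nested loop is fine for a toy example; for genome-scale data you would probably want a proper interval-overlap tool instead.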
I have the same question, as I am trying to develop a workflow that goes from DMR multiple sequence alignment and clustering to de novo motif discovery and enrichment.