I have an answer posted on the Bioinformatics SE that suggests how to use FIMO to scan for TFBS over a genome:
https://bioinformatics.stackexchange.com/a/2491/776
This example uses UCSC to retrieve sequence information, JASPAR for MEME-formatted motif models, and BEDOPS starch to compress the results (which will be sizeable).
Once you have your whole-genome set of TFBSs, you can use set operations to look for binding sites within regions of interest, e.g. proximal promoters that could be defined as a given window upstream of each stranded gene's TSS.
Say you have generated your TFBS in a sorted BED5 or BED5+ file called TFBSs.bed, and your gene TSSs are in TSSs.for.bed and TSSs.rev.bed, separated by strand.
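If you do not already have strand-separated TSS files, here is a minimal sketch of one way to derive them. It assumes a BED6 gene annotation with the strand in column 6. The file genes.toy.bed and its contents are made up for illustration. Plain sort is used only so the sketch runs without BEDOPS installed; with BEDOPS you would normally use sort-bed instead.

```shell
# Toy BED6 gene annotation (hypothetical file name and contents).
printf 'chr1\t1000\t5000\tgeneA\t0\t+\n'  > genes.toy.bed
printf 'chr1\t7000\t9000\tgeneB\t0\t-\n' >> genes.toy.bed
printf 'chr2\t200\t800\tgeneC\t0\t+\n'   >> genes.toy.bed

# Forward strand: the TSS is the first base of the gene (start, start+1).
awk 'BEGIN{FS=OFS="\t"} $6=="+" {print $1, $2, $2+1, $4, $5, $6}' genes.toy.bed \
    | LC_ALL=C sort -k1,1 -k2,2n -k3,3n > TSSs.for.bed

# Reverse strand: the TSS is the last base of the gene (end-1, end).
awk 'BEGIN{FS=OFS="\t"} $6=="-" {print $1, $3-1, $3, $4, $5, $6}' genes.toy.bed \
    | LC_ALL=C sort -k1,1 -k2,2n -k3,3n > TSSs.rev.bed
```

Each output line is a single-base BED6 interval, so the strand-specific windows can then be built around it.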
You can then use BEDOPS bedops with bedmap to find TFBS in proximal promoters, e.g. for forward-stranded TSSs:
bedops --everything --range -100000:0 TSSs.for.bed \
| bedmap --echo --echo-map-id-uniq --delim '\t' - TFBSs.bed \
> answer.for.bed
The file answer.for.bed will have your forward-stranded TSS windows (proximal promoters), each followed by a listing of the unique names of the motif models associated with TFBSs that overlap that promoter.
For reverse-stranded gene TSSs, you just change the --range argument:
bedops --everything --range 0:100000 TSSs.rev.bed \
| bedmap --echo --echo-map-id-uniq --delim '\t' - TFBSs.bed \
> answer.rev.bed
If you want everything in one file at the end, in sorted order:
bedops --everything answer.for.bed answer.rev.bed > answer.bed
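As a follow-up, here is a rough sketch of how you might pull out the genes whose promoter window contains a particular motif. It assumes the TSS files were BED6 (gene name in column 4), so that the last column of the result holds the motif names, separated by semicolons (bedmap's default delimiter for multiple values). The toy file, gene names, and motif names below are made up.

```shell
# Toy stand-in for answer.bed: a BED6 window plus a final column of
# semicolon-separated motif names (all values here are hypothetical).
printf 'chr1\t100000\t200001\tgeneA\t0\t+\tmotifA;motifB\n'  > answer.toy.bed
printf 'chr1\t300000\t400001\tgeneB\t0\t-\tmotifB\n'        >> answer.toy.bed
printf 'chr2\t50000\t150001\tgeneC\t0\t+\t\n'               >> answer.toy.bed

# Print the gene name (column 4) of every window whose last column
# lists the requested motif. An exact match on each split element
# avoids partial hits (e.g. "motifA" matching "motifA2").
motif="motifA"
awk -v m="$motif" 'BEGIN{FS=OFS="\t"} {
    n = split($NF, ids, ";")
    for (i = 1; i <= n; i++) if (ids[i] == m) { print $4; break }
}' answer.toy.bed > genes.with.motif.txt
```

To run this on real output, replace answer.toy.bed with answer.bed and set motif to a name that appears in the ID column of your TFBSs.bed.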
Note that the above example uses premade motif models from JASPAR. This is different from scanning your promoters for putative or predicted binding motifs. For that, you could use MEME instead of FIMO. There is some discussion of the difference, with comments, here: https://bioinformatics.stackexchange.com/a/8692/776
There is a software tool called ROSE to find enhancers.
If you use R, then you could probably use TFBSTools. If I recall correctly from the time when I used it, you need a PWM file for the TF and your sequence of interest. It will scan the sequence, identify matching regions, and assign a p-value per "hit".
Alternatively, there is motifmatchr (https://bioconductor.org/packages/release/bioc/html/motifmatchr.html), which I found less tedious than TFBSTools.