Hello, I need to do a motif density plot of the whole genome, as we have a subset of 400 genes where we expect a higher motif density than in the whole genome.
I thought it would make sense to use motif scanning tools for this. I decided to go with 1500 bp upstream and the first 500 bp of each gene (inspired by how interferome.org does their TF analysis).
HOMER and FIMO can both do motif scanning, and both appear to be in use. Is there any discussion on which tool is actually better? In the posts I found, people said which tool they went with but never explained the reasoning behind their choice. How can I make an informed decision? The only clue I have is the date of the latest release: 2019 for HOMER and 2023 for MEME.
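To make the window definition concrete, here is a minimal sketch of how such a region could be computed per gene. The function name `promoter_window` and the coordinate convention (0-based TSS, half-open intervals, as in BED) are my own assumptions for illustration, not something taken from interferome.org or either tool.

```python
def promoter_window(tss, strand, upstream=1500, downstream=500, chrom_len=None):
    """Return a 0-based, half-open (start, end) window around a TSS.

    tss is the 0-based position of the first transcribed base.
    The window covers `upstream` bases before the TSS and the first
    `downstream` bases of the gene, respecting the strand.
    """
    if strand == "+":
        start = tss - upstream
        end = tss + downstream
    elif strand == "-":
        # On the minus strand the gene runs towards lower coordinates,
        # so "upstream" lies at higher coordinates than the TSS.
        start = tss - downstream + 1
        end = tss + upstream + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clip to the chromosome so windows near the ends stay valid.
    start = max(0, start)
    if chrom_len is not None:
        end = min(end, chrom_len)
    return start, end


print(promoter_window(2000, "+"))  # (500, 2500)
print(promoter_window(2000, "-"))  # (1501, 3501)
```

Windows that get clipped at a chromosome end are shorter than 2000 bp, so any density should be normalised by the actual window length rather than a fixed 2000.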
Secondly (in the case of FIMO), I wondered about the choice of bgfile. There is good documentation on the available choices (https://meme-suite.org/meme/doc/fimo.html), but there is no discussion of which one I am supposed to use. Intuitively, using the whole genome as a background (with order five) seems reasonable to me (https://biology.stackexchange.com/questions/41714/why-are-fifth-order-markov-models-the-ones-most-often-used-for-gene-prediction). There is also the suggestion to use the sequences themselves ("Best Practices For Using Fimo For Motif Scanning"). I also wondered whether I have to perform the analysis on a per-chromosome basis (as outlined here: https://bioinformatics.stackexchange.com/questions/2467/where-to-download-jaspar-tfbs-motif-bed-file/2491#2491) or whether I can use a single background model for all my sequences. Finally, as I want to analyse the density both for all genes and for my subset of genes, do I need a different background for each?
Thirdly, the FIMO documentation seems to be very hesitant about whole-genome scanning in general ("Does it make sense to scan all the promoters in a genome with transcription factor motifs using FIMO?"; https://meme-suite.org/meme/doc/fimo-tutorial.html), whereas this is not discussed for HOMER (http://homer.ucsd.edu/homer/motif/genomeWideMotifScan.html). So I wondered: is my approach fundamentally flawed anyway, and are there different tools or approaches to get the motif density?
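To illustrate what I mean by comparing backgrounds, here is a minimal sketch that tabulates word frequencies up to a given Markov order for a set of sequences. The function name `kmer_frequencies` is hypothetical, and this does not write the MEME background file format; as far as I know the MEME Suite utility fasta-get-markov is the intended way to produce an actual bgfile. The sketch only shows the kind of numbers such a model summarises, so that the whole genome, all promoters, and the 400-gene subset could be compared side by side.

```python
from collections import Counter


def kmer_frequencies(seqs, order=0):
    """Relative frequencies of all words of length 1..order+1.

    Words containing anything other than A, C, G, T (for example N)
    are skipped. Returns {word_length: {word: frequency}}.
    """
    alphabet = set("ACGT")
    freqs = {}
    for k in range(1, order + 2):
        counts = Counter()
        for seq in seqs:
            s = seq.upper()
            for i in range(len(s) - k + 1):
                word = s[i:i + k]
                if set(word) <= alphabet:
                    counts[word] += 1
        total = sum(counts.values())
        freqs[k] = {w: c / total for w, c in sorted(counts.items())} if total else {}
    return freqs


# Toy sequences standing in for promoter windows (made up for illustration).
background = kmer_frequencies(["ACGTACGT", "GGGCCCAT"], order=1)
print(background[1])
```

If the frequencies for the subset differ clearly from those for all promoters (for example in GC content), that would be one reason to think about which background is appropriate for which comparison.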
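For the density comparison itself, this is roughly what I have in mind once per-gene hit counts are available from either tool. It is a sketch under my own assumptions: `hit_counts` is a hypothetical dictionary of motif hits per gene window, every window is assumed to be 2000 bp, and the permutation test on the mean is just one possible way to compare the subset with the rest of the genome.

```python
import random


def motif_density(hit_counts, window_bp=2000):
    """Convert hits per window into hits per kb for each gene."""
    return {gene: n / window_bp * 1000 for gene, n in hit_counts.items()}


def permutation_pvalue(density, subset, n_perm=10000, seed=0):
    """One-sided test: is the subset's mean density unusually high?

    Draws random gene sets of the same size from all genes and counts
    how often their mean density reaches the observed subset mean.
    All genes in `subset` must be keys of `density`.
    """
    rng = random.Random(seed)
    values = list(density.values())
    k = len(subset)
    observed = sum(density[g] for g in subset) / k
    at_least = 0
    for _ in range(n_perm):
        sample = rng.sample(values, k)
        if sum(sample) / k >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)


# Toy counts, made up for illustration only.
hit_counts = {"g1": 4, "g2": 0, "g3": 2, "g4": 0}
density = motif_density(hit_counts)
observed, p = permutation_pvalue(density, ["g1", "g3"], n_perm=1000)
print(observed, p)
```

Random gene sets ignore properties such as GC content or CpG islands, so a matched control set might be a fairer comparison than sampling from all genes.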
Hi mb,
I am looking in the same direction. Did you get any clue on how to proceed with HOMER and FIMO to identify TF binding sites in promoters?
Thank you!
I don't want to discourage you, but motif finding, while nice in theory, doesn't work in practice. You can try it with known motifs and their target genes and see for yourself.
I've found FIMO to have good negative predictive value (when comparing to ChIP).
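For anyone who wants to check this on their own data, negative predictive value is TN / (TN + FN): of the regions where the scan reports no motif, the fraction that also show no binding in ChIP. A minimal sketch, assuming two hypothetical dictionaries keyed by region (one with the scan calls, one with the ChIP calls); the names and data are illustrative only:

```python
def negative_predictive_value(predicted, bound):
    """NPV = TN / (TN + FN) over regions with no predicted motif hit.

    predicted: {region: True if the scan reported a motif hit}
    bound:     {region: True if ChIP shows binding}
    """
    tn = fn = 0
    for region, has_hit in predicted.items():
        if has_hit:
            continue  # only predicted negatives contribute to NPV
        if bound[region]:
            fn += 1  # no motif called, but ChIP shows binding
        else:
            tn += 1  # no motif called and no binding
    if tn + fn == 0:
        raise ValueError("no predicted negatives, NPV is undefined")
    return tn / (tn + fn)


# Toy example, made up for illustration.
predicted = {"r1": False, "r2": False, "r3": True, "r4": False}
bound = {"r1": False, "r2": True, "r3": True, "r4": False}
print(negative_predictive_value(predicted, bound))
```

Note that NPV depends on how common binding is among the tested regions, so it is not directly comparable between datasets with different fractions of bound regions.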