I am looking to generate a consistent background file that can be used by HOMER for motif finding. Does anyone have recommendations for how to go about doing this?
Each time motif finding is run, HOMER generates its own background set, so repeated runs on the same input give inconsistent results.
There have been a number of posts on this and similar questions, but none has offered much clarity.
Any information on how to go about accomplishing this would be greatly appreciated.
how to set a appropriate background file when using HOMER findMotifGenome.pl
Every time the results of findMotifGenome.pl in HOMER is different
Using Homer Programs For Denovo Motif : Different Results Among Its Own Versions
Thanks for the response and the input.
I have definitely taken a look at this before (I have tried my best to digest the extensive, ~300-page HOMER documentation), but it still leaves me with a fair amount of uncertainty about what the best approach would be.
I am working with enriched genomic regions that have been identified as differential between two sets of triplicates. I am not looking for large differences between the two groups; I am really interested in just a few regions, depending on the comparison of interest. My samples are very similar, which, given the experimental design, they should be for the most part. This leads me to interpret the practical tips as: select pretty much any sequence that is shared, has representative GC content, and is present in sufficient quantity to serve as background. Still, doing that properly is a task in itself, and even if done successfully and appropriately, I would be hard pressed to think it any stronger than the background generated by HOMER. The only difference is that it would at least be consistent, and even that is not necessarily an advantage, because it is clear that altering the background sequences will yield differences in the motifs identified. Additionally, I think this effect is further amplified if the number of sequences being analyzed is low.
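To make the "representative GC content, but consistent" idea concrete, here is a minimal sketch of how I imagine a fixed background could be chosen. It is my own assumption about a reasonable procedure, not HOMER's internal method: the function names (`gc_bin`, `pick_background`), the 10 GC bins, the `per_target` ratio, and the toy sequences are all hypothetical. The key point is the fixed random seed, which makes the selection identical on every run.

```python
import random
from collections import defaultdict


def gc_bin(seq, n_bins=10):
    """Return the GC-content bin (0 .. n_bins-1) of a sequence, ignoring N's."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        return 0
    # Integer arithmetic avoids floating-point edge cases at bin boundaries.
    return min(gc * n_bins // total, n_bins - 1)


def pick_background(targets, candidates, per_target=5, n_bins=10, seed=0):
    """Pick candidate names whose GC bins mirror the target set.

    targets, candidates: dicts mapping region name -> DNA sequence.
    per_target: hypothetical ratio of background to target sequences.
    seed: fixed seed so the same background is chosen on every run.
    """
    rng = random.Random(seed)

    # Group candidate names by GC bin; sort names so input order is irrelevant.
    pool = defaultdict(list)
    for name in sorted(candidates):
        pool[gc_bin(candidates[name], n_bins)].append(name)

    # Count how many background sequences each bin should contribute.
    wanted = defaultdict(int)
    for seq in targets.values():
        wanted[gc_bin(seq, n_bins)] += per_target

    chosen = []
    for b in sorted(wanted):
        names = pool.get(b, [])
        k = min(wanted[b], len(names))  # take fewer if the bin is short
        chosen.extend(rng.sample(names, k))
    return sorted(chosen)


# Toy data only; real input would be the sequences of peaks/regions.
targets = {"t1": "GCGCGCATAT", "t2": "ATATATATGC"}
candidates = {}
for i in range(20):
    candidates[f"hi{i}"] = "GCGCGCATAT"   # same GC bin as t1
    candidates[f"lo{i}"] = "ATATATATGC"   # same GC bin as t2
    candidates[f"mid{i}"] = "GCGCATATAT"  # GC bin not present in targets

background = pick_background(targets, candidates, per_target=5, seed=42)
print(background)
```

The chosen regions would then be written out once and reused for every run; as far as I understand, `findMotifsGenome.pl` accepts a custom background via its `-bg` option. As noted above, a fixed background only buys consistency; it does not make the background any more "correct".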
I too am hoping to compare results across motif-finding programs to increase confidence in the identified motifs, but I anticipate running into similar issues.
Thank you for the linked paper, I will take a look!