Hi
I have a series of modules, each consisting of a list of gene symbols. I want to look at transcription factors (TFs) that may regulate, i.e. control the expression of, the genes in each module. So far I have found sources of target gene lists per transcription factor, such as the ENCODE data, but I have noticed these lists tend to be very long and can cover 50% of the genome or more. For example, both the GTRD database and this ENCODE list (links below) have this problem. I was planning to run a series of hypergeometric tests to assess the overlap between each TF's target list and each of my modules. Can anyone tell me a better way of doing this, or whether this is even a sensible approach? Are the results likely to be reliable? I am working with expression data from diseased human tissue.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210645/
http://amp.pharm.mssm.edu/Harmonizome/dataset/ENCODE+Transcription+Factor+Targets
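To make the plan concrete, here is a minimal sketch of the overlap test I have in mind, in plain Python with no external libraries. The gene names, the two TFs and the background set are made up for illustration, and using "all genes measured in my dataset" as the background is my own assumption rather than something either database prescribes. The Benjamini-Hochberg step is there because one test is run per TF.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K are TF targets."""
    upper = min(K, n)
    # math.comb returns 0 for impossible draws, so no extra bounds checks are needed.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def tf_enrichment(module, tf_targets, background):
    """Return {tf: (overlap, p_value)} for one module against every TF target list."""
    background = set(background)
    module = set(module) & background  # restrict everything to the background
    out = {}
    for tf, targets in tf_targets.items():
        targets = set(targets) & background
        k = len(module & targets)
        p = hypergeom_sf(k, len(background), len(targets), len(module))
        out[tf] = (k, p)
    return out


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy data: 20 background genes, one module of 5 genes.
background = [f"G{i}" for i in range(20)]
module = ["G0", "G1", "G2", "G3", "G4"]
tf_targets = {
    "TF_A": ["G0", "G1", "G2", "G3", "G10"],  # short, specific target list
    "TF_B": [f"G{i}" for i in range(14)],     # covers 70% of the background
}

results = tf_enrichment(module, tf_targets, background)
tfs = list(results)
adjusted = dict(zip(tfs, benjamini_hochberg([results[tf][1] for tf in tfs])))

for tf in tfs:
    overlap, p = results[tf]
    print(tf, overlap, p, adjusted[tf])
```

In this toy case TF_B overlaps all five module genes but gets the weaker p-value, because its target list covers most of the background. That is exactly my worry with the real lists: if a TF "targets" half the genome, the test has very little power to single it out.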
Thanks,
Chris
Enigma was last updated in 2008. I would not suggest using it.
True, they are not actively maintained anymore.
That does not mean they can't be useful, though ;) But I acknowledge they might indeed be underperforming.