I am interested in finding all known transcription factor binding sites for a list of genes from the ENCODE dataset. How could one automate that? From the tables, it appears that each TF has its own table. Thus, I could obtain the promoters for my set of genes and intersect them with the table of the TF in question. But the number of tables is rather large, and the table names do not follow any convention that I could discern. Is there a way to automate this process?
Hey Mikael, how should I cite this if I intersect these two datasets?
I agree with Mikael: doing this with UCSC data would be easier locally. Download the human ChIP data from http://hgdownload.cse.ucsc.edu/downloads.html#human and intersect it with your promoters using overlapSelect from the UCSC toolkit: http://hgdownload.cse.ucsc.edu/admin/exe/
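If you would rather script the intersection yourself once the files are downloaded, here is a minimal Python sketch of the idea. It assumes you have already parsed your promoters and the TF peaks into BED-style tuples (0-based, half-open coordinates); the function names, gene names, and TF names below are made up for illustration and are not part of any UCSC tool.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def tfs_per_gene(promoters, peaks):
    """Map each gene to the sorted list of TFs with a peak in its promoter.

    promoters: iterable of (chrom, start, end, gene)  -- hypothetical layout
    peaks:     iterable of (chrom, start, end, tf)    -- hypothetical layout
    """
    # Group peaks by chromosome so each promoter is only compared
    # against peaks on the same chromosome.
    peaks_by_chrom = defaultdict(list)
    for chrom, start, end, tf in peaks:
        peaks_by_chrom[chrom].append((start, end, tf))

    hits = defaultdict(set)
    for chrom, start, end, gene in promoters:
        for p_start, p_end, tf in peaks_by_chrom.get(chrom, []):
            if overlaps(start, end, p_start, p_end):
                hits[gene].add(tf)

    return {gene: sorted(tfs) for gene, tfs in hits.items()}


# Toy data, invented purely to show the behaviour.
promoters = [
    ("chr1", 1000, 2000, "GENE_A"),
    ("chr2", 500, 1500, "GENE_B"),
]
peaks = [
    ("chr1", 1900, 2100, "TF_X"),  # overlaps the GENE_A promoter
    ("chr1", 2000, 2200, "TF_Y"),  # starts exactly where the promoter ends: no overlap
    ("chr2", 100, 600, "TF_X"),    # overlaps the GENE_B promoter
    ("chr2", 1400, 1600, "TF_Z"),  # overlaps the GENE_B promoter
]

result = tfs_per_gene(promoters, peaks)
print(result)
```

This is a plain nested loop per chromosome, which is fine for a gene list of modest size; for genome-wide work, a dedicated tool such as overlapSelect will be much faster.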