Hi all,
I have a file with enhancers in the 1st column and, in the 2nd column, the name of a transcription factor (TF) for which that enhancer has binding sites. I want to find out which enhancers have binding sites for the same transcription factors, so I made a heatmap in R. However, my data set is so large that it is impossible to read off the number of TFs shared by a group of enhancers. How can I accomplish this in R? My data looks like this:
Enhancer TF
Gene1_Enhancer1 Arid3a
Gene1_Enhancer1 Hoxa4
Gene1_Enhancer1 Ascl2
Gene1_Enhancer1 EBP
Gene1_Enhancer2 ETS1
Gene2_Enhancer1 ETS1
Gene2_Enhancer1 EBP
Gene2_Enhancer1 Arid3a
Gene2_Enhancer1 Hoxa4
Gene3_Enhancer1 Arid3a
Gene3_Enhancer1 Hoxa4
Gene3_Enhancer1 EBP
Gene3_Enhancer2 Hoxa7
Gene4_Enhancer1 Hoxa4
Gene4_Enhancer1 EBP
Gene4_Enhancer1 Arid3a
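Instead of reading the counts off a heatmap, the number of TFs shared by each pair of enhancers can be computed directly. The following is only a minimal sketch in base R on the toy data above; the object names `dat`, `m` and `shared` are placeholders, and the third row's enhancer name is assumed to be Gene1_Enhancer1.

```r
# Toy data from the example above (row 3 assumed to be Gene1_Enhancer1).
dat <- data.frame(
  Enhancer = c(rep("Gene1_Enhancer1", 4), "Gene1_Enhancer2",
               rep("Gene2_Enhancer1", 4), rep("Gene3_Enhancer1", 3),
               "Gene3_Enhancer2", rep("Gene4_Enhancer1", 3)),
  TF = c("Arid3a", "Hoxa4", "Ascl2", "EBP", "ETS1",
         "ETS1", "EBP", "Arid3a", "Hoxa4",
         "Arid3a", "Hoxa4", "EBP", "Hoxa7",
         "Hoxa4", "EBP", "Arid3a"),
  stringsAsFactors = FALSE
)

# Incidence matrix: rows = enhancers, columns = TFs, 1 = binding site present.
m <- unclass(table(dat$Enhancer, dat$TF))
m <- (m > 0) * 1L

# Enhancer-by-enhancer matrix: entry [i, j] is the number of TFs shared by
# enhancers i and j; the diagonal holds the number of TFs per enhancer.
shared <- m %*% t(m)

print(shared)
```

On the real data, `shared` has one row and one column per enhancer, so it can be filtered or sorted by count rather than inspected visually.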
Is there a way to get output like the following in a text file, so that each group contains one or more enhancers from each of the 4 genes?
Group	Common TFs
Gene1_Enhancer1, Gene2_Enhancer1, Gene3_Enhancer1, Gene4_Enhancer1	Arid3a, EBP, Hoxa4
Thanks a lot!!!
Thanks a lot. I tried this, and it works well: it finds the TFs common to all enhancers. I'm sorry, I probably didn't make myself clear. I have many enhancers per gene, say 45 for each gene, and I want to find groups of TFs that are present in groups of enhancers drawn from all genes. For example, apart from the group in the example above, there may be another group of enhancers within this huge set that shares an entirely different set of TFs; these enhancers are nevertheless similar to each other and therefore interesting to me. So I want all of these different groups of enhancers with their common TFs, in addition to the TFs common to the entire set of enhancers, which is what this function gives me. Is there any way to use this function for that? Thanks a lot!
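To make the request concrete, here is a rough brute-force sketch in base R of the kind of grouping described, not a definitive solution. It assumes that the gene name is the part of the enhancer name before the first underscore, and `shared_groups` is a hypothetical helper name. It takes one enhancer from each gene, intersects their TF sets, and merges the combinations that share exactly the same TFs.

```r
# Toy data from the question (row 3 assumed to be Gene1_Enhancer1).
dat <- data.frame(
  Enhancer = c(rep("Gene1_Enhancer1", 4), "Gene1_Enhancer2",
               rep("Gene2_Enhancer1", 4), rep("Gene3_Enhancer1", 3),
               "Gene3_Enhancer2", rep("Gene4_Enhancer1", 3)),
  TF = c("Arid3a", "Hoxa4", "Ascl2", "EBP", "ETS1",
         "ETS1", "EBP", "Arid3a", "Hoxa4",
         "Arid3a", "Hoxa4", "EBP", "Hoxa7",
         "Hoxa4", "EBP", "Arid3a"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: groups of enhancers (at least one per gene) with shared TFs.
shared_groups <- function(dat) {
  tf_sets <- split(dat$TF, dat$Enhancer)       # TF set for each enhancer
  gene <- sub("_.*$", "", names(tf_sets))      # assumption: gene = text before first "_"
  by_gene <- split(names(tf_sets), gene)       # enhancers belonging to each gene

  # Every combination that takes exactly one enhancer from each gene.
  combos <- expand.grid(by_gene, stringsAsFactors = FALSE)

  # TFs common to all enhancers of a combination, as one sorted string ("" if none).
  common <- unname(apply(combos, 1, function(enh) {
    paste(sort(Reduce(intersect, tf_sets[enh])), collapse = ", ")
  }))

  # Merge combinations with an identical shared TF set into one group.
  keep <- common != ""
  enh_by_set <- lapply(split(combos[keep, , drop = FALSE], common[keep]),
                       function(d) sort(unique(unlist(d, use.names = FALSE))))

  data.frame(
    Group = vapply(enh_by_set, paste, character(1), collapse = ", "),
    Common_TFs = as.character(names(enh_by_set)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

res <- shared_groups(dat)
print(res)
```

The result could then be saved with `write.table(res, "enhancer_groups.txt", sep = "\t", quote = FALSE, row.names = FALSE)`, where the file name is only an example. Note that the number of combinations grows quickly (45 enhancers in each of 4 genes gives about 4 million), and that a combination sharing only a subset of another group's TFs ends up as a separate group, so a clustering method may scale better on the full data.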