I recently found ReMap, which is a great resource of TF binding sites across different studies. They provide the binding sites in BED format, where each region is named after the TF. Although it's relatively trivial to overlap two BED files (using something like bedtools fisher or bedtools intersect), that assumes you're just overlapping two sets of regions. Is there a way to generate the results separated by region name? I could split by TF and process each one separately in a giant loop, but that seems inefficient.
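To make the question concrete, here is a minimal sketch of the kind of per-name result I mean, done in a single pass rather than one run per TF. It is plain Python, not a bedtools feature, and it only counts overlaps (it does not compute the Fisher statistics). It assumes column 4 of the ReMap file holds the TF name and that coordinates are 0-based, half-open BED intervals; the function names parse_bed, build_index and count_overlaps_by_name are made up for illustration.

```python
from bisect import bisect_left
from collections import defaultdict


def parse_bed(lines):
    """Yield (chrom, start, end, name) from tab-separated BED lines."""
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and header/comment lines
        fields = line.rstrip("\n").split("\t")
        # Assumption: the name (column 4) is the TF; use "." if it is absent.
        name = fields[3] if len(fields) > 3 else "."
        yield fields[0], int(fields[1]), int(fields[2]), name


def build_index(query_lines):
    """Merge query regions per chromosome into sorted, disjoint intervals."""
    by_chrom = defaultdict(list)
    for chrom, start, end, _ in parse_bed(query_lines):
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts, ends = [], []
        for start, end in intervals:
            if ends and start <= ends[-1]:
                ends[-1] = max(ends[-1], end)  # extend the current merged block
            else:
                starts.append(start)
                ends.append(end)
        index[chrom] = (starts, ends)
    return index


def count_overlaps_by_name(tf_lines, query_lines):
    """Return {name: (regions overlapping the query set, total regions)}."""
    index = build_index(query_lines)
    counts = defaultdict(lambda: [0, 0])
    for chrom, start, end, name in parse_bed(tf_lines):
        counts[name][1] += 1
        starts, ends = index.get(chrom, ((), ()))
        # Last merged query block that starts before this region ends.
        i = bisect_left(starts, end) - 1
        if i >= 0 and ends[i] > start:
            counts[name][0] += 1
    return {name: tuple(pair) for name, pair in counts.items()}


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    tf = [
        "chr1\t100\t200\tCTCF",
        "chr1\t500\t600\tCTCF",
        "chr1\t150\t250\tGATA1",
        "chr2\t10\t20\tGATA1",
    ]
    query = ["chr1\t180\t220", "chr1\t600\t700"]
    for name, (hits, total) in sorted(count_overlaps_by_name(tf, query).items()):
        print(name, hits, total)
```

With real files you would pass open file handles instead of the toy lists. The per-name counts could then be fed into whatever enrichment test you prefer, which is the part I was hoping an existing tool already handles.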
Thanks for confirming. I just wanted to make sure I am not reinventing the wheel or missing something.