This may be overkill since you're only interested in one gene. If you have a bunch of BED files, one for each ChIP-seq experiment, you could put the appropriate TF name in the 4th column of a file, and use bedmap to map this information over to your region of interest. Example:
// in creb.bed
chr1 4012 4103 CREB
chr1 5500 5750 CREB
...
chrY 3100000 3100050 CREB
// in stat1.bed
chr1 500 625 STAT1
...
// similar files, one for every other TF
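If the 4th column of a file holds something other than the TF name (for instance, read names left over from a BAM conversion), it can be overwritten before mapping. A minimal sketch, assuming tab-delimited input; the helper name label_bed and the sample rows are made up for illustration:

```shell
# Hypothetical helper: overwrite column 4 of a tab-delimited BED stream
# with a fixed label (the TF name). Assumes tabs, not spaces, between fields.
# Usage: label_bed CREB < raw.bed > creb.bed
label_bed() {
    awk -v tf="$1" 'BEGIN { FS = OFS = "\t" } { $4 = tf; print }'
}

# Small demonstration on made-up rows
printf 'chr1\t4012\t4103\tread_17\nchr1\t5500\t5750\tread_42\n' | label_bed CREB
```

Any columns after the 4th pass through unchanged, so this should work on BED4 and BED6 files alike.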
Define your region of interest in a BED file. Say you are looking at the CTCF gene.
// in myfave.bed
chr16 67596110 67596310 CTCF +
bedops -u creb.bed stat1.bed ... | bedmap --echo --echo-map-id myfave.bed - > answer.bed
where you pass all/any number of TF files to the first bedops command. The bedmap command tells you which TFs are in your region of interest. If you only want distinct TFs (so that, for example, STAT1 doesn't show up 5 times), replace --echo-map-id with --echo-map-id-uniq. The output file would look something like:
// in answer.bed
chr16 67596110 67596310 CTCF +|STAT1;VDR
where the stuff after the '|' symbol includes the TF information you want (all separated by semicolons). Keep in mind that you can put as many rows as you want in myfave.bed so this approach scales as needed. The only other requirement is that all of these BED files be properly sorted.
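BEDOPS ships sort-bed for this sorting step. As a rough stand-in using only standard tools, the sketch below orders by chromosome name (byte-wise), then numerically by start and end. That this matches what bedmap expects is an assumption, so prefer sort-bed when it is available. The function name sort_bed_like and the sample rows are hypothetical:

```shell
# Hypothetical stand-in for sort-bed: byte-wise chromosome order,
# then numeric start, then numeric end.
# Usage: sort_bed_like unsorted.bed > sorted.bed
sort_bed_like() {
    LC_ALL=C sort -k1,1 -k2,2n -k3,3n "$@"
}

# Small demonstration on made-up rows
printf 'chr2\t5\t9\tB\nchr1\t500\t625\tA\nchr1\t40\t60\tC\n' | sort_bed_like
```

Apply the same sorting to every input, including myfave.bed, before running the bedops and bedmap commands.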
The bedmap program has other useful things it can report (all in one pass of the data). Perhaps you want the binding sites themselves and not just the IDs (see --echo-map). There are also multiple ways to specify what it means for an element to lie within your region of interest (i.e., should a peak call have at least 50% of its genomic length lying inside what you define to be the promoter region? See --fraction-map and related overlap constraint options).
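To make the 50% criterion concrete, here is a small sketch of the arithmetic. It assumes the fraction is measured against the length of the mapped element (the peak), which is how I understand --fraction-map to work. The helper name peak_fraction_inside and the peak coordinates are made up:

```shell
# Hypothetical helper: fraction of a peak's length that lies inside a region.
# Arguments: region_start region_end peak_start peak_end (same chromosome assumed)
peak_fraction_inside() {
    awk -v rstart="$1" -v rend="$2" -v pstart="$3" -v pend="$4" 'BEGIN {
        lo = (rstart > pstart) ? rstart : pstart   # left edge of the overlap
        hi = (rend < pend) ? rend : pend           # right edge of the overlap
        ov = (hi > lo) ? hi - lo : 0               # overlap length, 0 if disjoint
        printf "%.2f\n", ov / (pend - pstart)
    }'
}

# A 100 bp peak with 60 bp inside the example promoter region
peak_fraction_inside 67596110 67596310 67596250 67596350
```

Under that assumption, adding --fraction-map 0.5 to the bedmap call would keep this peak, since 60% of its length falls inside the region.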
Hi,
I have a bunch of .bam files for ChIP-seq experiments from ENCODE. I converted them to .bed files using the bamToBed utility. I created a BED file of my region of interest and used your command.
But my output did not show anything after the pipe.
Am I doing something wrong here?
Thanks