Question

Transcription factor gene targets

7.4 years ago

mforde84 ★ 1.4k

Hi,

I have ChIP-seq data for a variety of TFs in a cell line, and I'm looking to determine which genes are targets of each TF based on IDR-called peaks.

Do I just look for any peak within an n-kilobase window surrounding each gene's TSS? Or is there a better way to do this (e.g., with hotspots)?

Thanks, Marty

transcription factor ChIP-Seq • 1.5k views

updated 7.4 years ago by Alex Reynolds 36k • written 7.4 years ago by mforde84 ★ 1.4k

score 2 · Accepted Answer · 2017-07-20

You could use BEDOPS bedops to map genes to peaks:

$ bedops --element-of 1 genes.bed peaks.bed > genes_overlapping_peaks.bed

Once you have those genes, use bedops --range to generate proximal promoter regions for them, say 1 kb upstream of the TSS.

$ awk '$6=="+"' genes_overlapping_peaks.bed | awk -v OFS="\t" '{ print $1, $2, ($2+1), $4; }' | sort-bed - | bedops --everything --range -1000:0 - > promoters.for.bed
$ awk '$6=="-"' genes_overlapping_peaks.bed | awk -v OFS="\t" '{ print $1, $3, ($3+1), $4; }' | sort-bed - | bedops --everything --range 0:1000 - > promoters.rev.bed
$ bedops --everything promoters.for.bed promoters.rev.bed > promoters.bed
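
As a quick sanity check of the strand logic, here is a minimal sketch that computes the same 1 kb upstream windows with awk alone. The file toy_genes.bed and its gene names are made up for illustration. It assumes BED6 input with the strand in column 6, and it clamps negative start coordinates to zero.

```shell
# Toy BED6 input (hypothetical genes): one on each strand.
printf 'chr1\t5000\t9000\tgeneA\t0\t+\n'   >  toy_genes.bed
printf 'chr1\t20000\t26000\tgeneB\t0\t-\n' >> toy_genes.bed

# Forward strand: the window runs from 1 kb before the TSS ($2) up to the TSS.
# Reverse strand: the TSS sits at the end coordinate ($3), so the window extends past it.
awk -v OFS='\t' -v up=1000 '
  $6 == "+" { s = $2 - up; if (s < 0) s = 0; print $1, s, $2 + 1, $4 }
  $6 == "-" { print $1, $3, $3 + 1 + up, $4 }
' toy_genes.bed > toy_promoters.bed

cat toy_promoters.bed
```

For the toy input this should give chr1:4000-5001 for geneA and chr1:26000-27001 for geneB, which is a handy way to confirm that "upstream" points the right way on each strand before running the real data.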

Separately, locate putative TF binding sites and their positions with a tool like FIMO and some TF motif database (JASPAR, TRANSFAC, UniPROBE, Taipale, etc.) at some desired statistical threshold. You could then do a set operation (for example, bedops --element-of) between these motif hits and your TF-specific ChIP regions, to keep only the sites that have ChIP support.
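
A minimal sketch of turning FIMO hits into BED for that set operation. It assumes the tab-separated layout written by recent MEME Suite releases (motif_id, motif_alt_id, sequence_name, start, stop, strand, score, p-value, q-value, matched_sequence), that the sequence names are chromosome names, and that start is 1-based. Check your own FIMO header before relying on these column numbers. The toy rows and file names are hypothetical.

```shell
# Toy FIMO-style table (hypothetical hits; column layout is an assumption, see above).
{
  printf 'motif_id\tmotif_alt_id\tsequence_name\tstart\tstop\tstrand\tscore\tp-value\tq-value\tmatched_sequence\n'
  printf 'MA0139.1\tCTCF\tchr1\t4501\t4519\t+\t18.2\t1e-07\t0.01\tTGGCCACCAGGGGGCGCTA\n'
  printf 'MA0079.3\tSP1\tchr1\t26201\t26211\t-\t12.5\t3e-05\t0.20\tGCCCCGCCCCC\n'
} > toy_fimo.tsv

# Skip the header, keep hits with a p-value below the chosen cutoff, and convert
# 1-based inclusive starts to 0-based BED starts. Column 4 holds the TF name.
awk -F'\t' -v OFS='\t' -v pmax=1e-4 '
  NR > 1 && NF >= 10 && $8 + 0 < pmax + 0 { print $3, $4 - 1, $5, $2, $7, $6 }
' toy_fimo.tsv | sort -k1,1 -k2,2n > toy_TFs.bed

cat toy_TFs.bed
```

With BEDOPS installed, sort-bed would be the usual choice in place of sort, and the result could be restricted to ChIP-supported sites with something like bedops --element-of 1 toy_TFs.bed peaks.bed.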

Once you have promoter regions and TF binding sites (here, a sorted BED file TFs.bed with the TF name in the ID column), do a BEDOPS bedmap operation on these two sets:

$ bedmap --echo --echo-map-id-uniq promoters.bed TFs.bed > answer.bed

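The file answer.bed will contain one line per promoter, followed by a "|" and a semicolon-delimited list of the IDs or names of the transcription factors that bind to ("target") that promoter region. Promoters with no overlapping binding site are still listed, with an empty ID list; add --skip-unmapped to leave them out.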
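
To go from that output to a flat gene/TF table, something like the following sketch may help. It assumes each promoter line carries a gene name in column 4 and that bedmap's default delimiters are in use ("|" between the promoter and the mapped IDs, ";" between IDs). The toy file contents are hypothetical.

```shell
# Toy bedmap-style output (hypothetical): promoter fields, "|", then ";"-separated TF IDs.
printf 'chr1\t4000\t5001\tgeneA|CTCF;SP1\n' >  toy_answer.bed
printf 'chr1\t26000\t27001\tgeneB|\n'       >> toy_answer.bed

# Split on "|", skip promoters with no mapped TF, and print one gene/TF pair per line.
awk -F'|' -v OFS='\t' '
  $2 != "" {
    split($1, bed, "\t")
    n = split($2, tfs, ";")
    for (i = 1; i <= n; i++) print bed[4], tfs[i]
  }
' toy_answer.bed > toy_gene_tf_pairs.tsv

cat toy_gene_tf_pairs.tsv
```

Here geneA ends up paired with CTCF and with SP1, while geneB is dropped because nothing mapped to its promoter.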