List of TFBSs for Each Gene in the Human Genome
12.6 years ago
Biostar User ▴ 360

How can I get a list of transcription factor binding sites from ENCODE or similar for each gene in the human genome in a table similar to this:

gene1, chrid, start, end, tf1, tfbs1-start, tfbs1-end
gene1, chrid, start, end, tf1, tfbs2-start, tfbs2-end
gene1, chrid, start, end, tf2, tfbs1-start, tfbs1-end
gene1, chrid, start, end, tf3, tfbs1-start, tfbs1-end
gene1, chrid, start, end, tf3, tfbs2-start, tfbs2-end
gene1, chrid, start, end, tf3, tfbs3-start, tfbs3-end
gene2...
encode transcription • 2.8k views
12.6 years ago

You can use the MEME suite to find transcription factor binding motifs. I downloaded my binding motif file from UniPROBE, then used FIMO (part of the MEME suite) to scan for these motifs across the genome. FIMO offers several output formats, and I think you might want GFF3.

MEME

Uniprobe
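
To give a feel for what motif scanning does, here is a toy sketch in Python. It is not FIMO's implementation (FIMO also reports p-values and q-values and handles both strands); it only slides a log-odds matrix along one strand of a sequence. The function names, the pseudocount, the uniform background, and the example counts are all assumptions made for illustration.

```python
import math

def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into log2 odds against a uniform background."""
    matrix = []
    for col in counts:
        total = sum(col.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((col[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix

def scan(sequence, matrix, threshold):
    """Return (start, end, score) for every forward-strand window scoring >= threshold."""
    width = len(matrix)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(matrix[j][base] for j, base in enumerate(window))
        if score >= threshold:
            hits.append((i, i + width, score))
    return hits

# Hypothetical count matrix for a TATA-like motif (made-up numbers).
counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]
hits = scan("GGTATAGG", log_odds_matrix(counts), threshold=5.0)
for start, end, score in hits:
    print(start, end, round(score, 2))
```

Coordinates here are 0-based and half-open, like BED. For real work, use FIMO itself so that hits come with proper significance estimates.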

12.6 years ago
Frenkiboy ▴ 260

It depends on how tech savvy you are. If you have basic Linux skills, you can download the called peaks for each of the TFs in the ENCODE database using wget, and overlap them with the designated promoter, exons, or introns of each gene using BEDTools (on Linux) or GenomicRanges (in R).
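
If it helps to see the overlap logic spelled out, below is a minimal pure-Python sketch that produces rows in the format asked for in the question. It is only an illustration: the `Gene` and `Peak` records, the promoter window sizes (1000 bp upstream, 200 bp downstream of the TSS), and the example coordinates are assumptions, and the nested loop is far slower than BEDTools or GenomicRanges on genome-scale data.

```python
from collections import namedtuple

# Hypothetical record types; coordinates are 0-based, half-open (BED-style).
Gene = namedtuple("Gene", "name chrom start end strand")
Peak = namedtuple("Peak", "chrom start end tf")

def promoter(gene, upstream=1000, downstream=200):
    """Return (start, end) of a window around the TSS; window sizes are assumptions."""
    if gene.strand == "+":
        tss = gene.start
        return max(0, tss - upstream), tss + downstream
    tss = gene.end
    return max(0, tss - downstream), tss + upstream

def tfbs_table(genes, peaks, upstream=1000, downstream=200):
    """List (gene, chrom, gene_start, gene_end, tf, peak_start, peak_end) per overlap."""
    rows = []
    for gene in genes:
        p_start, p_end = promoter(gene, upstream, downstream)
        for peak in peaks:
            same_chrom = peak.chrom == gene.chrom
            # Two half-open intervals overlap if each starts before the other ends.
            if same_chrom and peak.start < p_end and peak.end > p_start:
                rows.append((gene.name, gene.chrom, gene.start, gene.end,
                             peak.tf, peak.start, peak.end))
    return rows

# Made-up example data.
genes = [
    Gene("gene1", "chr1", 5000, 9000, "+"),
    Gene("gene2", "chr1", 20000, 25000, "-"),
]
peaks = [
    Peak("chr1", 4500, 4700, "tf1"),
    Peak("chr1", 5100, 5300, "tf2"),
    Peak("chr1", 25500, 25700, "tf1"),
    Peak("chr2", 4500, 4700, "tf3"),
]
table = tfbs_table(genes, peaks)
for row in table:
    print(", ".join(str(field) for field in row))
```

Swapping `promoter` for a function that returns exon or intron intervals gives the other overlaps mentioned above.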

The other question is whether you would like the list to be tissue specific or not. If yes, this requires a bit more tweaking, but is easily done in R.
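
As a rough idea of what the tissue-specific tweak involves, here is a small sketch that keeps peaks separated by cell type before any overlap step. It is written in Python rather than R purely for illustration; the record layout and the example peaks are assumptions, not ENCODE's file format.

```python
from collections import defaultdict

def peaks_by_cell_type(records):
    """Group (tf, cell_type, chrom, start, end) records by cell type."""
    grouped = defaultdict(list)
    for tf, cell_type, chrom, start, end in records:
        grouped[cell_type].append((tf, chrom, start, end))
    return dict(grouped)

# Made-up example records; in practice the cell type would come from
# the metadata of each downloaded peak file.
records = [
    ("tf1", "K562", "chr1", 4500, 4700),
    ("tf1", "HepG2", "chr1", 4550, 4720),
    ("tf2", "K562", "chr1", 5100, 5300),
]
grouped = peaks_by_cell_type(records)
for cell_type, cell_peaks in grouped.items():
    print(cell_type, len(cell_peaks))
```

Each group can then be overlapped with the gene annotation separately, giving one table per cell type.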

The SYDH peaks worked really well for me, but for HAIB I downloaded the BAM files, merged the replicates, and called my own set of peaks using MACS.
