what does the transcription factor location tell me
10.3 years ago
Affan ▴ 310

I am currently reading about transcription factors. I come from a pure math background, so please try to make this easy for me. I am reading a paper that discusses Sp1 binding site locations (and does some statistics on them). I took the locations and loaded them into the genome browser to see how they relate to a gene. Here is my picture:

The custom user track is the SP1 transcription factor binding site (TFBS). So my questions are:

  1. Does the location of the TFBS tell us something about the promoter/enhancer regions of this gene?
  2. Does it tell us anything about histone modifications?
  3. (Aside) Why are the RefSeq and UCSC genes not aligned? I thought the UCSC genes included the RefSeq genes.
genome transcription-factors • 4.7k views
10.3 years ago

I think what you need to understand is that transcription factor genes like SP1 produce transcription factor proteins, which can then bind to the transcription factor binding sites of other genes. These sites are often in the promoter regions of those genes. (Some transcription factors are self-regulatory and thus have a binding site in their own promoter region.)

Answering your questions from that perspective:

  1. Yes. If you know the location of any gene, you have a clear indication of where the promoter region is: just in front of it, upstream of the transcription start. (That is a matter of definition, actually; sometimes promoters are considered part of the gene.) But you might really want to ask, "Do I now also know the regions where the transcription factor binds?" No; for that you would need the binding motif and then scan the genome for where it occurs. Nowadays you could also use experimental data, e.g. from ENCODE, to see where the transcription factor really binds in different tissues.
  2. No, it does not tell you about histone modifications. But if you know from epigenetic analyses where these occur, you can check whether they fall in this region.
  3. The difference between the two genes is probably also a matter of what exact region the two resources consider to belong to a gene (which should be described in the provenance of the gene annotations). They are just 100 bp apart in this case.
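
To make point 1 concrete, here is a minimal sketch of how one might derive a promoter window from a gene's coordinates and check whether a binding site falls inside it. The window sizes (1000 bp upstream, 100 bp downstream of the transcription start) and the coordinates in the usage lines are arbitrary assumptions for illustration, not a standard definition; coordinates are taken to be 0-based and half-open, as in BED files.

```python
def promoter_window(tx_start, tx_end, strand, upstream=1000, downstream=100):
    """Return a (start, end) promoter interval around the transcription start.

    Coordinates are 0-based, half-open. The default window sizes are
    illustrative assumptions only.
    """
    if strand == "+":
        # On the plus strand, "in front of the gene" means lower coordinates.
        start, end = tx_start - upstream, tx_start + downstream
    elif strand == "-":
        # On the minus strand the gene starts at tx_end, and upstream
        # means higher coordinates.
        start, end = tx_end - downstream, tx_end + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(0, start), end


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


# Hypothetical gene and binding site coordinates.
window = promoter_window(5000, 9000, "+")
print(window)                          # (4000, 5100)
print(overlaps(4950, 4960, *window))   # True
```

A site from the custom track that overlaps such a window is a candidate promoter-proximal site; a site far outside it might instead sit in an enhancer, which coordinates alone cannot tell you.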

Ahh, yes, sorry about the confusion. The paper I am reading talks about the binding sites of the Sp1 transcription factor. So the custom track on the browser corresponds to this binding region (which I am interested in, along with the associated histone modifications). I've also edited my post to reflect this.

So I guess that changes your answer to 1) a bit, right? The custom track does indeed give me the binding motif in the genome. Correct?


Actually, you got me more confused. SP1 is a zinc finger transcription factor that binds to GC-rich motifs in many promoters. So if you did what you said and looked up a specific binding region from that paper, you should be looking at one of these. Note that for regions it is important to use the same genome build for your search as was used in the paper.


Thank you. I zoomed into the TF binding region in the browser and saw this: http://i.imgur.com/otMsxlN.png It does look like a GC-rich motif. Is that correct?


Yes. It also fits the TF binding motif for SP1, which is "5'-(G/T)GGGCGG(G/A)(G/A)(C/T)-3'"; see https://en.wikipedia.org/wiki/Sp1_transcription_factor
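
If you want to look for exact matches to that consensus yourself, a rough sketch is to turn it into a regular expression and scan both strands. The test sequence below is made up, and an exact-match scan like this will miss real sites that deviate from the consensus, so treat it as an illustration only.

```python
import re

# The consensus 5'-(G/T)GGGCGG(G/A)(G/A)(C/T)-3' as a regular expression.
# The lookahead lets overlapping matches be reported as well.
SP1_CONSENSUS = re.compile(r"(?=([GT]GGGCGG[GA][GA][CT]))")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan(seq):
    """Return sorted (start, strand, site) tuples for every consensus match.

    Starts are 0-based positions on the forward strand; the reported site
    is always written 5'->3' on the strand where it matched.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in SP1_CONSENSUS.finditer(seq)]
    rc = reverse_complement(seq)
    for m in SP1_CONSENSUS.finditer(rc):
        # Convert the position on the reverse strand to forward coordinates.
        start = len(seq) - (m.start() + len(m.group(1)))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Made-up sequence with one site on each strand.
example = "ATATGGGGCGGGGCTTTTGCCCCGCCCCATAT"
print(scan(example))
# [(4, '+', 'GGGGCGGGGC'), (18, '-', 'GGGGCGGGGC')]
```

Scanning the reverse complement matters because a site on the opposite strand shows up as a C-rich stretch (GCCCCGCCCC here) when you read the forward strand in the browser.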


Late comment, but I was looking over this data again and I found this binding site.

C G G C C C C G C C C T T

This doesn't seem to fit the consensus sequence. Can it be due to mutations?


http://jaspar.genereg.net/cgi-bin/jaspar_db.pl?ID=MA0079.2&rm=present&collection=CORE

Remember, TFs typically have a degenerate motif, so the site for one specific gene may or may not be in line with the consensus sequence. If you're worried, check your site against any published binding site data.
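
One quick check is to count mismatches against the consensus on both strands instead of requiring an exact match. The sketch below assumes the consensus quoted earlier in this thread, written with IUPAC codes (K = G/T, R = A/G, Y = C/T), and simply reports the best-scoring window. It is not a substitute for scoring against the JASPAR matrix.

```python
# IUPAC codes needed for the consensus KGGGCGGRRY,
# i.e. 5'-(G/T)GGGCGG(G/A)(G/A)(C/T)-3'.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "K": "GT", "R": "AG", "Y": "CT"}
CONSENSUS = "KGGGCGGRRY"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def best_match(site, consensus=CONSENSUS):
    """Return (mismatches, strand, offset, window) for the best window.

    Every consensus-length window on both strands is compared with the
    consensus; the window with the fewest mismatches wins.
    """
    site = site.upper().replace(" ", "")
    best = None
    for strand, seq in (("+", site), ("-", reverse_complement(site))):
        for offset in range(len(seq) - len(consensus) + 1):
            window = seq[offset:offset + len(consensus)]
            mismatches = sum(base not in IUPAC[code]
                             for base, code in zip(window, consensus))
            candidate = (mismatches, strand, offset, window)
            if best is None or candidate < best:
                best = candidate
    return best


print(best_match("C G G C C C C G C C C T T"))
# (1, '-', 1, 'AGGGCGGGGC')
```

For the site posted above, the best window is on the reverse strand with a single mismatch at the first position (A instead of G/T), and the GGGCGG core is intact. So it reads more like an ordinary degenerate site on the opposite strand than like a mutated one.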

10.3 years ago
pld 5.1k

I did some work a while ago (in bacteria) using LexA, a self-regulating TF, to extract binding motifs with a comparative genomics approach. Basically, if one can reasonably assume that the TF self-regulates, there will be a binding site for that TF upstream of its own gene.

So you gather a number of related species and identify the TF orthologs and then use MEME to mine promoters for a binding motif.

You could presumably do this if there were enough copies of your TF's gene, either in a single species or in related species (for humans you could use primates).

However, that is only necessary if you don't know which genes your TF acts on. If you have a list of genes known to be regulated by your TF, you can run MEME or some other motif discovery tool on the promoters of those genes.


A couple of quick questions: What is comparative genomics, and where can I learn more about it? And what does it mean for a TF to be self-regulating? (I understand this may be a broad question.)


Start here: https://en.wikipedia.org/wiki/Comparative_genomics and watch a few YouTube videos on the topic. Many traditional bioinformatics books are in fact written from a comparative genomics perspective.

I tried to explain self-regulation in my answer to this question. Transcription factors often have a binding site in their own promoter region and can thereby influence their own transcription (most often negatively, as a feedback loop).
