Question

Confused about TSS coordinates

1

6.0 years ago

srhic ▴ 70

Hello,

I have a very basic question about TSS coordinates, which I am a little confused about. I am trying to plot the ChIP signal in the regions 1000 bp upstream of the TSS of certain genes. For this I need a BED file of the coordinates of these regions.

To make this BED file I simply subtracted 1000 from the start coordinate of my genes of interest. For example, if the coordinates of my gene were chr1:3,073,253-3,074,322, I plotted the ChIP signal in the region 3,072,253 to 3,073,253. However, someone told me that this is incorrect because I am ignoring the fact that the gene, and so its TSS, can be on the + or - strand. I have now downloaded the coordinates of my genes from Ensembl, including the TSS and strand info, which look like this:

Chromosome  Gene start (bp)  Gene end (bp)  Gene stable ID      Transcription start site (TSS)  Strand
17          32396778         32412926       ENSMUSG00000117872  32396778                        1
17          32396778         32412926       ENSMUSG00000117872  32396825                        1
17          94186            95088          ENSMUSG00000096776  95088                           -1

edit: sorry, I am not sure how to get the formatting right for this.

Now I see that, depending on the strand, the gene start position is not always the TSS, but I am still a little unclear on the concept. If I subtract 1000 bp from the TSS column and plot the signal in that region, would that be the right way to do this?

Thanks

ChIP-Seq • 2.8k views

updated 4.3 years ago by Zhenpeng Yu ▴ 20 • written 6.0 years ago by srhic ▴ 70

score 1 · Answer 1 · 2019-07-07

1

6.0 years ago

WouterDeCoster 48k

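You want the 1000 bp upstream of the TSS, and which direction is upstream depends on the strand. For a gene on the positive strand, subtract 1000 from the TSS coordinate, so the region runs from TSS - 1000 to TSS. For a gene on the negative strand, add 1000 instead, so the region runs from TSS to TSS + 1000.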
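As a rough illustration of the strand rule, here is a minimal Python sketch that turns a TSS and strand into a BED interval. The function name `upstream_bed_interval` is made up for this example. It assumes the TSS is 1-based, as in the Ensembl export, that the output should be 0-based and half-open, as BED expects, and that the TSS base itself is left out of the upstream window. Chromosome names are passed through unchanged, so they may need a `chr` prefix to match your signal files.

```python
def upstream_bed_interval(chrom, tss, strand, length=1000):
    """Return (chrom, start, end, strand_symbol) for the bases upstream of a TSS.

    tss    : 1-based TSS position (Ensembl style)
    strand : 1 or -1, as in the Ensembl export
    Result : 0-based, half-open coordinates (BED style)
    """
    if strand == 1:
        # Upstream lies at smaller coordinates: 1-based bases tss-length .. tss-1
        start = max(0, tss - length - 1)  # clamp at the chromosome start
        end = tss - 1
        symbol = "+"
    elif strand == -1:
        # Upstream lies at larger coordinates: 1-based bases tss+1 .. tss+length
        start = tss
        end = tss + length
        symbol = "-"
    else:
        raise ValueError(f"strand must be 1 or -1, got {strand!r}")
    return chrom, start, end, symbol


# The three (chromosome, TSS, strand) rows from the question
rows = [("17", 32396778, 1), ("17", 32396825, 1), ("17", 95088, -1)]
for chrom, tss, strand in rows:
    c, s, e, sym = upstream_bed_interval(chrom, tss, strand)
    # Six-column BED line: chrom, start, end, name, score, strand
    print(f"{c}\t{s}\t{e}\t.\t0\t{sym}")
```

Keeping the strand in column 6 of the BED file lets downstream plotting tools orient the regions correctly, if they support that.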

score 0 · Answer 2 · 2021-03-17

0

4.3 years ago

Zhenpeng Yu ▴ 20

For genes on the negative strand, column 5 is the TSS, so you can define the promoter region as [column5-1000, column5+1000]. For genes on the positive strand, use [column4-1000, column4+1000].
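A small Python sketch of this symmetric window, in case it helps. It assumes the two columns referred to above are the gene start and gene end, so that the TSS is the start for + strand genes and the end for - strand genes. The helper names `tss_from_gene` and `promoter_window` are made up for the example. Input coordinates are taken as 1-based, and the output as 0-based half-open BED coordinates.

```python
def tss_from_gene(gene_start, gene_end, strand):
    """Pick the TSS from 1-based gene coordinates: start on +, end on -."""
    if strand not in (1, -1):
        raise ValueError(f"strand must be 1 or -1, got {strand!r}")
    return gene_start if strand == 1 else gene_end


def promoter_window(tss, flank=1000):
    """Return a 0-based, half-open (start, end) covering tss-flank .. tss+flank."""
    start = max(0, tss - flank - 1)  # 1-based tss-flank becomes 0-based tss-flank-1
    end = tss + flank                # half-open end includes 1-based tss+flank
    return start, end


# Gene rows from the question: (gene start, gene end, strand)
genes = [(32396778, 32412926, 1), (94186, 95088, -1)]
for gene_start, gene_end, strand in genes:
    tss = tss_from_gene(gene_start, gene_end, strand)
    print(tss, promoter_window(tss))
```

Note that the table in the question lists two TSS values for ENSMUSG00000117872, so the gene boundary gives only one of them. If the Ensembl TSS column is available, it can be passed to `promoter_window` directly.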
