How to get the start and end positions of all promoters in the human genome?
9.7 years ago

Hello,

I'm wondering how to get the start and the end positions of all promoters in all human chromosomes?

Thanks


What is your definition of a promoter? A DNA motif, specific chromatin states, or, e.g., a region 1000 bp upstream of the TSS?


Why does a promoter have more than one definition?


You are right: per se, it's the place where expression is regulated, and it is mostly proximal to the transcription start site (TSS). But it can be gene- and tissue-dependent, and more complex. So one possibility would be to take the region from 1000 bp upstream of the gene start to TSS - 1. Or you can go further and use the ENCODE data: https://www.encodeproject.org/. They tried to find promoters using chromatin states, modified histones, or proteins that interact with DNA and are specific to promoters.
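A minimal sketch of that fixed-window approach, assuming 1-based inclusive coordinates and that you already know each gene's TSS and strand. The function name `upstream_window` and its parameters are made up for illustration. Note that for a gene on the reverse strand, "upstream" means higher coordinates, and the TSS is the gene's end coordinate rather than its start. The result is only a candidate region, not an annotated promoter.

```python
def upstream_window(tss, strand, size=1000):
    """Return (start, end) of the `size` bp immediately upstream of `tss`.

    Coordinates are 1-based and inclusive. `strand` is "+" or "-".
    The TSS base itself is not part of the window.
    """
    if strand == "+":
        # Upstream of a forward-strand gene lies at lower coordinates.
        start, end = tss - size, tss - 1
    elif strand == "-":
        # Upstream of a reverse-strand gene lies at higher coordinates.
        start, end = tss + 1, tss + size
    else:
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")
    # Clamp at the chromosome start; the chromosome end is not checked here.
    return max(1, start), end


print(upstream_window(1345000, "+"))  # (1344000, 1344999)
print(upstream_window(1345000, "-"))  # (1345001, 1346000)
```

The window size of 1000 bp is a convention, not a biological boundary, so it is worth keeping as a parameter.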


If, for example, the start of a specific gene is at 1345000, can we say that the promoter for this gene starts at 1344000 and ends at 1345000?


No, you can only say that the promoter is probably within that segment.


Many thanks, Jimbou, very useful information.

9.7 years ago
Emily 24k

At Ensembl we've annotated promoters as part of our regulatory build (shiny new paper on it). These are based on segmentation data from ENCODE and RoadMap Epigenomics, finding consensus regions of promoter activity between cell types. This will be further refined as we add more cell types to the analysis (e.g. more from ENCODE and RoadMap and add in Blueprint).

You can access these annotations through the Ensembl Browser (here is one at the 5' end of a gene, where we expect it to be), BioMart (e.g. this query will get you all the predicted promoters on chromosome 21), the Ensembl APIs and the Ensembl FTP site.


I do recommend that people read the paper and/or the documentation on how these were predicted. We don't have assayed evidence showing that these promoters will lead to gene expression when stuck on the end of a gene. What we have are ChIP-seq, DNase and similar data that are indicative of promoter activity, i.e. it's very likely that if you stick one on the end of a gene with the right TFs, the gene will be expressed, but no one has tested whether that's true.


Many thanks.


Wow, this looks like a great resource!


BioMart gives me the start (bp), end (bp) and regulatory stable ID (ENSR#). How should I find the genes these promoters belong to, and their Ensembl gene IDs?


You could try searching the gene database for genes upstream or downstream of the promoter. Ensembl make no inferences about which promoters belong to which genes, only their position.
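One way to sketch that search, assuming you have exported promoter intervals and gene TSS positions into plain Python tuples. The helper names `build_tss_index` and `nearest_gene` and the identifiers `GENE_A` to `GENE_C` are hypothetical. Picking the nearest TSS is a heuristic for proposing a candidate gene; it is not an assignment that Ensembl makes.

```python
from bisect import bisect_left
from collections import defaultdict


def build_tss_index(genes):
    """genes: iterable of (gene_id, chrom, tss).

    Returns {chrom: list of (tss, gene_id) sorted by tss}.
    """
    index = defaultdict(list)
    for gene_id, chrom, tss in genes:
        index[chrom].append((tss, gene_id))
    for entries in index.values():
        entries.sort()
    return index


def nearest_gene(index, chrom, start, end):
    """Return (gene_id, distance_bp) for the TSS closest to [start, end].

    Distance is 0 when the TSS falls inside the interval.
    Returns None when the chromosome has no genes in the index.
    """
    entries = index.get(chrom)
    if not entries:
        return None

    def distance(tss):
        if start <= tss <= end:
            return 0
        return min(abs(tss - start), abs(tss - end))

    # Only the TSSs on either side of the interval midpoint can be nearest.
    midpoint = (start + end) // 2
    pos = bisect_left(entries, (midpoint,))
    candidates = entries[max(0, pos - 1):pos + 1]
    tss, gene_id = min(candidates, key=lambda entry: distance(entry[0]))
    return gene_id, distance(tss)


# Hypothetical gene IDs and coordinates, for illustration only.
genes = [("GENE_A", "21", 10_000), ("GENE_B", "21", 50_000), ("GENE_C", "21", 90_000)]
index = build_tss_index(genes)
print(nearest_gene(index, "21", 48_500, 49_800))  # ('GENE_B', 200)
print(nearest_gene(index, "21", 89_000, 91_000))  # ('GENE_C', 0)
```

In practice you would probably also want a maximum distance cut-off and a check that the promoter lies on the 5' side of the gene, given its strand.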


Thanks, I will try it.


Hi Emily, when choosing Attributes for BioMart output, what is the difference between "Start" and "End" vs. "Bound start" and "Bound end"? In some cases they seem to be the same, but in many they are not.


It's a bit of a throwback to how we used to calculate regulatory features, but it's still used on promoters, which have a core promoter and a flank.


Thank you, I appreciate the quick reply!
