Hi,
I have a list of genes and I want to know the location of the promoters of these genes ('start' and 'end' position). Are there any solutions for that?
Best
You don't mention the organism, but if it is a model organism, try BioMart.
Once you reach the sequence retrieval options, you can select almost any feature of that genome, including the putative promoters.
There are two ways to get a promoter. The first is to define it as an arbitrary region around the TSS, taken from standard gene annotations (e.g. -1000/+500, or whatever you prefer: -3000/+1000, etc.). This can be done as described by Antonio, or in R with the Bioconductor GenomicFeatures function:
getPromoterSeq(query, subject, upstream=2000, downstream=200, ...)
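To make the "arbitrary region" definition concrete, here is a minimal Python sketch of the coordinate arithmetic. The function name `promoter_interval` and the convention used (1-based inclusive coordinates, `upstream` bases before the TSS, `downstream` bases starting at the TSS itself) are assumptions for illustration, not part of any package mentioned in this thread.

```python
def promoter_interval(tss, strand, upstream=2000, downstream=200):
    """Return (start, end) of a promoter window around a TSS.

    Coordinates are 1-based and inclusive (an assumed convention).
    The window covers `upstream` bases before the TSS and `downstream`
    bases starting at the TSS, so its width is upstream + downstream.
    """
    if strand == "+":
        start = tss - upstream
        end = tss + downstream - 1
    elif strand == "-":
        # On the minus strand, "upstream" means higher coordinates.
        start = tss - downstream + 1
        end = tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clip at the chromosome start; clipping at the chromosome end
    # would need chromosome lengths, which are not handled here.
    return max(1, start), end


print(promoter_interval(10000, "+"))
print(promoter_interval(10000, "-"))
```

The key point is that the window must be flipped for genes on the minus strand; otherwise you end up with the region downstream of the gene's 3' end rather than its promoter.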
The quickest route is probably BioMart; it has an easy filtering option where you can enter your list of genes.
You can also retrieve promoters from other data, as Giovanni suggested. With CAGE you basically measure the 5' capped sites, i.e. the TSSs. From those you can also derive your promoter regions, although as far as I understand you have to set the cutoffs yourself.
The second option is to use a promoter database, e.g. EPD.
There the promoter regions are defined by a set of criteria: motifs, CpG islands, epigenetic marks. This is probably the better way: http://epd.vital-it.ch/human/human_database.php
BioMart is a good resource. An alternative is the FANTOM5 data, which provide the positions of the promoters of each gene in each tissue, derived with CAGE technology.
Another possibility is to process a GFF or GTF annotation file with the help of R, in case neither BioMart nor FANTOM5 contains your data of interest and a GFF or GTF file is available.
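The answer above suggests R for this; purely as an illustration of the idea, here is a hedged Python sketch that reads GTF-style lines and reports a promoter window per gene. It assumes that the file has `gene` feature rows, that the attribute column contains `gene_id "..."`, and that the promoter is an arbitrary window of 2000 bp upstream and 200 bp downstream of the gene's 5' end. The function name and the example records are made up.

```python
import io
import re


def gene_promoters_from_gtf(handle, upstream=2000, downstream=200):
    """Map gene_id -> (chrom, start, end, strand) of a promoter window.

    Assumes GTF conventions: tab-separated, 9 columns, 1-based inclusive
    coordinates, and a `gene_id "..."` entry in the attribute column.
    """
    promoters = {}
    for line in handle:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        chrom, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
        match = re.search(r'gene_id "([^"]+)"', fields[8])
        if match is None or strand not in "+-":
            continue
        if strand == "+":
            tss = start  # 5' end of a plus-strand gene
            window = (tss - upstream, tss + downstream - 1)
        else:
            tss = end  # 5' end of a minus-strand gene
            window = (tss - downstream + 1, tss + upstream)
        promoters[match.group(1)] = (chrom, max(1, window[0]), window[1], strand)
    return promoters


# Made-up records, for demonstration only.
example = io.StringIO(
    'chr1\tdemo\tgene\t10000\t12000\t.\t+\t.\tgene_id "G1";\n'
    'chr1\tdemo\tgene\t20000\t25000\t.\t-\t.\tgene_id "G2";\n'
)
wanted = {"G1", "G2"}  # your list of genes
for gene, region in gene_promoters_from_gtf(example).items():
    if gene in wanted:
        print(gene, *region)
```

With a real annotation you would pass an open file handle instead of the `io.StringIO` example, and you may need to adapt the attribute parsing for GFF3, which uses `ID=...` style attributes.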
The Eukaryotic Promoter Database (EPD) provides experimentally validated promoters.
Many thanks Antonio,