How To Get Promoter Sequences For Non-Model Organisms?
13.7 years ago
Dejian ★ 1.3k

Model species are well researched and many tools are available for them. For example, there are many ways to get promoter sequences for human. But what about a non-model organism? This problem is becoming urgent now that many new genomes are available.

ADDED: Maybe I should clarify the situation. The genome was newly sequenced by our group, so the sequences are available to us. But they are not yet publicly available, so BioMart cannot help. We are annotating the genome. The CDS can be predicted. However, I have no idea how to determine the promoter region for each gene.

promoter prediction non • 5.2k views
Your question is pretty vague. What exactly do you mean by the promoter sequence? And which genomes are you referring to? Have genes been annotated on these genomes? Are these genomes already available in a genome browser (e.g. Ensembl, Ensembl Genomes or the UCSC Genome Browser)?

Have you tried the fantastic Ensembl APIs (Perl / biomart-perl)?

Whether it is a 'model' organism is irrelevant; the only things that matter are whether the genome is sequenced and whether it is annotated. If the sequence is available, it's available; if not, not ;)

Is the organism a bacterium, archaeon or eukaryote? That will affect the choice of software and the type of promoter region.

13.7 years ago

BioMart is a very powerful tool for many extraction tasks. If your genome of interest, despite not being a model organism, is included in Ensembl, Ensembl Bacteria, Ensembl Metazoa, Ensembl Protists, Ensembl Plants, or Ensembl Fungi, simply go to the corresponding BioMart. Through the web interface you can easily retrieve a FASTA file with the 5' flanking sequence of every annotated gene.

If you have an annotated genome that is not yet public, you will need to do some basic scripting to retrieve what you need. It should not be hard, but since every project organizes its data differently, you cannot rely on there being an existing tool that you can simply run to do the extraction.
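As a rough illustration of the kind of scripting meant here, a minimal Python sketch follows. It assumes the genome is held as a dict of contig name to sequence and that gene coordinates are 1-based and inclusive, as in GFF; the function names and the default flank length of 500 bp are hypothetical choices, not part of any existing tool. Note that if only CDS coordinates are known, the extracted region starts at the start codon and therefore includes any 5' UTR.

```python
# Sketch only: names, coordinate convention and flank length are assumptions.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict {sequence_id: sequence}."""
    genome, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                genome[name] = "".join(chunks)
            # Use the first word of the header as the sequence id.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        genome[name] = "".join(chunks)
    return genome


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(genome, seqid, start, end, strand, length=500):
    """Return up to `length` bases immediately 5' of a gene.

    `start` and `end` are 1-based inclusive forward-strand coordinates.
    The result is given in the gene's own orientation and is clipped
    at the contig boundaries.
    """
    contig = genome[seqid]
    if strand == "+":
        lo = max(0, start - 1 - length)
        return contig[lo:start - 1]
    if strand == "-":
        hi = min(len(contig), end + length)
        return reverse_complement(contig[end:hi])
    raise ValueError("strand must be '+' or '-'")
```

For example, `upstream_region(genome, "contig1", 1200, 2100, "+")` would return the 500 bases preceding position 1200 on a contig named `contig1` (a made-up name). Genes close to a contig edge yield shorter sequences, which is worth checking before feeding the output into a motif finder.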

And if the genome has not yet been annotated, you will have to do gene prediction first, which is a task in its own right. Without knowing where the genes are, you cannot extract their putative promoter regions.

13.7 years ago
Michael 55k

I assume now your organism is a bacterium:

Bacterial promoters are normally located a short distance upstream of the CDS. They are characterized by sequence motifs that allow different sigma factors to bind the DNA and initiate transcription. Different promoter motifs are specific to different families of sigma factors (sigma70 and sigma54 are the most common ones). Promoters specific to other sigma factors might be more variable and harder to detect, and your organism might also contain new sigma factors.
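To make the idea of a sigma-factor motif concrete, here is a naive Python sketch that scans a sequence for the E. coli sigma70 consensus hexamers (-35 TTGACA and -10 TATAAT). The spacer range of 15 to 19 bp, the mismatch threshold, and the function names are assumptions chosen for illustration, and a newly sequenced organism may well deviate from the E. coli consensus. This is not a substitute for the dedicated prediction tools.

```python
# Sketch only: E. coli sigma70 consensus; thresholds are illustrative assumptions.
MINUS35 = "TTGACA"
MINUS10 = "TATAAT"


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def scan_sigma70(seq, max_mismatch=2, spacers=range(15, 20)):
    """Return a list of (pos35, pos10, spacer, mismatches) candidate hits.

    Positions are 0-based offsets of each hexamer in `seq`.
    `max_mismatch` applies to each hexamer separately;
    `mismatches` is the total over both hexamers.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6 + 1):
        m35 = hamming(seq[i:i + 6], MINUS35)
        if m35 > max_mismatch:
            continue
        for sp in spacers:
            j = i + 6 + sp  # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                break
            m10 = hamming(seq[j:j + 6], MINUS10)
            if m10 <= max_mismatch:
                hits.append((i, j, sp, m35 + m10))
    return hits
```

Run on the upstream region of each gene, a scan like this will produce many false positives, because short degenerate motifs occur by chance; treat the hits as candidates to be ranked or filtered, not as confirmed promoters.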

To give an overview I found this list of tools for bacterial promoter prediction: http://molbiol-tools.ca/Promoters.htm
