What's the Difference Between CDS and ORF?
12.5 years ago
shiy05 ▴ 370

What's the difference between the terms CDS and ORF?

cds orf • 188k views

Is this the first citation for a biostars post?

12.5 years ago
Gjain 5.8k

Hi,

In more detail:

ORFs:

The region of a nucleotide sequence from the start codon (ATG) to the stop codon is called the open reading frame.

Gene finding, especially in prokaryotes, starts with a search for open reading frames (ORFs). An ORF is a stretch of DNA that starts with a start codon (usually, but not always, ATG) and ends with one of the three termination codons (TAA, TAG, TGA). Depending on the starting point, there are six possible ways (three on the forward strand and three on the complementary strand) of translating any nucleotide sequence into an amino acid sequence according to the genetic code. These are called reading frames.
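To make the six-frame idea concrete, here is a minimal sketch of an ATG-to-stop scan over both strands. The function names (`reverse_complement`, `find_orfs`) and the `min_codons` cutoff are made up for illustration, and the scan assumes plain uppercase A/C/G/T input and the standard stop codons; real gene finders do considerably more than this.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """Yield (strand, frame, start, end, orf) for ATG-to-stop ORFs in all six frames.

    start and end are 0-based offsets on the strand being scanned
    (end is exclusive and the stop codon is included in the ORF).
    min_codons is an illustrative length cutoff, not a standard value.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG seen since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        yield strand, frame, start, i + 3, s[start:i + 3]
                    start = None


for hit in find_orfs("CCATGAAATAGCC"):
    print(hit)
```

On this toy sequence the only hit is on the forward strand in the third frame; the reverse strand contains an ATG but no in-frame stop after it, so nothing is reported there.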

Eukaryotic gene finding is an altogether different task, because eukaryotic genes are not continuous: they are interrupted by intervening noncoding sequences called introns. Moreover, the organization of genetic information differs between eukaryotes and prokaryotes.

CDS:

The coding sequence (CDS) is the actual region of DNA that is translated to form proteins. While an ORF may contain introns as well, the CDS refers to those nucleotides (concatenated exons) that can be divided into codons which are actually translated into amino acids by the ribosomal translation machinery.
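As a rough illustration of "concatenated exons", the sketch below joins exon intervals from a toy genomic sequence and translates the result with the standard genetic code. The sequence, the exon coordinates (0-based, end-exclusive) and the helper names `splice` and `translate` are all invented for this example; they are not taken from any real annotation.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def splice(genomic, exons):
    """Concatenate the exon intervals (0-based, end-exclusive) into a CDS."""
    return "".join(genomic[start:end] for start, end in exons)


def translate(cds):
    """Translate a CDS codon by codon, ignoring any trailing partial codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


# Toy gene: exon 1, a 12-nt intron (GT...AG), exon 2.
genomic = "ATGGCC" + "GTAAGTTTTCAG" + "AAATGA"
exons = [(0, 6), (18, 24)]

cds = splice(genomic, exons)
print(cds)             # ATGGCCAAATGA
print(translate(cds))  # MAK*
```

The CDS here is 12 nt long even though the gene spans 24 nt, because the intron never reaches the ribosome.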

Mainly: CDS means only that the sequence is known to be transcribed and, therefore, is coding for something; neither the gene nor the protein has to be known. Any full mRNA sequence (obtained from cDNA sequencing) will contain a full coding sequence. An ORF is usually predicted from the DNA sequence and is not proven to be transcribed.

Sources:


I like your graphics;)


Thanks, I thought some visuals would be nice.


The image is not accessible anymore. Any chance of uploading another version to another hosting service, perhaps? This is quite a popular post.


Thanks for bringing this up. I have updated the image. This should be permanent.


Thanks, this post is one of the most accessed ones on Biostar, most likely thanks to that image.


You're welcome... I am happy to contribute.


There is a lot of evidence of non-canonical translation that begins at non-AUG sense codons. For example, translation may begin at CUG, GUG or ACG (see http://www.sciencedirect.com/science/article/pii/0378111990900856). It is therefore more meaningful to define ORFs as stop to stop (rather than start to stop).
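A small sketch of what a stop-to-stop definition looks like in practice: each forward frame is cut at its stop codons and the stop-free stretches in between are reported. The function name `stop_to_stop_orfs` and the `min_codons` cutoff are hypothetical, only the forward strand is scanned, and stretches that touch either end of the sequence are reported as well, even though they are strictly open-ended rather than bounded by two stops.

```python
STOPS = {"TAA", "TAG", "TGA"}


def stop_to_stop_orfs(seq, min_codons=1):
    """Return (frame, start, end) for stop-free codon runs on the forward strand.

    Coordinates are 0-based and end-exclusive; the flanking stop codons
    are not part of the reported interval.
    """
    seq = seq.upper()
    runs = []
    for frame in range(3):
        run_start = frame
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOPS:
                if (i - run_start) // 3 >= min_codons:
                    runs.append((frame, run_start, i))
                run_start = i + 3
        # Stretch after the last stop (or the whole frame if it has no stop).
        end = frame + ((len(seq) - frame) // 3) * 3
        if (end - run_start) // 3 >= min_codons:
            runs.append((frame, run_start, end))
    return runs


seq = "TAACTGGCATAG"
for frame, start, end in stop_to_stop_orfs(seq):
    print(frame, start, end, seq[start:end])
```

In this toy sequence the frame-0 stretch between TAA and TAG is `CTGGCA`, which begins with CUG/CTG rather than ATG, so a strict ATG-to-stop scan would not report it at all.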

12.5 years ago
Dave Lunt ★ 2.0k

ORF (Open Reading Frame) is best seen as a hypothesis of a protein coding region. It is the stretch of DNA between a start codon and the next stop codon. It is not a hypothesis of the whole protein coding region in eukaryotes (due to introns). CDS should be the whole coding region.

Both of those start/stop 'codons' could simply occur at random in an intergenic region that does not actually code for any protein, so not every ORF means a protein. An ORF will be found between the actual start codon of a protein-coding gene and the next stop codon. It is quite possible that this stop codon will be found in an intron, in which case the ORF includes an exon and part of an intron. Since introns are mostly just random sequence, a stop codon could occur by chance. If the intron by chance does not contain a stop 'codon' (i.e. the 3 nucleotides TAA/TAG/TGA in the same reading frame as the exon), then the ORF will continue until it meets a stop codon: either a random one in the next intron, or a genuine stop at the end of the gene.

If the intron without a stop is not a multiple of 3 nucleotides, then it will introduce a frameshift, and the next stop could easily occur within the next exon. If it is a multiple of 3 it will introduce false amino acids into the ORF as it continues through the intron and into the exon. These sorts of errors are not uncommon in gene annotation, since intron detection is complex, and if it 'reads through' the intron might not be annotated until cDNA sequences are compared to the genome sequence.
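The two read-through cases above can be reproduced with a toy gene. This is only a sketch: the exon and intron sequences are invented so that neither intron contains an in-frame stop, and `translate_to_stop` is a hypothetical helper that reads codons from position 0 until the first stop, the way a naive ORF scan would.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_to_stop(seq):
    """Translate from position 0 until the first stop codon (inclusive)."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


exon1 = "ATGGCC"
exon2 = "AATAGGCTGTGA"
intron_in_frame = "GTACGTCCGCAG"   # 12 nt: a multiple of 3, no in-frame stop
intron_shifting = "GTACGTCCGCCAG"  # 13 nt: not a multiple of 3

# Correctly spliced CDS: the real protein.
print(translate_to_stop(exon1 + exon2))                    # MANRL*
# Intron read through in frame: false amino acids are inserted.
print(translate_to_stop(exon1 + intron_in_frame + exon2))  # MAVRPQNRL*
# Intron read through with a frameshift: exon 2 is misread and stops early.
print(translate_to_stop(exon1 + intron_shifting + exon2))  # MAVRPPE*
```

In the second case the true protein is still recognisable around the inserted residues, while in the third case everything downstream of the intron is lost, which is why comparing against cDNA is what usually exposes this kind of annotation error.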

If you want to see a demonstration of these ideas, try getting a sequence from GenBank for a gene that contains a 5'-UTR leader sequence, exons, introns and a 3'-UTR. The CDS will be annotated as such and will consist only of exonic regions. Take this gene sequence and run it through NCBI ORF-Finder, which will outline all the potential ORFs. Some of these, but not all, will be the actual coding parts.

12.5 years ago
Leszek 4.2k

CDS - coding DNA sequence -> only the sequence that is translated into protein
ORF - open reading frame -> entire gene sequence: 5'-UTR + transcript (all exons + introns) + 3'-UTR


CDS is right, but ORF is wrong - Gjain's definition below is correct: an ORF is just a nucleotide sequence from a start to a stop codon.


I like the brevity of your answer.


An ORF is the part of the mRNA sequence, starting at an initiation codon (usually AUG), that terminates either at a stop codon (TAA, TAG or TGA for the standard genetic code) or at the end of the sequence if no stop codon is found in the same phase; the latter case means that the mRNA sequence is incomplete. Usually, the AUG codon is embedded in a longer, less well-defined sequence (for example, the Kozak sequence in vertebrates).

12.5 years ago
Christian ★ 3.1k

I would define an open reading frame (ORF) as any stretch of nucleotide sequence from a start codon to a stop codon (coding or not coding for protein), whereas a coding sequence (CDS) is a nucleotide sequence that is believed to code for protein. A CDS can correspond to an individual exon of a protein-coding gene or represent the complete (spliced) sequence of a protein-coding transcript.

4.7 years ago

Multiple genes can be encoded in a single reading frame in prokaryotes

Therefore, besides intron removal, which was mentioned in another answer, this is another important difference between what actually gets transcribed (ORF) and translated (CDS), and it further motivates their distinction.

Such open reading frames are called "multicistronic" and are described, for example, in this article (https://blog.addgene.org/plasmids-101-multicistronic-vectors), which mentions two mechanisms by which this can work:

Viruses (notably positive-sense RNA ones) also have mechanisms that allow a single mRNA to be translated into multiple proteins; this is mentioned, for example, in this presentation about the COVID-19 virus.

6.0 years ago
jazzyl5660 ▴ 10

An easy example from Wikipedia for understanding the difference between ORF and CDS: a sample sequence showing three different possible reading frames. Start codons are highlighted in purple, and stop codons are highlighted in red.


3.6 years ago
kajumi • 0

Can a CDS contain sequences that aren't exons? I'm asking because I found a CDS that is longer than the joined exons.
