Differential gene expression based on "exon" or "CDS"?
7.1 years ago
SMILE ▴ 190

Hi all,

When we talk about differential gene expression analysis using RNA-seq data, are we actually evaluating the expression of "exon" or "CDS" features?

gene RNA-Seq • 6.8k views
Exons. Differential expression analysis uses a gene annotation file that contains full-length transcripts, including the 5' and 3' UTRs.

In the annotation GTF file I see both "exon" and "CDS" features, so I thought differential gene expression analysis could be based on CDS, which seems more meaningful to me.

7.1 years ago

Exons, since that's really what RNA is composed of.

I ask because in the annotation GTF file I see both "exon" and "CDS" features:

chr1 unknown exon 3214482 3216968 …...

chr1 unknown stop_codon 3216022 3216024 …...

chr1 unknown CDS 3216025 3216968 …...

chr1 unknown CDS 3421702 3421901 …...

chr1 unknown exon 3421702 3421901 …...

Can I use GTF.featureType="CDS" to test for differentially expressed genes based on CDS? Is this acceptable? Does this approach have major flaws? (In my understanding, CDS is more meaningful, since differential expression at the CDS level indicates potentially different protein outputs...)
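To make the practical difference concrete, here is a minimal Python sketch that compares how many bases are available for read assignment when counting over "exon" versus "CDS" features. It uses only the feature types and coordinates from the GTF excerpt above. The assumption that all five records belong to the same gene is mine, since the attribute column is elided in the post, and the helper names are hypothetical.

```python
# Feature types and 1-based inclusive coordinates copied from the GTF excerpt.
# ASSUMPTION: all records belong to one gene (the attribute column is elided).
RECORDS = [
    ("exon", 3214482, 3216968),
    ("stop_codon", 3216022, 3216024),
    ("CDS", 3216025, 3216968),
    ("CDS", 3421702, 3421901),
    ("exon", 3421702, 3421901),
]


def merged_length(intervals):
    """Total bases covered by 1-based inclusive intervals, merging overlaps."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            # Close the previous merged block (if any) and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def countable_bases(records, feature_type):
    """Bases a read could overlap when counting over one feature type."""
    return merged_length([(s, e) for f, s, e in records if f == feature_type])


exon_bases = countable_bases(RECORDS, "exon")
cds_bases = countable_bases(RECORDS, "CDS")
print("exon bases:", exon_bases)
print("CDS bases:", cds_bases)
print("exonic bases outside the CDS:", exon_bases - cds_bases)
```

Under that assumption, a large part of the exonic sequence in this excerpt lies outside the CDS intervals, so reads that fall only in that region would have no CDS feature to be assigned to.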

You should use exons, not CDS. As Devon explained, RNA is composed of exons, and only a subset of these exons is coding.

I am not sure "RNA is composed of exons" is true after reading this paper. I don't want to be rude. I just want to know whether my understanding is right.

Please brush up on your understanding of the central dogma of molecular biology.

The biology is too complex...

I know that eukaryotic genomes are pervasively transcribed, that the majority of their bases can be found in primary transcripts (including non-protein-coding transcripts), and that many biological processes use and require noncoding RNAs.

But when we talk about measuring gene expression levels, isn't that based on the assumption that more abundant genes/transcripts are more important and that gene expression levels correspond to protein levels? Since comprehensive protein-level measurement is hard, we measure gene expression based on mRNA instead, and we are selectively blind to the fact that sometimes "mRNA level" ≠ "protein level".

After reading the paper "RNA-seq data analysis at the gene and CDS levels provides a comprehensive view of transcriptome responses induced by 4-hydroxynonenal", I found it reasonable to measure CDS.

Please correct me if my understanding is wrong.

If you want to work with bioinformatics then the phrase "the biology is too complex" needs to not be in your vocabulary.

Sorry about the misunderstanding caused by my poor English; I meant no offense.

What I mean when I say biology is too complex is that when we analyze biological problems, we sometimes have to make compromises. For example, when we talk about measuring gene expression levels, it is based on the assumption that more abundant genes/transcripts are more important and that gene expression levels correspond to protein levels (please correct me if my understanding is wrong), and we are selectively blind to the fact that sometimes "mRNA level" ≠ "protein level".

After reading the paper "RNA-seq data analysis at the gene and CDS levels provides a comprehensive view of transcriptome responses induced by 4-hydroxynonenal", I found it reasonable to measure CDS.

That is why I asked this question. ^^

You're misunderstanding what that paper did. They didn't do "differential expression based on CDS"; they actually did differential isoform usage after collapsing isoforms with compatible CDSs. That's a completely different thing. "Differential expression" by itself always uses exons. Anything else needs different terms and will use very different tools (namely, for differential isoform usage or differential CDS usage one would use salmon or kallisto rather than something like featureCounts/STAR). The wording used in the paper you referenced is terrible; the referees should have had that fixed.

I don't agree with what you said about differential isoform/CDS usage. Differential isoform expression (DIE) and differential isoform usage (DIU) are related but distinct concepts. DIE assesses differences in absolute expression at the isoform level. In contrast, DIU assesses differences in relative expression at the isoform level. The distinction matters because a gene may have higher or lower expression overall, and it may also switch which RNA isoforms it uses. For example, a gene may predominantly use one isoform in one tissue and switch to another isoform in another tissue. This relative expression of an RNA isoform is referred to as isoform usage. Reference1 Reference2 Reference3
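A small numeric sketch in Python may help separate the two concepts. The abundance values and tissue names below are hypothetical, made up purely for illustration, and `isoform_usage` is a helper I am introducing here rather than a function from any package. Real tools also model replicates and variance, which this toy example ignores.

```python
def isoform_usage(abundances):
    """Convert absolute isoform abundances for one gene into usage fractions."""
    total = sum(abundances.values())
    if total == 0:
        # Gene not expressed: usage is undefined, report zeros for simplicity.
        return {name: 0.0 for name in abundances}
    return {name: value / total for name, value in abundances.items()}


# HYPOTHETICAL abundances (for example TPM) for one gene with two isoforms.
tissue_a = {"isoform_1": 80.0, "isoform_2": 20.0}
tissue_b = {"isoform_1": 160.0, "isoform_2": 40.0}  # gene doubled, same mix
tissue_c = {"isoform_1": 20.0, "isoform_2": 80.0}   # same gene total, switched mix

for name, tissue in [("A", tissue_a), ("B", tissue_b), ("C", tissue_c)]:
    print(name, "gene total:", sum(tissue.values()), "usage:", isoform_usage(tissue))
```

Comparing A with B, every isoform's absolute abundance doubles while the usage fractions stay the same, which is a change in isoform expression without a change in isoform usage. Comparing A with C, the gene-level total is unchanged but the dominant isoform switches, so the usage fractions change even though a gene-level test would see no difference.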
