Hi all,
When we talk about differential gene expression analysis using RNA-seq data, are we actually evaluating the expression of "exon" or "CDS" features?
Exons, since that's really what RNA is composed of.
I ask because in the annotation GTF file I see both "exon" and "CDS" features:
chr1 unknown exon 3214482 3216968 …...
chr1 unknown stop_codon 3216022 3216024 …...
chr1 unknown CDS 3216025 3216968 …...
chr1 unknown CDS 3421702 3421901 …...
chr1 unknown exon 3421702 3421901 …...
Can I use GTF.featureType="CDS" to test for differentially expressed genes based on CDS? Is this acceptable? Does this approach have major flaws? (In my understanding, CDS is more meaningful, since differential expression at the CDS level indicates potentially different protein outputs...)
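To make the difference between the two feature types concrete, here is a minimal Python sketch that sums the length of each feature type in the excerpt above. It is only an illustration: the attribute columns are truncated in the post, so the sketch uses just the first five columns, splits on whitespace (a real GTF is tab-separated), and assumes all lines belong to one transcript with non-overlapping features.

```python
# First five columns of the GTF excerpt from the question.
# Everything after the end coordinate is elided in the post, so it is omitted here.
gtf_lines = """chr1 unknown exon 3214482 3216968
chr1 unknown stop_codon 3216022 3216024
chr1 unknown CDS 3216025 3216968
chr1 unknown CDS 3421702 3421901
chr1 unknown exon 3421702 3421901"""


def feature_lengths(text):
    """Sum the length in bases of each feature type (GTF coordinates are 1-based, inclusive)."""
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        feature, start, end = fields[2], int(fields[3]), int(fields[4])
        totals[feature] = totals.get(feature, 0) + (end - start + 1)
    return totals


lengths = feature_lengths(gtf_lines)
# Exonic bases that are not part of any CDS record (UTR plus the stop codon).
non_cds = lengths["exon"] - lengths["CDS"]

print(lengths)
print("exonic bases outside the CDS:", non_cds)
```

In this excerpt the two exons span 2687 bases, of which 1144 are CDS. Reads that align to the remaining 1543 exonic bases would only be assigned to the gene when counting over "exon" features.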
I am not sure "RNA is composed of exons" is true after reading this paper. I don't want to be rude. I just want to know whether my understanding is right.
The biology is too complex...
I know that eukaryotic genomes are pervasively transcribed, that the majority of their bases can be found in primary transcripts, including non-protein-coding transcripts, and that many biological processes utilize and require noncoding RNAs.
But when we talk about measuring gene expression levels, isn't that based on the assumption that more abundant genes/transcripts are more important and that gene expression levels correspond to protein levels? Since comprehensive measurement of protein levels is hard, we measure gene expression at the mRNA level instead, and are selectively blind to the fact that sometimes "mRNA level" ≠ "protein level".
After reading the paper "RNA-seq data analysis at the gene and CDS levels provides a comprehensive view of transcriptome responses induced by 4-hydroxynonenal", I found it reasonable to measure CDS.
Please correct me if my understanding is wrong.
Sorry for the misunderstanding caused by my poor English; I meant no offense.
What I meant by "biology is too complex" is that when we analyze biological problems, we sometimes have to compromise. For example, measuring gene expression levels rests on the assumption that more abundant genes/transcripts are more important and that gene expression levels correspond to protein levels (please correct me if my understanding is wrong), and we are selectively blind to the fact that sometimes "mRNA level" ≠ "protein level".
After reading the paper "RNA-seq data analysis at the gene and CDS levels provides a comprehensive view of transcriptome responses induced by 4-hydroxynonenal", I found it reasonable to measure CDS.
So I ask this question. ^^
You're misunderstanding what that paper did. They didn't do "differential expression based on CDS"; they actually did differential isoform usage after collapsing isoforms with compatible CDSs. That's a completely different thing. "Differential expression" by itself always uses exons. Anything else needs different terms and will use very different tools (namely, for differential isoform usage or differential CDS usage one would use salmon or kallisto rather than something like featureCounts/STAR). The wording used in the paper you referenced is terrible; the referees should have had that fixed.
I don't agree with what you said about differential isoform/CDS usage. Differential isoform expression (DIE) and differential isoform usage (DIU) are related but distinct concepts. DIE assesses the difference in absolute expression at the isoform level. In contrast, DIU assesses the difference in relative expression at the isoform level. This is because a gene may have higher or lower expression overall, and it may also switch the usage of some RNA isoforms. For example, a gene may predominantly use one isoform in one tissue and switch to another isoform in another tissue. Such relative expression of an RNA isoform is referred to as isoform usage. Reference1 Reference2 Reference3
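The DIE/DIU distinction can be illustrated with a small Python sketch. All abundances below are made-up numbers for a hypothetical gene with two isoforms, and the function names are mine. It computes descriptive quantities only and is not a statistical test.

```python
# Hypothetical isoform abundances (for example TPM) for one gene with two isoforms.
scenarios = {
    # Both isoforms double, so their relative proportions stay the same.
    "abundance_change_only": {
        "control": {"iso1": 100.0, "iso2": 100.0},
        "treated": {"iso1": 200.0, "iso2": 200.0},
    },
    # The gene total stays at 200, but the dominant isoform switches.
    "isoform_switch": {
        "control": {"iso1": 150.0, "iso2": 50.0},
        "treated": {"iso1": 50.0, "iso2": 150.0},
    },
}


def usage(abundances):
    """Relative expression: each isoform's fraction of the gene total."""
    total = sum(abundances.values())
    return {iso: value / total for iso, value in abundances.items()}


def summarize(conditions):
    """Per isoform: fold change in abundance (the DIE view) and change in usage (the DIU view)."""
    control, treated = conditions["control"], conditions["treated"]
    usage_control, usage_treated = usage(control), usage(treated)
    return {
        iso: {
            "fold_change": treated[iso] / control[iso],
            "usage_change": usage_treated[iso] - usage_control[iso],
        }
        for iso in control
    }


for name, conditions in scenarios.items():
    print(name, summarize(conditions))
```

In the first scenario each isoform shows a two-fold change in absolute expression while its usage is unchanged. In the second, the gene-level total is identical in both conditions, yet the usage of each isoform shifts by 0.5.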
Exons. Differential expression analysis uses a gene annotation file that contains full-length transcripts, including the 5' and 3' UTRs.
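As a rough illustration of why the UTRs matter for counting, the Python sketch below assigns a few reads to the exon and CDS intervals quoted in the question. The read positions are invented for the example, and the overlap rule (any shared base) is a simplification of what real counting tools do.

```python
# Feature intervals (1-based, inclusive) taken from the GTF excerpt in the question.
EXONS = [(3214482, 3216968), (3421702, 3421901)]
CDS = [(3216025, 3216968), (3421702, 3421901)]

# Hypothetical read alignments as (start, end); these positions are made up.
reads = [
    (3214500, 3214599),  # non-coding part of the first exon
    (3215900, 3215999),  # non-coding part of the first exon
    (3216100, 3216199),  # inside the first CDS segment
    (3421750, 3421849),  # inside the second exon, which is entirely CDS
]


def overlaps(read, feature):
    """True if the two closed intervals share at least one base."""
    return read[0] <= feature[1] and feature[0] <= read[1]


def count_reads(reads, features):
    """Count reads that overlap at least one of the given features."""
    return sum(any(overlaps(read, feature) for feature in features) for read in reads)


exon_count = count_reads(reads, EXONS)
cds_count = count_reads(reads, CDS)
print("reads counted over exons:", exon_count)
print("reads counted over CDS:", cds_count)
```

With these made-up reads, all four are counted over exons but only two over CDS, so a CDS-only count discards whatever signal comes from the untranslated regions.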
Since in the annotation GTF file I see "exon" and "CDS", I thought differential gene expression analysis could be based on CDS, which is more meaningful in my understanding.