RNA-Seq: Comparing Expression between Cancer Grades
14 months ago
foejvs546 ▴ 20

Hi everyone,

I am doing a project where I am comparing gene expression between cancer grades.

What type of count data should I use? I have used VST in this plot. However, would FPKM be better since it would give more of an idea about absolute expression? But would the lack of variance stabilisation be a problem?

I'd love to hear your thoughts! Thank you.

factors rna-seq grades transcriptomics • 1.2k views

"would FPKM be better since it would give more of an idea about absolute expression"

Does it? Can you expand more on that statement please?


Hi Ram! Sorry, I was reading up on this before posting and thought I'd read that somewhere, but I can't find the source now. I can edit the question if that is wrong?


I don't know if it's wrong, but it sounds ... off. FPKM is an outdated metric, and given that RNA-seq is an inherently relative exercise, there is no "absolute expression" as such. If there are no batch effects, using a quantile-normalized metric or DESeq2's normalized counts is good. In any case, I think VST counts are better than FPKM, but wait for others to respond.
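To make "normalized counts" a bit more concrete, here is a minimal Python sketch of the median-of-ratios idea behind DESeq2's size factors. It is a simplified illustration under my own assumptions, not DESeq2's actual code: the function names `size_factors` and `normalize` and the toy matrix are hypothetical, and in practice you would let DESeq2 do this in R.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors (simplified sketch).

    counts: list of genes, each a list of raw counts per sample.
    """
    n_samples = len(counts[0])
    # Keep only genes with non-zero counts in every sample,
    # so that the geometric mean across samples is defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has non-zero counts in all samples")
    # Log of the per-gene geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Ratio of each gene's count to its geometric mean, on the log scale.
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

def normalize(counts):
    """Divide each sample's raw counts by that sample's size factor."""
    sf = size_factors(counts)
    return [[c / s for c, s in zip(row, sf)] for row in counts]

# Toy data (made up): sample 2 is sequenced at twice the depth of sample 1.
toy_counts = [[10, 20], [100, 200], [0, 5], [50, 100]]
toy_factors = size_factors(toy_counts)
toy_normalized = normalize(toy_counts)
```

After dividing by the size factors, genes that differ only because of sequencing depth end up with matching values across samples, which is what makes between-sample comparison reasonable.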


I agree with Ram. DESeq2 normalizes for between-sample comparisons, and its vst function performs variance stabilization.

If you're calculating expression relative to BPH, I'd just report the log fold change and its standard error estimate.


Thanks for your help Ram and dsull.

14 months ago
ATpoint 86k

For differential analysis you would use raw counts for tools like edgeR, limma-voom or DESeq2. With limma-trend you could use logCPMs, calculated with edgeR.

For visualization, the suggested vst works well, though if you want to show values that are corrected for gene length, the FPKM values that DESeq2 or edgeR can return are suitable as well. RPKM/FPKM per se is not a bad metric, as long as the normalization accounts for both depth and library composition and hence allows sample-to-sample comparison. Both edgeR and DESeq2 do that, so it's fine as well. FPKM has the intuitive advantage that a gene with no counts has a value of zero, which is not the case with vst: there, non-detected genes have values > 0. But that's a minor thing, so the choice is yours.
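As a rough illustration of the FPKM arithmetic (counts scaled by gene length in kilobases and library size in millions), here is a small Python sketch. The function `fpkm`, its `lib_sizes` argument, and the toy numbers are hypothetical and only meant to show the formula; to account for library composition you would pass effective (normalized) library sizes instead of raw column sums, which is what the edgeR and DESeq2 functions take care of for you.

```python
def fpkm(counts, gene_lengths_bp, lib_sizes=None):
    """FPKM = count * 1e9 / (gene length in bp * library size).

    counts: list of genes, each a list of raw counts per sample.
    gene_lengths_bp: one length in base pairs per gene.
    lib_sizes: optional per-sample library sizes; defaults to the raw
               column sums (an assumption of this sketch, which ignores
               library composition).
    """
    n_samples = len(counts[0])
    if lib_sizes is None:
        lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [
        [c * 1e9 / (length * lib) for c, lib in zip(row, lib_sizes)]
        for row, length in zip(counts, gene_lengths_bp)
    ]

# Toy data (made up): the second gene is not detected in either sample.
example_counts = [[500, 1000], [0, 0], [1500, 3000]]
example_lengths = [1000, 2000, 3000]
example_fpkm = fpkm(example_counts, example_lengths)
```

In this toy case the undetected gene stays at exactly zero, and the two samples agree once depth is accounted for, which is the intuitive property mentioned above.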


Thank you ATpoint, very insightful!
