Question

TPM or FPKM as input values for PCA and WGCNA?

2

9.2 years ago

Marion Neely ▴ 20

Would you suggest using TPM or FPKM values for PCA and WGCNA?

Thanks,
Marion

TPM RNA-Seq PCA WGCNA FPKM • 8.2k views

updated 3.5 years ago by Ram 45k • written 9.2 years ago by Marion Neely ▴ 20

0

I don't know about WGCNA, but PCA assumes normality, so you'll have to (at least) log-transform the values, whether you choose TPM or FPKM.

written 9.2 years ago by Carlo Yague 9.0k

0

"PCA assumes normality"

Do you have a reference for this? I don't think PCA needs any such assumption. If you have variables measured on different scales, like metres and kilograms, then it's advisable to scale and centre them to remove the dependency on the units of measure, but this is not the case for gene expression.

updated 5.3 years ago by Ram 45k • written 9.2 years ago by dariober 15k

0

OK, you are right, this is not really an assumption. It is more a piece of advice for getting meaningful results: gene expression has a heavily skewed distribution and PCA is quite sensitive to outliers, which is why I usually log-transform expression data. For reference: http://www.bioconductor.org/help/workflows/rnaseqGene/#the-rlog-transformation
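To illustrate the point about skew, here is a minimal sketch in plain Python. The gene names, the expression values, and the pseudocount of 1 are all made up for illustration; this is not the rlog transformation from the linked workflow, just a simple log2(x + 1). It compares how much of the total variance each gene contributes before and after the transform, since PCA picks its directions by variance.

```python
import math
from statistics import pvariance

# Hypothetical TPM-like values: one entry per gene, four samples each.
# geneA and geneB differ between the first two and last two samples;
# geneC is very highly expressed but only fluctuates by a modest fold change.
expr = {
    "geneA": [5.0, 6.0, 20.0, 24.0],
    "geneB": [40.0, 44.0, 10.0, 12.0],
    "geneC": [10000.0, 14000.0, 12000.0, 9000.0],
}

def log_transform(values, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount keeps zeros finite."""
    return [math.log2(v + pseudocount) for v in values]

def variance_share(table):
    """Fraction of the total variance contributed by each gene."""
    variances = {gene: pvariance(vals) for gene, vals in table.items()}
    total = sum(variances.values())
    return {gene: var / total for gene, var in variances.items()}

raw_share = variance_share(expr)
log_share = variance_share({g: log_transform(v) for g, v in expr.items()})

for gene in expr:
    print(f"{gene}: raw {raw_share[gene]:.4f}, log2 {log_share[gene]:.4f}")
```

On the raw scale the single highly expressed gene accounts for almost all of the variance, so it would dominate the first principal component; after the log transform the two genes that actually separate the sample groups carry most of it.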

written 9.2 years ago by Carlo Yague 9.0k

Ram · Answer 1 · 2016-01-16

2

9.2 years ago

Rob 7.1k

Hi Marion,

For this purpose, I'd imagine you would not likely see much difference. However, there is literally no reason to prefer FPKM over TPM. If you're looking to perform some analysis where relative abundance is an appropriate measure, you should always favor TPM.

updated 5.3 years ago by Ram 45k • written 9.2 years ago by Rob 7.1k

Ram · Answer 2 · 2016-01-29

0

9.2 years ago

Marion Neely ▴ 20

Thank you everyone for your help! I tried it both ways. The PCA from the FPKM values made the most sense and the plot was similar to previous work. When I used the TPM values the strong separation by PC1 that we had seen with FPKM and in previous analysis moved to PC2.

updated 5.3 years ago by Ram 45k • written 9.2 years ago by Marion Neely ▴ 20

0

That's interesting (i.e. the shift). However, the reason to prefer TPM over FPKM is that FPKM has a (somewhat arbitrary) dependence on the mean expressed transcript length of a sample, while TPM does not. It's probably worth checking that the separation you see in PC1 is not an artifact of this technical detail. You can calculate the different scaling factors between your samples using a method such as the one presented here.
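A toy sketch of that dependence, using the standard definitions of FPKM and TPM. The transcript names, lengths, and read counts are invented, and this is not necessarily the method linked above. In both hypothetical samples transcript X makes up half of the transcript molecules, but its partner transcript is short in one sample and long in the other, so the mean expressed transcript length differs.

```python
def fpkm(counts, lengths):
    """Fragments per kilobase of transcript per million mapped fragments."""
    total = sum(counts.values())
    return {t: counts[t] / ((lengths[t] / 1e3) * (total / 1e6)) for t in counts}

def tpm(counts, lengths):
    """Transcripts per million: length-normalised rates rescaled to sum to 1e6."""
    rates = {t: counts[t] / lengths[t] for t in counts}
    denom = sum(rates.values())
    return {t: rates[t] / denom * 1e6 for t in counts}

def fpkm_to_tpm(fpkm_values):
    """Rescale one sample's FPKM values so that they sum to 1e6."""
    total = sum(fpkm_values.values())
    return {t: v / total * 1e6 for t, v in fpkm_values.items()}

def mean_expressed_length(counts, lengths):
    """Abundance-weighted mean transcript length of one sample."""
    return sum(counts.values()) / sum(counts[t] / lengths[t] for t in counts)

# Hypothetical transcript lengths (bp) and read counts for two samples.
lengths = {"X": 1000, "S": 500, "L": 4000}
sample1 = {"X": 2000, "S": 1000}   # X expressed alongside a short transcript
sample2 = {"X": 1000, "L": 4000}   # X expressed alongside a long transcript

for name, counts in (("sample1", sample1), ("sample2", sample2)):
    f, t = fpkm(counts, lengths), tpm(counts, lengths)
    scale = 1e6 / sum(f.values())  # per-sample factor taking FPKM to TPM
    print(name, round(f["X"]), round(t["X"]), scale,
          mean_expressed_length(counts, lengths))
```

X has the same TPM in both samples, but its FPKM differs by the ratio of the two mean expressed lengths. The per-sample factor that converts FPKM to TPM works out to the mean expressed transcript length divided by 1000, which is one way to read the "scaling factors" mentioned above.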

updated 5.3 years ago by Ram 45k • written 9.2 years ago by Rob 7.1k

0

What is the mean expressed transcript length? What is the meaning of "dependence" in this context? Why are FPKMs more "dependent" than TPMs which also take length into account?