Question

Replicates do not cluster together

3.7 years ago

rin ▴ 40

Hi all.

I am analyzing some proteomics data in cell lines before and after the silencing of certain genes and at different time points. A PCA (see attached) shows no clustering of my samples that come from the same treatment. The samples were analyzed in the same batch, so batch effects cannot be the reason why I am not seeing what I would expect.

Would it be possible to go on with a differential expression analysis given that my replicates are not clustering together? Any suggested analysis that could help me identify if that would be possible?

PCA plots

https://ibb.co/QfjT1bB

r proteomics • 1.7k views

updated 3.7 years ago by Kevin Blighe 88k • written 3.7 years ago by rin ▴ 40

score 1 · Answer 1 · 2021-03-19

3.7 years ago

Friederike 9.0k

What do the different colors represent in the PCA plots?

Would it be possible to go on with a differential expression analysis given that my replicates are not clustering together?

Sure. You may not get any significant results, and unless you can identify what explains the variation you do see, you will not be able to include those sources of noise as covariates. But you won't know until you try.
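One way to look for sources that explain the variation is to correlate the sample scores on the first few PCs with whatever metadata is available. The sketch below uses NumPy on synthetic data; the matrix, the `treatment` factor, and the `prep_day` covariate are all made up for illustration and are not taken from the poster's experiment. It is a rough stand-in for this kind of check, not a description of any specific package's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 12 samples x 200 proteins (log-intensities).
n_samples, n_features = 12, 200
treatment = np.array([0, 1] * 6)       # assumed design factor
prep_day = np.repeat([0, 1, 2], 4)     # assumed nuisance covariate

X = rng.normal(size=(n_samples, n_features))
X[:, :50] += 3.0 * prep_day[:, None]      # strong effect of the nuisance covariate
X[:, 50:60] += 1.0 * treatment[:, None]   # weaker treatment effect

# PCA via SVD on feature-centred data (samples in rows).
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U * S                            # sample coordinates on each PC


def pc_covariate_r2(scores, covariate, n_pcs=3):
    """Squared Pearson correlation between each of the first PCs and a covariate."""
    return np.array(
        [np.corrcoef(scores[:, k], covariate)[0, 1] ** 2 for k in range(n_pcs)]
    )


r2_day = pc_covariate_r2(scores, prep_day)
r2_trt = pc_covariate_r2(scores, treatment)
print("R^2 with prep_day :", np.round(r2_day, 2))
print("R^2 with treatment:", np.round(r2_trt, 2))
```

If a recorded variable such as preparation day tracks PC1 far better than the treatment does, it is a candidate covariate for the differential expression model.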

3.7 years ago by Friederike 9.0k

The colors represent Condition + Timepoint.

Thank you for your comment. Indeed, I guess I just have to try!

3.7 years ago by rin ▴ 40

Looks like you are using PCAtools (my package)? If you use plotloadings(), you can see which genes are 'driving' the variation along each PC.

Also be wary of using scale = TRUE or scale = FALSE with PCAtools::pca(). I would prefer scale = FALSE.
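For readers unfamiliar with loadings, the idea can be sketched outside R. The NumPy snippet below is not PCAtools code, and the data and `protein_*` names are synthetic placeholders. It computes PCA by SVD and ranks features by the absolute value of their PC1 loading, which is the kind of information a loadings plot summarises.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 10 samples x 100 proteins, with 3 proteins shifted in one group.
n_samples, n_features = 10, 100
feature_names = [f"protein_{i}" for i in range(n_features)]  # placeholder IDs
group = np.array([0] * 5 + [1] * 5)

X = rng.normal(size=(n_samples, n_features))
X[:, [3, 17, 42]] += 8.0 * group[:, None]

# Rows of Vt are the PC axes; their entries are the per-feature loadings.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)


def top_loadings(Vt, names, pc=0, n=3):
    """Return the n features with the largest absolute loading on one PC."""
    order = np.argsort(-np.abs(Vt[pc]))[:n]
    return [(names[i], float(Vt[pc, i])) for i in order]


top_pc1 = top_loadings(Vt, feature_names, pc=0, n=3)
for name, loading in top_pc1:
    print(name, round(loading, 3))
```

The sign of a loading is arbitrary in PCA, which is why the ranking uses absolute values.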

3.7 years ago by Kevin Blighe 88k

Yes, I am using PCAtools. (Since we are here, I have to say that even if I discovered it pretty recently, your package quickly became my go-to package for PCA. Thank you!)

Thanks for the tip. Would you mind elaborating a bit why scale = FALSE is preferable? Thank you!

3.7 years ago by rin ▴ 40

Hi, thanks for the comment regarding PCAtools. Regarding scaling, there was another recent discussion on this, here: C: Scale and Center [normalized] RNA-seq expression counts for PCA ?

Scaling is not recommended by Michael Love (DESeq2 developer) either, although I cannot find the post where he mentions this.

In a nutshell, PCA is fundamentally based on variance and covariance; by scaling, we 'disrupt' (break) the natural [true] variation that may exist in our data. A full-time statistician would obviously give a more technical answer.
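A toy illustration of how scaling changes what PCA "sees" is sketched below in NumPy. It is not PCAtools, and the three synthetic features are invented for the example. Without scaling, PCA works on the covariance matrix, so the feature with the largest variance defines PC1. With unit-variance scaling, PCA effectively works on the correlation matrix, and the original differences in variance no longer influence the result.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
signal = rng.normal(size=n)

# Feature 0: large variance, independent of the others.
# Features 1 and 2: small variance, strongly correlated with each other.
X = np.column_stack([
    10.0 * rng.normal(size=n),
    signal + 0.1 * rng.normal(size=n),
    signal + 0.1 * rng.normal(size=n),
])


def pca_explained(X, scale):
    """Centre (and optionally unit-variance scale) the features, then run PCA by SVD."""
    Xc = X - X.mean(axis=0)
    if scale:
        Xc = Xc / Xc.std(axis=0, ddof=1)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)   # fraction of variance per PC
    return explained, Vt


exp_raw, Vt_raw = pca_explained(X, scale=False)
exp_scaled, Vt_scaled = pca_explained(X, scale=True)

print("unscaled: PC1 explains", round(float(exp_raw[0]), 2),
      "| loading of feature 0:", round(float(abs(Vt_raw[0, 0])), 2))
print("scaled:   PC1 explains", round(float(exp_scaled[0]), 2),
      "| loading of feature 0:", round(float(abs(Vt_scaled[0, 0])), 2))
```

Which behaviour is preferable depends on whether the differences in variance between features are meaningful for the data at hand; for normalised expression-type data on a common scale, they usually carry real information, which is the argument for scale = FALSE.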

3.7 years ago by Kevin Blighe 88k