RNA-seq bad PCA: a differential expression analysis killer?
7.1 years ago

Hello, I was wondering: is a bad or poor PCA (samples not clustering) a roadblock to a DEG analysis? A colleague who recently had this problem suggested the following interpretation: there is variability within the analysis (of course), but since each gene is tested separately, genes can still be significantly over-expressed between samples, although any conclusion about the source of that differential expression would be uncertain (because of the noise).

Is this correct?

7.1 years ago

You're never guaranteed that your groups will cluster nicely in PCA. If they do, you likely have a large effect size. If they don't, you likely have a small effect size (or some issue). With the stuff I did as a post-doc, I expected very subtle differences between groups, so I was never surprised not to see any coherent clustering into groups. A lot of the people I work with now are studying very large effects (knockouts and such), and those tend to produce clearer clustering in PCA plots.
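
To make the effect-size point concrete, here is a minimal simulation sketch in plain Python (it is not part of the original answer). Everything in it is an assumption chosen for illustration: the function names `simulate`, `pc1_scores` and `group_r2`, the Gaussian noise, 6 replicates per group, and 500 genes of which 50 are shifted by `delta` in the second group. PC1 is approximated by power iteration on the sample-by-sample Gram matrix, and the script reports how much of the PC1 variance is explained by group membership.

```python
import random

def simulate(delta, n_per_group=6, n_genes=500, n_de=50, seed=0):
    """Rows are samples, columns are genes (hypothetical log-expression values).
    The first n_de genes are shifted by delta in group 1; the rest are pure noise."""
    rng = random.Random(seed)
    groups = [0] * n_per_group + [1] * n_per_group
    data = [[rng.gauss(0.0, 1.0) + (delta if g == 1 and j < n_de else 0.0)
             for j in range(n_genes)] for g in groups]
    return data, groups

def pc1_scores(data, n_iter=300):
    """Per-sample values proportional to the PC1 scores, via power iteration."""
    n, p = len(data), len(data[0])
    # Center every gene across samples, as PCA does.
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    # Gram matrix between samples; its top eigenvector gives PC1 up to scale.
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centered]
            for ri in centered]
    rng = random.Random(1)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]  # arbitrary starting vector
    for _ in range(n_iter):
        w = [sum(g * x for g, x in zip(row, v)) for row in gram]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def group_r2(scores, groups):
    """Fraction of the variance in `scores` explained by group membership."""
    grand = sum(scores) / len(scores)
    total = sum((s - grand) ** 2 for s in scores)
    between = 0.0
    for g in set(groups):
        members = [s for s, gi in zip(scores, groups) if gi == g]
        between += len(members) * (sum(members) / len(members) - grand) ** 2
    return between / total

for delta in (5.0, 0.5):
    data, groups = simulate(delta)
    r2 = group_r2(pc1_scores(data), groups)
    print(f"delta={delta}: share of PC1 variance explained by group = {r2:.2f}")
```

With a large `delta` the group should account for nearly all of PC1, while with a small one PC1 mostly reflects replicate noise, which is the no-clustering situation described above.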


@Devon Ryan I apologize for necro'ing a 3-year-old comment of yours, but I have a question directly related to this thread: does your assertion still hold even for biological replicates? Wouldn't one expect replicates to cluster on the PCA plot?


It'll depend on the effect size of the treatment groups. If that's decently large and inter-group variation is decently small, then replicates will cluster. Otherwise they may not.


Thank you so much for your response! I don't suppose there's any direct approach to figuring out why there is no "proper" clustering (of replicates) given a PCA plot? I mean, you mention "inter-group variation": is this something that is distinct from experimental noise? I want to try to figure out and understand why my replicates aren't clustering.


"Inter-group variation" should have been "intra-group variation", which is the same as experimental noise. At the end of the day, the reason for no clustering is that the experimental variation dwarfs the effect size. So there's really nothing worth looking at in that regard. You just won't get a huge number of differentially expressed genes.
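
One way to look at this ratio gene by gene is a plain t statistic, which is simply the difference between group means divided by the replicate noise. The sketch below is only an illustration and not what dedicated RNA-seq tools do internally (they model counts rather than running a t-test): `welch_t` is a hypothetical helper, and the expression values are made up.

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic: difference in group means over its standard error."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(b) - mean(a)) / se

# Made-up log-expression values, three replicates per group (control, treated).
genes = {
    "gene_strong": ([5.0, 5.2, 4.8], [8.1, 7.9, 8.0]),  # large shift, tight replicates
    "gene_subtle": ([5.0, 6.5, 3.9], [5.6, 4.4, 7.0]),  # small shift, noisy replicates
}

for name, (ctrl, treated) in genes.items():
    print(name, round(welch_t(ctrl, treated), 2))
```

Each gene gets its own statistic, so a gene whose shift is large relative to its replicate noise can still stand out even when the samples as a whole do not separate on a PCA plot.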
