Question

PCA plot interpretation after accounting for batch effects with SVA

0


2.5 years ago

bart ▴ 50

Hi all,

I've used SVA to account for hidden batch effects in my RNA-seq experiment, where I'm trying to predict disease status (healthy/AC or disease/GBM). Now I'm trying to find out whether accounting for these batch effects has improved my clustering.

When not accounting for batch effects, I get the following plot:

[image: PCA plot without batch correction]

When accounting for batch effects, I get the following plot:

[image: PCA plot after SVA]

The variance explained by the principal components in particular has improved, but clustering has only somewhat improved. I'm having a hard time interpreting these results. Does this mean that the latter PCA plot is 'good' because much of the variance can be explained? But then why is the clustering so bad? I appreciate your help!

SVA plot PCA • 1.0k views

updated 2.5 years ago by LauferVA 4.5k • written 2.5 years ago by bart ▴ 50

2


There doesn't appear to be an appreciable separation between the two conditions for most of your samples. Instead of immediately going to SVA, have you explored whether another effect such as biological sex, sample collection date, etc. is explaining the separation? You may also want to check more than the first 2 PCs to see if your conditions are separating in other dimensions.
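A rough way to run this check, sketched in plain Python rather than R: for each of the first few PCs, compute the fraction of its variance explained by each known covariate (eta squared from a one-way layout). This assumes the PC scores have already been exported from whatever tool produced the PCA; the `scores` matrix, the covariate labels, and the function names below are made up for illustration and are not from the original post.

```python
from collections import defaultdict
from statistics import fmean


def eta_squared(values, groups):
    """Fraction of the variance in `values` explained by the categorical `groups`."""
    grand_mean = fmean(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    # Between-group sum of squares: group size times squared offset of the group mean.
    between_ss = sum(
        len(vals) * (fmean(vals) - grand_mean) ** 2 for vals in by_group.values()
    )
    return between_ss / total_ss


def pc_covariate_table(scores, covariates, n_pcs=5):
    """Return {covariate name: [eta squared for PC1, PC2, ...]}.

    scores: one row per sample, one column per PC (assumed precomputed).
    covariates: dict mapping a covariate name to its per-sample labels.
    """
    n_pcs = min(n_pcs, len(scores[0]))
    return {
        name: [eta_squared([row[k] for row in scores], labels) for k in range(n_pcs)]
        for name, labels in covariates.items()
    }


# Toy, invented scores for 6 samples and 3 PCs (not data from this thread).
scores = [
    [-2.0, 0.1, 1.0],
    [-1.8, -0.2, -1.0],
    [-2.2, 0.3, 1.1],
    [2.1, -0.1, -0.9],
    [1.9, 0.2, 1.0],
    [2.0, -0.3, -1.2],
]
covariates = {
    "sex": ["F", "F", "F", "M", "M", "M"],
    "condition": ["AC", "GBM", "AC", "GBM", "AC", "GBM"],
}
table = pc_covariate_table(scores, covariates, n_pcs=3)
for name, values in table.items():
    print(name, [round(v, 2) for v in values])
```

In this toy example PC1 tracks sex while condition only shows up on PC3, which is the kind of pattern that a PC1 vs PC2 plot alone would hide.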

2.5 years ago by rpolicastro 13k

1


Going along with this, consider generating a plot with sex coded as shape, condition coded as color (as you already have), age coded as dot size, and so on.

That will help you put it all together. I can send you the code if needed.
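Something along these lines might work as a starting point. It is only a sketch in Python, not the code offered above: the function names, the palette, the marker set, and the toy metadata are all assumptions, and the plotting helper needs matplotlib installed.

```python
def sample_aesthetics(conditions, sexes, ages, min_size=30.0, max_size=200.0):
    """Map per-sample metadata to a color (condition), marker (sex) and size (age)."""
    palette = ["tab:blue", "tab:red", "tab:green", "tab:orange"]
    markers = ["o", "^", "s", "D"]
    cond_levels = sorted(set(conditions))
    sex_levels = sorted(set(sexes))
    youngest, oldest = min(ages), max(ages)
    span = (oldest - youngest) or 1.0  # avoid dividing by zero if all ages are equal
    return [
        {
            "color": palette[cond_levels.index(c) % len(palette)],
            "marker": markers[sex_levels.index(s) % len(markers)],
            # Linear rescaling of age into [min_size, max_size].
            "size": min_size + (a - youngest) / span * (max_size - min_size),
        }
        for c, s, a in zip(conditions, sexes, ages)
    ]


def plot_pcs(pc_x, pc_y, aesthetics, xlabel="PC1", ylabel="PC2",
             path="pca_covariates.png"):
    """Scatter two PCs with the aesthetics above and save the figure to `path`."""
    import matplotlib.pyplot as plt  # imported lazily; assumed to be installed

    fig, ax = plt.subplots()
    # One scatter call per sample, because a marker applies to a whole call.
    for x, y, aes in zip(pc_x, pc_y, aesthetics):
        ax.scatter([x], [y], color=aes["color"], marker=aes["marker"], s=aes["size"])
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    fig.savefig(path)


# Toy, invented metadata for 4 samples (not data from this thread).
conditions = ["AC", "GBM", "AC", "GBM"]
sexes = ["F", "F", "M", "M"]
ages = [30, 45, 60, 75]
aesthetics = sample_aesthetics(conditions, sexes, ages)
print(aesthetics[0], aesthetics[3])
```

Calling `plot_pcs(pc1_scores, pc2_scores, aesthetics)` with your own PC scores would then write the figure to disk; swapping in other PCs only requires passing different score columns and axis labels.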

2.5 years ago by LauferVA 4.5k

1


Consider removing the two outliers and then re-running SVA as well.

Did you limit the analysis to the top 1000 genes or something similar? What was your preparatory procedure?
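For context, one common reading of "limiting to the top 1000 genes" is keeping only the most variable genes before running PCA. A minimal Python sketch of that filter follows; it assumes expression values are already normalized and on a log-like scale, and the function name and toy values are invented for illustration.

```python
from statistics import pvariance


def top_variable_genes(expr, n_top=1000):
    """Return the names of the `n_top` genes with the highest variance across samples.

    expr: dict mapping gene name -> list of expression values, one per sample
          (assumed already normalized and log-transformed).
    """
    ranked = sorted(expr, key=lambda gene: pvariance(expr[gene]), reverse=True)
    return ranked[:n_top]


# Toy, invented expression values for 3 genes across 4 samples.
expr = {
    "geneA": [1.0, 1.0, 1.0, 1.0],    # constant, carries no information for PCA
    "geneB": [0.0, 10.0, 0.0, 10.0],  # highly variable
    "geneC": [4.0, 5.0, 4.0, 5.0],    # mildly variable
}
selected = top_variable_genes(expr, n_top=2)
print(selected)
```

Whether and how such a filter was applied can change how much variance the first PCs appear to explain, which is why it matters when comparing the two plots.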

2.5 years ago by LauferVA 4.5k