PCA interpretation

0

21 days ago

nuriagaralon ▴ 20

I'm analyzing RNA-seq data and plotted this PCA of PC3 vs. PC4. [Figure: PCA plot]

Initially, I thought the samples were too mixed and didn't cluster. But after finishing the differential expression analysis of Blue samples vs. Orange samples, I did end up with a few significant genes (51 at padj < 0.05 with an LFC threshold of 0). That's not many compared with the analysis of another variable (which shows clustering along PC1), but it's something.

I was wondering if the clustering along PC4 (blue is slightly upwards, orange slightly downwards) is indeed because PC4 is explaining this variance.
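One way to make "slightly upwards vs. slightly downwards" less of an eyeball judgment is to test whether the two groups differ along a given PC. The following is only a minimal sketch on simulated data, not the analysis from this post: the matrix `X`, the group mask `is_blue`, and the helper names `pc_scores` and `permutation_pvalue` are all hypothetical, and a real analysis would start from properly transformed counts.

```python
import numpy as np

def pc_scores(X):
    """Return sample scores on each PC (samples in rows, genes in columns)."""
    Xc = X - X.mean(axis=0)                      # center each gene
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U * s                                 # column k holds scores on PC k+1

def permutation_pvalue(scores, is_group_a, n_perm=2000, seed=0):
    """Permutation test on the absolute difference in group means along one PC."""
    rng = np.random.default_rng(seed)

    def gap(mask):
        return abs(scores[mask].mean() - scores[~mask].mean())

    observed = gap(is_group_a)
    hits = sum(int(gap(rng.permutation(is_group_a)) >= observed)
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)

# Simulated data (an assumption for illustration): 20 samples x 100 genes,
# with a strong shift in 30 genes for the first 10 samples.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 100))
is_blue = np.arange(20) < 10
X[is_blue, :30] += 3.0

scores = pc_scores(X)
p_pc1 = permutation_pvalue(scores[:, 0], is_blue)   # PC1
p_pc4 = permutation_pvalue(scores[:, 3], is_blue)   # PC4
print(f"PC1 p={p_pc1:.4f}, PC4 p={p_pc4:.4f}")
```

In this toy setup the planted effect dominates PC1, so that is where the test picks it up. On real data the same call with `scores[:, 3]` would indicate whether the apparent separation along PC4 is larger than expected by chance.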

What do you think? Am I looking too much into it? Is it wrong to go back to the PCA, and should I just stick to my 51 significant genes?

RNASeq PCA • 450 views

ADD COMMENT • link updated 20 days ago by KABILAN ▴ 130 • written 21 days ago by nuriagaralon ▴ 20

2

There is no clustering in this plot to my eye.

Why are you not plotting the components that capture more variance, specifically PC1 and PC2?
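For context on "capture more variance": the fraction of total variance explained by each component can be read off the singular values of the centered data. This is a rough sketch with made-up data; the function name `explained_variance_ratio` and the simulated matrix are assumptions, not anything from the original analysis.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of total variance captured by each PC (samples in rows)."""
    Xc = X - X.mean(axis=0)                    # center each gene
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, largest first
    var = s ** 2
    return var / var.sum()

# Hypothetical data: 12 samples x 200 genes with one strong group effect.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 200))
X[:6, :50] += 3.0

ratio = explained_variance_ratio(X)
for k, r in enumerate(ratio[:4], start=1):
    print(f"PC{k}: {100 * r:.1f}% of variance")
```

Later components such as PC3 and PC4 usually account for a much smaller share than PC1 and PC2, which is why a pattern seen only there deserves extra caution.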

ADD REPLY • link 21 days ago by Mensur Dlakic ★ 28k

1

I did plot them; they cluster by shape along PC1. I was just trying to see whether they cluster by color along any PC.

I have no idea what PC2 is though. [Figure: PCA plot of PC1 vs. PC2]

ADD REPLY • link 21 days ago by nuriagaralon ▴ 20

1

"I have no idea what PC2 is though."

The relative contribution (loading) of each gene that entered the analysis is calculated for each PC; you just have to go through them.
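As a rough illustration of going through the per-gene contributions, the sketch below ranks genes by the absolute value of their loading on a chosen PC. It uses plain NumPy on simulated data; `top_genes_for_pc`, the gene names, and the planted effect are all hypothetical. If the PCA came from another tool, its own loadings or rotation output would serve the same purpose.

```python
import numpy as np

def top_genes_for_pc(X, gene_names, pc, n_top=5):
    """Rank genes by absolute loading on the given PC (1-based index)."""
    Xc = X - X.mean(axis=0)                        # center each gene
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[pc - 1]                          # one weight per gene
    order = np.argsort(-np.abs(loadings))[:n_top]  # largest |loading| first
    return [(gene_names[i], float(loadings[i])) for i in order]

# Simulated data (assumption): 12 samples x 50 genes, where genes 0-4
# carry a large shift between the first and last six samples.
rng = np.random.default_rng(2)
X = rng.normal(size=(12, 50))
X[:6, :5] += 8.0
gene_names = [f"gene_{i}" for i in range(50)]

top_pc1 = top_genes_for_pc(X, gene_names, pc=1)
for name, weight in top_pc1:
    print(name, round(weight, 3))
```

The sign of a loading is arbitrary up to a flip of the whole component, so the ranking uses absolute values. Checking what the top genes for PC2 have in common is one way to guess what that component reflects.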

ADD REPLY • link 20 days ago by Mensur Dlakic ★ 28k

0

I suggest preprocessing your data again, because the samples are not clustered properly in your PCA plot.

ADD REPLY • link 20 days ago by KABILAN ▴ 130

0

How do you figure that? The OP stated they are clustered by shape, and that is true.