How to perform gene set enrichment analysis on genes that contribute to a specific PC?

3.3 years ago

Aspire ▴ 370

In my data, the way the samples separate along a specific PC is biologically interesting. Samples that received high amounts of a compound sit at one end of the PC, samples that received low amounts sit at the other end, and samples that received a moderate amount fall in the middle.

I would like to find out which genes are responsible for this separation and, more specifically, which biological functions are enriched among those genes.

So I thought of running gene set enrichment analysis on the gene loadings for that specific PC. Genes that contribute strongly to the PC have a loading with a large absolute value, either positive or negative.
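As a minimal sketch of the idea, the loadings can be pulled out of a `prcomp` result and turned into a ranked gene list. The data below are simulated, and the names `mat`, `dose`, and `pc_of_interest` are hypothetical; with real data the matrix would be the rlog-transformed counts, and the PC of interest is whichever one shows the gradient.

```r
# Simulated stand-in for an rlog-transformed matrix (genes in rows, samples in columns).
# With real data this might come from something like assay(rlog(dds)).
set.seed(1)
n_genes <- 200
dose <- rep(c(0, 1, 2), each = 4)  # hypothetical low / moderate / high amounts
mat <- matrix(rnorm(n_genes * length(dose)), nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)),
                              paste0("sample", seq_along(dose))))
# Let the first 50 genes respond to the dose so that one PC captures the gradient.
mat[1:50, ] <- mat[1:50, ] + outer(rep(2, 50), dose)

# prcomp() expects samples in rows, so transpose. Genes are centered but not scaled.
pca <- prcomp(t(mat), center = TRUE, scale. = FALSE)

# Loadings of every gene on the PC of interest (assumed to be PC1 in this toy example).
pc_of_interest <- "PC1"
loadings <- pca$rotation[, pc_of_interest]

# Signed ranking, highest loading first: a named numeric vector of the kind
# that preranked enrichment tools typically take as input.
ranked <- sort(loadings, decreasing = TRUE)

# Genes with the largest absolute loading, regardless of direction.
top_genes <- names(sort(abs(loadings), decreasing = TRUE))[1:10]
print(top_genes)
```

Note that the sign of a PC is arbitrary, so which end of the ranked list corresponds to "high compound" has to be checked against the sample scores in `pca$x`.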

My question is whether the genes must be standardized before the PCA. Usually, I apply DESeq2's rlog function before PCA but do not standardize the genes (that is, convert each gene to Z-scores). The effect described above is more pronounced when the data are not standardized ( prcomp(scale.=F) ).
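To make the comparison concrete, here is a hedged sketch that runs `prcomp` with and without standardization on the same simulated matrix and compares the two sets of loadings. The data and the helper `prop_pc1` are hypothetical and only illustrate the mechanics; they say nothing about which choice is right for a real dataset.

```r
# Simulated data: 200 genes x 12 samples, first 50 genes respond to a hypothetical dose.
set.seed(1)
dose <- rep(c(0, 1, 2), each = 4)
mat <- matrix(rnorm(200 * length(dose)), nrow = 200,
              dimnames = list(paste0("gene", 1:200),
                              paste0("sample", seq_along(dose))))
mat[1:50, ] <- mat[1:50, ] + outer(rep(2, 50), dose)

x <- t(mat)  # prcomp() expects samples in rows

# Unscaled: genes with larger variance carry more weight in the PCs.
pca_raw <- prcomp(x, center = TRUE, scale. = FALSE)
# Scaled: every gene is converted to unit variance first (Z-scores).
# This fails if any gene has zero variance, so such genes must be removed beforehand.
pca_scaled <- prcomp(x, center = TRUE, scale. = TRUE)

# Hypothetical helper: proportion of total variance captured by PC1.
prop_pc1 <- function(p) (p$sdev^2 / sum(p$sdev^2))[1]
prop <- c(unscaled = prop_pc1(pca_raw), scaled = prop_pc1(pca_scaled))
print(prop)

# PC signs are arbitrary, so compare the gene rankings via the absolute rank correlation.
agreement <- abs(cor(pca_raw$rotation[, "PC1"], pca_scaled$rotation[, "PC1"],
                     method = "spearman"))
print(agreement)
```

If the two rankings agree closely, the enrichment result should be fairly insensitive to the choice; if they diverge, the unscaled loadings are being driven mainly by high-variance genes.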

pca • 557 views

updated 3.3 years ago by ATpoint 87k • written 3.3 years ago by Aspire ▴ 370
