Hi,
How can we find out whether there are correlations between principal components and known variables, especially variables other than the phenotype of interest (ageGroup)?
By the way, do you know what the intgroup argument in `DESeq2::plotPCA()` is used for?
Regards,
Assuming you are using princomp(), the PC scores are stored in pca_res$scores so you can use these (e.g. pca_res$scores[, 1] for PC1) and investigate any correlation (simple correlation may not do the trick though, see the answer here: https://stats.stackexchange.com/questions/115032).
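As a rough sketch of that idea, here is one way to test PC1 against a continuous and a categorical covariate. Everything in it is simulated for illustration: the matrix `expr`, the data frame `covars` and its columns `batch` and `rin` are made-up names, and the choice of a Spearman test and a one-way linear model is just one reasonable option, not the only one.

```r
# Simulated data: 30 samples x 5 features (all names here are hypothetical).
set.seed(1)
n <- 30
covars <- data.frame(
  ageGroup = factor(rep(c("young", "old"), each = n / 2)),
  batch    = factor(rep(c("b1", "b2", "b3"), times = n / 3)),
  rin      = rnorm(n, mean = 8, sd = 1)   # a continuous covariate
)
expr <- matrix(rnorm(n * 5), nrow = n)
# Inject a batch effect into the first feature so there is something to find.
expr[, 1] <- expr[, 1] + 3 * as.integer(covars$batch)

pca_res <- princomp(expr)   # princomp() needs more samples than features
pc1 <- pca_res$scores[, 1]

# Continuous covariate: rank correlation with PC1.
cor_rin <- cor.test(pc1, covars$rin, method = "spearman")

# Categorical covariate: R^2 and p-value from a one-way linear model,
# since a plain correlation is not meaningful for a factor with 3 levels.
fit_batch <- lm(pc1 ~ batch, data = covars)
r2_batch  <- summary(fit_batch)$r.squared
p_batch   <- anova(fit_batch)[["Pr(>F)"]][1]
```

Looping the same two tests over every column of the covariate table and every PC of interest gives a quick overview of which known variables line up with which components.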
If you're using plotPCA() to get the PCA data (that would be pca_res <- plotPCA(obj, returnData = TRUE)), a similar approach should work, since this returns the PC1 and PC2 scores as well.
?DESeq2::plotPCA says "intgroup: interesting groups: a character vector of names in colData(x) to use for grouping". This serves to color the samples in the PCA plot by group (the default value is "condition"). If there is a column in colData(obj) named "my_group" consisting of the values "G1", "G2" and "G3", setting intgroup = "my_group" will result in the samples being colored according to these 3 groups:
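A minimal sketch of both plotPCA() uses, assuming DESeq2 is installed and using its built-in simulated dataset rather than real data. The "my_group" column is added by hand purely for illustration, and the object passed to plotPCA() is a transformed one (here from vst()), which is what the DESeq2 method expects.

```r
library(DESeq2)

# Simulated dataset from DESeq2; "my_group" is a made-up column.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 1)
dds$my_group <- factor(rep(c("G1", "G2", "G3"), times = 4))

# nsub is lowered because the toy dataset has only 1000 genes.
vsd <- vst(dds, nsub = 500)

# Plot: points are colored by the levels of my_group.
plotPCA(vsd, intgroup = "my_group")

# Same call with returnData = TRUE returns a data frame instead of a plot.
pca_df <- plotPCA(vsd, intgroup = "my_group", returnData = TRUE)

# PC1 can then be tested against any other sample annotation,
# e.g. the "condition" column that the simulated dataset comes with.
r2_condition <- summary(lm(pca_df$PC1 ~ vsd$condition))$r.squared
```

The returned data frame holds one row per sample, with the PC scores alongside the intgroup column, so the tests used for princomp() scores apply unchanged.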