I need to extract the x,y coordinates of a PCA plot (generated in R) so I can plot them in Excel (my boss prefers Excel).
The code to generate the PCA:
pca <- prcomp(data, scale=T, center=T)
autoplot(pca, label=T)
If we take a look at pca$x, the first two PC scores for an example point are:
29. 3.969599e+01 6.311406e+01
So for sample 29, the PC scores are 39.69599 and 63.11406.
However, if you look at the output plot in R, the coordinates are not 39.69599 and 63.11406 but ~0.09 and ~0.2.
Obviously some simple algebra can estimate how the PC scores are converted into the plotted coordinates but I can't do this for ~80 samples.
Can someone please shed some light on how R gets these coordinates and maybe a location to a mystery coordinate file or a simple command to generate a plotted data matrix?
NOTE: pca$x does not give me what I want
Is this the actual code you typed into R? If so, what autoplot are you using, as the autoplot from ggplot2 does not have a method for prcomp objects? To be more specific, I suspect that using plot(pca$x[,1:2]) the coordinates will match up.
Using the autoplot function from ggfortify. It allows autoplot to understand PCAs.
That is the issue, then. ggfortify's autoplot.prcomp plots values that have been transformed (see https://github.com/sinhrks/ggfortify/blob/master/R/fortify_stats.R#L140 and https://github.com/sinhrks/ggfortify/blob/master/R/fortify_stats.R#L259, for example). You'll need to apply those transformations if you want the same coordinates as autoplot. Note that the ggfortify package has been removed from CRAN....
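As a rough sketch of applying that transformation by hand: my reading of the linked source is that, with the default scale = 1, each plotted component is the raw score divided by its standard deviation times sqrt(n), but treat that as an assumption and check it against your own plot. The iris data and the output file name below are placeholders, not part of the original question.

```r
# Placeholder data; substitute your own numeric matrix or data frame.
data <- iris[, 1:4]

pca <- prcomp(data, scale. = TRUE, center = TRUE)

# Assumption (from reading the linked ggfortify source, default scale = 1):
# plotted coordinate = raw score / (sdev * sqrt(n)) for each component.
n <- nrow(pca$x)
lam <- pca$sdev[1:2] * sqrt(n)
plotted <- sweep(pca$x[, 1:2], 2, lam, "/")

# Use row names as sample labels when present, otherwise row indices.
ids <- rownames(plotted)
if (is.null(ids)) ids <- seq_len(n)

out <- data.frame(sample = ids, PC1 = plotted[, 1], PC2 = plotted[, 2])

# Hypothetical file name; the CSV opens directly in Excel.
write.csv(out, "pca_plot_coordinates.csv", row.names = FALSE)
```

If ggfortify is installed, autoplot returns a ggplot object, so comparing head(autoplot(pca)$data) with head(out) should show whether the scaling above matches what is actually drawn.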