I'm looking for an R package that can do principal component analysis and make a 3-D plot of the principal components, as shown in Fig. 1 in this paper:
Figure:
Does anyone recognize the plot? (Please tell me it's not Excel :-)
The princomp() function (in R's built-in stats package) returns component scores that can be plotted as points in three-dimensional space.
Once you have those in a data frame with columns, say, PC1, PC2, PC3, name, and rColor (corresponding to the first, second and third components, the experiment name, and the R color name, respectively), you could use the rgl library to make a PDF file to annotate in Adobe Illustrator (which is probably what the authors did, to highlight the two classes).
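A minimal sketch of building such a data frame is shown here. The built-in iris data set, the colour choices and the use of row names as experiment names are all assumptions for illustration, not part of the original analysis.

```r
# Stand-in data: iris is only an assumption replacing the real experiments
measurements <- iris[, 1:4]

# PCA on the correlation matrix, so every variable is on the same scale
pca <- princomp(measurements, cor = TRUE)

# One row per sample: first three component scores, a label and an R colour name
df <- data.frame(
  PC1 = pca$scores[, 1],
  PC2 = pca$scores[, 2],
  PC3 = pca$scores[, 3],
  name = rownames(iris),
  rColor = c("red", "forestgreen", "blue")[as.integer(iris$Species)],
  stringsAsFactors = FALSE
)

head(df)
```

With scores on this scale (a few units either side of zero), the sphere radius used for plotting would likely need to be much smaller than 15.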
For example:
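library(rgl)

# Assumes df is the data frame described above (PC1, PC2, PC3, name, rColor).
# Note: recent rgl releases deprecate rgl.open(), rgl.clear(), rgl.viewpoint()
# and rgl.light() in favour of open3d(), clear3d(), view3d() and light3d().

# Appearance settings for the spheres
featureRadius <- 15
featureShininess <- 20
featureTransparency <- 1   # passed as alpha; 1 is fully opaque

# Starting camera angle and window offset (in pixels)
thetaStart <- 45
offset <- 50

# Open a 1280 x 1280 window and set the viewpoint
rgl.open()
par3d(windowRect=c(offset, offset, 1280+offset, 1280+offset))
rgl.clear()
rgl.viewpoint(theta=thetaStart, phi=30, fov=30, zoom=1)

# Draw one sphere per sample, then axes, axis titles and text labels
spheres3d(df$PC1, df$PC2, df$PC3, radius=featureRadius, color=df$rColor, alpha=featureTransparency, shininess=featureShininess)
aspect3d(1, 1, 1)
axes3d(col='black')
title3d("", "", "PC1", "PC2", "PC3", col='black', line=1)
texts3d(df$PC1, df$PC2, df$PC3, text=df$name, color="black", adj=c(0,0))
bg3d("white")

# Replace the default lighting with two custom lights
rgl.clear(type='lights')
rgl.light(-45, 20, ambient='black', diffuse='#dddddd', specular='white')
rgl.light(60, 30, ambient='#dddddd', diffuse='#dddddd', specular='black')

# Export the current view as a PDF
filename <- "PCA.labeled.pdf"
rgl.postscript(filename, fmt="pdf")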
Printing a 3D cube on a 2D piece of paper can hide depth details. But this can be addressed with some more work.
One technique I found useful when using this to visualize principal coordinates analysis (not PCA, but the code is basically the same) was to write an R script that loops through the theta value in the rgl.viewpoint() call, between 0 and 359, and writes a differently named PNG at every step with rgl.snapshot() instead of rgl.postscript().
I used to use ImageMagick to convert the set of 360 PNGs to equivalent GIFs, and I then used gifsicle to make an animated GIF. I viewed the animation in a web browser or OS X Preview to get a truer picture of cluster dispersion and was thus able to explore and pick the best angle from which to render a publication-quality PDF.
More recently, I wrote a WebGL tool called Cubemaker that automates all of this manual labour. The end user imports a text file with three, four or more columns containing data points, names and category assignments. The browser renders the data and offers an interface to rotate and zoom the cube, as well as export PNG, PDF and animated GIF files.
I'd suggest that if good separation into groups is achievable in a 2D plot (try plotting any two of the first three PCs against each other), then 3D may be superfluous.
I'd use the base R function pairs() or the ggplot2-based version, ggpairs() (from the GGally package). In the referenced paper, it is hard to see the depth.
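As a rough illustration of the 2D alternative, the sketch below plots the first three components against each other with pairs(). The iris data set and the colour choices are assumptions standing in for the real data.

```r
# Stand-in data: iris is only an assumption replacing the real experiments
pca <- princomp(iris[, 1:4], cor = TRUE)

# Keep the scores for the first three principal components
scores <- as.data.frame(pca$scores[, 1:3])
names(scores) <- c("PC1", "PC2", "PC3")

# One colour per group so separation is visible in every panel
group_colors <- c("red", "forestgreen", "blue")[as.integer(iris$Species)]

# Scatterplot matrix: PC1 vs PC2, PC1 vs PC3 and PC2 vs PC3
pairs(scores, col = group_colors, pch = 19)
```

If one of the panels already separates the groups cleanly, that single 2D plot may be enough for publication.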