Question

R package used for PCA plotting in a paper (rice RNA-Seq)

10.2 years ago

Ann ★ 2.4k

I'm looking for an R package that can do principal component analysis and make a 3-D plot of the principal components, as shown in Fig. 1 in this paper:

Comparative transcriptome analysis of transporters, phytohormone and lipid metabolism pathways in response to arsenic stress in rice (Oryza sativa) (2012)

[Figure: Fig. 1 of the paper; image not available]

Does anyone recognize the plot? (Please tell me it's not Excel :-)

PCA RNA-Seq R • 6.4k views

updated 3.0 years ago by Ram 44k • written 10.2 years ago by Ann ★ 2.4k

I'd suggest that if good separation into groups is achievable by a 2D plot (try plotting any two of the first 3 PCs against each other), then 3D may be superfluous.

10.2 years ago by Neilfws 49k

I'd use the base R function pairs() or its ggplot2-based counterpart, ggpairs() from the GGally package. In the figure from the referenced paper it is hard to see the depth.
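As a rough sketch of that suggestion, the following plots the first three PCs against each other with pairs(). The expression matrix, the group labels and the colors are simulated stand-ins, not data from the paper, and prcomp() is assumed as the PCA function.

```r
# Simulated stand-in for an expression matrix: 12 samples (rows) x 200 genes (columns).
set.seed(1)
group <- rep(c("control", "treated"), each = 6)
expr <- matrix(rnorm(12 * 200), nrow = 12)

# Shift 20 genes in the treated samples so the two groups differ.
expr[group == "treated", 1:20] <- expr[group == "treated", 1:20] + 3

# PCA with samples as rows; prcomp() centers each column by default.
pca <- prcomp(expr)
scores <- pca$x[, 1:3]

# All pairwise scatter plots of PC1, PC2 and PC3, colored by group.
pairs(scores, col = ifelse(group == "treated", "red", "blue"), pch = 19)
```

If GGally is installed, GGally::ggpairs(as.data.frame(scores)) should give the ggplot2-style equivalent of the same panel layout.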

10.2 years ago by zx8754 12k

Ram · Answer 1 · 2014-12-11

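The princomp() function (part of the stats package that ships with R) computes principal component scores, and the first three components give you points in three-dimensional space. Note that princomp() requires more observations than variables, so with many genes and few samples, as is typical for RNA-Seq, prcomp() is the usual substitute.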

Once you have those in a data frame with columns named, say, PC1, PC2, PC3, name, and rColor (the first, second and third components, the experiment name, and an R color name, respectively), you could use the rgl package to make a PDF file to annotate in Adobe Illustrator (which is probably what the authors did, to highlight the two classes).
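A minimal sketch of building such a data frame, assuming prcomp() for the PCA and a simulated matrix with samples in rows. The sample names, the condition labels and the two colors are hypothetical placeholders.

```r
# Simulated stand-in: 8 samples (rows) x 500 genes (columns).
set.seed(42)
n_samples <- 8
expr <- matrix(rnorm(n_samples * 500), nrow = n_samples,
               dimnames = list(paste0("sample", seq_len(n_samples)), NULL))

# Hypothetical condition labels, one per sample.
condition <- rep(c("control", "arsenic"), each = 4)

# PCA with samples as rows; the scores are in pca$x.
pca <- prcomp(expr)

# One row per sample: the first three PC scores, a label, and an R color name.
df <- data.frame(
  PC1 = pca$x[, 1],
  PC2 = pca$x[, 2],
  PC3 = pca$x[, 3],
  name = rownames(expr),
  rColor = ifelse(condition == "arsenic", "firebrick", "steelblue"),
  stringsAsFactors = FALSE
)
```

Any sphere radius used when plotting should be small relative to range(df$PC1), otherwise the points will overlap.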

For example, to render and export the plot with rgl:

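library(rgl)

# Assumes a data frame df with columns PC1, PC2, PC3, name and rColor.
featureRadius <- 15         # sphere radius in data units; scale to the range of your PCs
featureShininess <- 20
featureTransparency <- 1    # passed as alpha, so 1 means fully opaque
thetaStart <- 45            # starting rotation angle, in degrees
offset <- 50                # window offset from the screen corner, in pixels

# Open a 1280 x 1280 window and set the camera.
# open3d(), view3d(), clear3d() and light3d() are the current names for
# rgl.open(), rgl.viewpoint(), rgl.clear() and rgl.light().
open3d()
par3d(windowRect=c(offset, offset, 1280+offset, 1280+offset))
view3d(theta=thetaStart, phi=30, fov=30, zoom=1)

# Draw one sphere per sample, then the axes, axis titles and sample labels.
spheres3d(df$PC1, df$PC2, df$PC3, radius=featureRadius, color=df$rColor, alpha=featureTransparency, shininess=featureShininess)
aspect3d(1, 1, 1)
axes3d(col='black')
title3d("", "", "PC1", "PC2", "PC3", col='black', line=1)
texts3d(df$PC1, df$PC2, df$PC3, texts=df$name, color="black", adj=c(0,0))
bg3d("white")

# Replace the default light with two custom lights.
clear3d(type='lights')
light3d(-45, 20, ambient='black', diffuse='#dddddd', specular='white')
light3d(60, 30, ambient='#dddddd', diffuse='#dddddd', specular='black')

# Export the current view as a PDF.
filename <- "PCA.labeled.pdf"
rgl.postscript(filename, fmt="pdf")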

Printing a 3D cube on a 2D piece of paper can hide depth details, but this can be addressed with some more work.

One technique I found useful when using this to visualize principal coordinates analysis (not PCA, but the code is basically the same) was to write an R script that steps the theta value of the viewpoint call (rgl.viewpoint(), or view3d() in current rgl) from 0 to 359 and, at every step, writes a differently named PNG with rgl.snapshot() instead of rgl.postscript().
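A sketch of such a loop might look like the following. The function names, the output directory and the file-name pattern are hypothetical, and it assumes an rgl window with the scene already drawn is open on a real display. It uses view3d(), the current name for rgl.viewpoint().

```r
# Build zero-padded file names so the frames sort in rotation order.
# (Hypothetical naming scheme: PCA.theta000.png ... PCA.theta359.png.)
frame_files <- function(out_dir, thetas) {
  file.path(out_dir, sprintf("PCA.theta%03d.png", as.integer(thetas)))
}

# Rotate the current rgl scene about the vertical axis and save one PNG per angle.
# Assumes the rgl package is installed and a scene is already drawn.
spin_snapshots <- function(out_dir = "frames", thetas = 0:359, phi = 30) {
  dir.create(out_dir, showWarnings = FALSE)
  files <- frame_files(out_dir, thetas)
  for (i in seq_along(thetas)) {
    rgl::view3d(theta = thetas[i], phi = phi, fov = 30, zoom = 1)
    rgl::rgl.snapshot(files[i], fmt = "png")
  }
  invisible(files)
}
```

After drawing the plot, calling spin_snapshots() would write 360 frames into frames/. A coarser sequence such as thetas = seq(0, 355, by = 5) gives a quicker preview.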

I used to use ImageMagick to convert the set of 360 PNGs to equivalent GIFs, and then gifsicle to combine them into an animated GIF. Viewing the animation in a web browser or OS X Preview gave me a truer picture of cluster dispersion, so I could explore and pick the best angle from which to render a publication-quality PDF.
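The conversion and assembly steps could be driven from R along these lines. This is only a sketch: it assumes the ImageMagick convert command (named magick in ImageMagick 7) and gifsicle are on the PATH, and the function names, output file name and delay are made up for illustration.

```r
# Build the external commands: one ImageMagick call per PNG, then one gifsicle call.
# delay is gifsicle's frame delay, in hundredths of a second.
gif_commands <- function(png_files, out_gif = "PCA.animated.gif", delay = 5) {
  gif_files <- sub("\\.png$", ".gif", png_files)
  convert <- lapply(seq_along(png_files), function(i) {
    list(command = "convert", args = c(png_files[i], gif_files[i]))
  })
  assemble <- list(
    command = "gifsicle",
    args = c(paste0("--delay=", delay), "--loop", gif_files, "-o", out_gif)
  )
  c(convert, list(assemble))
}

# Run the commands in order; shQuote() protects paths that contain spaces.
run_commands <- function(cmds) {
  for (cmd in cmds) system2(cmd$command, shQuote(cmd$args))
}
```

With the frames on disk, run_commands(gif_commands(list.files("frames", pattern = "\\.png$", full.names = TRUE))) would convert each frame and write the animation. Building the command list separately makes it easy to inspect before anything is executed.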

More recently, I wrote a WebGL tool called Cubemaker that automates all of this manual labour. The end user imports a text file with three, four or more columns containing data points, names and category assignments. The browser renders the data and offers an interface to rotate and zoom the cube, as well as export PNG, PDF and animated GIF files.