I'm trying to plot the distribution of gene expression values, before and after normalization, from microarray data.
Here is my code to obtain a plot of the normalized values:
library(Biobase)
library(GEOquery)
library(magrittr)
library(rJava)
library("xlsx")
library(stringr)
library(ggplot2)
eset <- getGEO('GSE20966')[[1]]
boxplot(exprs(eset), outline=FALSE)
edata <- data.frame(exprs(eset))
ggplot(eset[,1])
I expected to obtain a plot similar to the distribution plot shown at the end of the page in this tutorial.
Unfortunately, I couldn't get this to work. Could someone suggest alternative ways of plotting the logged gene expression values of each sample? (I expect a roughly normal distribution.)
I replaced the last line with
ggplot(data = edata, aes(x = colnames(edata)[1])) + geom_density(alpha = .2)
but I still couldn't obtain a distribution. Is it appropriate to use geom_density?
You'll want to use something like
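x=GSMsomething (the actual column name of that sample) rather than x=colnames(edata)[1]. Inside aes(), colnames(edata)[1] evaluates to a single character string, so ggplot maps x to that constant instead of to the values in the column. If there are multiple samples, you'll want to convert the data to a long-form table first and then use something like x=sample, y=value.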
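The long-form approach can be sketched roughly as follows. This is only an illustration under stated assumptions: the matrix expr_mat is simulated and stands in for exprs(eset), and the sample names GSM_A, GSM_B and GSM_C are hypothetical, not taken from GSE20966. With real data you would substitute exprs(eset), and log2-transform it first if the values are not already logged.

```r
library(ggplot2)

# Simulated stand-in for exprs(eset): rows are probes, columns are samples.
# The sample names are hypothetical placeholders.
set.seed(1)
expr_mat <- matrix(rnorm(3000, mean = 8, sd = 2), ncol = 3,
                   dimnames = list(NULL, c("GSM_A", "GSM_B", "GSM_C")))

# Long form: one row per (sample, value) pair.
# as.vector() unrolls the matrix column by column, so each sample name
# is repeated once per probe to stay aligned with its values.
long_df <- data.frame(
  sample = rep(colnames(expr_mat), each = nrow(expr_mat)),
  value  = as.vector(expr_mat)
)

# One density curve per sample, overlaid on the same axes.
p <- ggplot(long_df, aes(x = value, colour = sample)) +
  geom_density() +
  labs(x = "log2 expression", y = "density")
print(p)
```

For a density plot the expression values go on the x axis and the sample is mapped to colour. The x=sample, y=value mapping mentioned above fits per-sample summaries instead, for example replacing geom_density() with geom_boxplot() or geom_violin() and using aes(x = sample, y = value).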