Plotting Multicolumn Data
11.2 years ago
robjohn7000 ▴ 110

Hi,

I have data to plot as follows:

       hr0       hr6      hr24      hr48      hr72      hr80
geneA 11.954941 11.811412 11.755297 11.641484 11.641685 11.462077
geneB 10.127727 10.055340  9.986837  9.937731  9.966660  9.861603
geneC  9.728326  9.686385  9.841499  9.879718  9.968936  9.973809

My intention is to make 100 plots, one per row name (i.e. one per gene), each with the time points (hr0, hr6, hr24, hr48, hr72, hr80) on the x-axis and that gene's values across the columns on the y-axis.

I have the following code that is not working:

par(mfrow=c(3,3), ask=TRUE)
for(i in 0:99){
indx = rank(modF) == nrow(selDataMatrix_ave) - i
name = fit2$genes$geneSymbol[indx]
exprs.row = selDataMatrix_ave[indx,]
genetitle = paste(sprintf('%.30s', name),'Rank =', i+1)
plot(0, pch=1, xlim=range(0, 80), ylim=range(exprs.row),
    ylab='Expression', xlab='Time', main=genetitle, col=1,lty=1)
for(j in 1:6){
     #pch_value = as.character(targets$time[6*j])
      points(c(0, 6, 24, 48, 72,80), exprs.row[6*j-5:6*j],
          type='b', pch=1)
      }
}

Help to get the code working will be appreciated. Thanks.

r microarray bioconductor statistics • 7.0k views

What do you mean by "not working", what's the error?


In addition to Ben's comment, do you want these saved to a file (otherwise, they'll just flicker across your screen and you'll only see the last one)? Also, I assume that selDataMatrix_ave is the example data you posted above. So what are modF and fit2$genes$geneSymbol? BTW, I assume much of your problem is in the last for loop. Given the example data, you're going to exceed the bounds of exprs.row.

Edit: With an N of 100, are you sure you don't want a heatmap? That might be easier to interpret, depending on your goals.
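As a rough illustration of the heatmap suggestion, here is a minimal sketch using base R's heatmap() on the three example rows from the question. The object name expr and the output file name are assumptions made for this example, and a real 100-gene matrix would simply replace the hand-typed values.

```r
# Example data copied from the question; genes are the row names.
times <- c(0, 6, 24, 48, 72, 80)
expr <- matrix(
  c(11.954941, 11.811412, 11.755297, 11.641484, 11.641685, 11.462077,
    10.127727, 10.055340,  9.986837,  9.937731,  9.966660,  9.861603,
     9.728326,  9.686385,  9.841499,  9.879718,  9.968936,  9.973809),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("hr", times))
)

# Colv = NA keeps the columns in time order (no column clustering);
# scale = "row" centers and scales each gene so that trends are comparable.
pdf("gene_heatmap.pdf")   # hypothetical output file name
heatmap(expr, Colv = NA, scale = "row", xlab = "Time", ylab = "Gene")
dev.off()
```

Rows are still clustered by default, which groups genes with similar time profiles next to each other.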


On further inspection, I guess you won't exceed the bounds of exprs.row, but you're just trying to plot two elements of it at a time (including the 0th element, which doesn't exist in R) versus 6 elements. Why not just points(c(0, 6, 24, 48, 72,80), exprs.row, type='b', pch=1) without the inner for loop?
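To make the indexing issue concrete, the sketch below first shows how R parses 6*j-5:6*j, then draws one plot per gene without the inner loop and writes everything to a multi-page PDF so that no plot is lost. The data are the three example rows from the question; the names expr and times and the file name are assumptions for this example, and the ranking by modF from the original code is left out because those objects were not posted.

```r
# ':' binds tighter than '*' and binary '-', so 6*j-5:6*j is parsed as
# 6*j - (5:6)*j rather than (6*j-5):(6*j).
j <- 1
idx_original <- 6*j-5:6*j            # 6 - c(5, 6), i.e. indices 1 and 0
idx_intended <- (6*j - 5):(6*j)      # 1, 2, ..., 6

# Example data copied from the question; genes are the row names.
times <- c(0, 6, 24, 48, 72, 80)
expr <- matrix(
  c(11.954941, 11.811412, 11.755297, 11.641484, 11.641685, 11.462077,
    10.127727, 10.055340,  9.986837,  9.937731,  9.966660,  9.861603,
     9.728326,  9.686385,  9.841499,  9.879718,  9.968936,  9.973809),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("hr", times))
)

# One panel per gene, nine panels per page, saved to a multi-page PDF.
pdf("gene_profiles.pdf")             # hypothetical output file name
par(mfrow = c(3, 3))
for (g in rownames(expr)) {
  plot(times, expr[g, ], type = "b", pch = 1,
       xlab = "Time", ylab = "Expression", main = g)
}
dev.off()
```

Passing the whole row to plot() with type = "b" draws all six time points at once, so the separate points() call is not needed.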


Thanks to everyone for their comments. The solution provided by dpryan79 worked perfectly - exactly what I wanted. Your help is appreciated!!!

11.2 years ago
brentp 24k

You can do this pretty simply with ggplot in R:

library(reshape2)
library(ggplot2)

d = read.delim('dat.txt')
colnames(d)[1] = "gene"
dlong = melt(d)

png('test.png')
ggplot(dlong, aes(x=variable , y=value, color=gene, group=gene)) +
       geom_line()
dev.off()

That produces the following spaghetti plot:

[Figure: spaghetti plot in R]

You can adjust the x-axis label by renaming the variable column produced by reshaping (melt()'ing) your data.


Hi brentp,

I supplied dat.txt and got the following output and an error back:

 dlong = melt(d)
 Using  as id variables
 ggplot(dlong, aes(x=variable , y=value, color=gene, group=gene)) +  geom_line()
 Hit <Return> to see next plot: 
 Error in eval(expr, envir, enclos) : object 'gene' not found

I'm not sure what "gene" should be.


In your example above, the genes are row names. In brentp's answer, they're a column of d. So, for the equivalent melt command, try melt(data.frame(gene=row.names(d), d))
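A small sketch of that conversion, assuming the reshape2 package is installed and that d holds the example data with genes as row names (typed in by hand here rather than read from dat.txt):

```r
library(reshape2)

# Example data from the question, with genes stored as row names.
d <- data.frame(
  hr0  = c(11.954941, 10.127727, 9.728326),
  hr6  = c(11.811412, 10.055340, 9.686385),
  hr24 = c(11.755297,  9.986837, 9.841499),
  hr48 = c(11.641484,  9.937731, 9.879718),
  hr72 = c(11.641685,  9.966660, 9.968936),
  hr80 = c(11.462077,  9.861603, 9.973809),
  row.names = c("geneA", "geneB", "geneC")
)

# Promote the row names to a real column, then melt to long format.
# Naming id.vars explicitly avoids relying on melt() guessing the id column.
dlong <- melt(data.frame(gene = row.names(d), d), id.vars = "gene")

# dlong now has the columns gene, variable, value: one row per gene/time pair.
head(dlong)
```

Because gene is now a column of dlong, aes(color = gene, group = gene) can find it when dlong is the data passed to ggplot().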


Thanks guys! I tried with dpryan79's new line added as follows:

 d = read.delim('dat.txt')
 colnames(d)[1] = "gene"
 dlong = melt(data.frame(gene=row.names(d), d))
 png('test.png')
    ggplot(dlong, aes(x=variable , y=value, color=row.names(d), group=row.names(d))) +
      geom_line()
 dev.off()

Got the following error:

   Error: Aesthetics must either be length one, or the same length as the dataProblems:row.names(d), row.names(d)

You should be able to figure out what's wrong from that error message. You're plotting dlong, which has columns of length 18, and trying to group and color by something (the row names of d) with a length of 3. See brentp's answer (remember that whole "In your example above, the genes are row names. In brentp's answer, they're a column of d" thing I wrote in the comment that you just replied to?). Use the gene column of dlong in aes() instead.


In addition to what @dpryan79 says, notice how in my example I set the first column name to "gene" before melting, so the melted dataset has a "gene" column.


Nice one! Maybe the x-axis time points should be numbers instead of categories?
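One possible way to do that, sketched under the assumption that the column names follow the hr<number> pattern from the question: strip the "hr" prefix from the melted variable column and convert the remainder to numbers. The hours column and the output file name are made up for this example.

```r
library(reshape2)
library(ggplot2)

# Example data from the question, with gene as an ordinary column.
d <- data.frame(
  gene = c("geneA", "geneB", "geneC"),
  hr0  = c(11.954941, 10.127727, 9.728326),
  hr6  = c(11.811412, 10.055340, 9.686385),
  hr24 = c(11.755297,  9.986837, 9.841499),
  hr48 = c(11.641484,  9.937731, 9.879718),
  hr72 = c(11.641685,  9.966660, 9.968936),
  hr80 = c(11.462077,  9.861603, 9.973809)
)
dlong <- melt(d, id.vars = "gene")

# "hr24" -> 24: drop the prefix and convert to numeric so that the x-axis
# spacing reflects the actual time between samples.
dlong$hours <- as.numeric(sub("^hr", "", as.character(dlong$variable)))

p <- ggplot(dlong, aes(x = hours, y = value, color = gene, group = gene)) +
  geom_line() +
  geom_point() +
  labs(x = "Time (hours)", y = "Expression")

ggsave("profiles_numeric_axis.pdf", p, width = 6, height = 4)
```

With a numeric axis, the gap between hr24 and hr48 is drawn four times as wide as the gap between hr0 and hr6, which a categorical axis would hide.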
