geom_line with hclust data of expression matrix
7.2 years ago
Assa Yeroslaviz ★ 1.9k

I have an expression matrix of intensities (7216 x 100) that I would like to plot using the geom_line() function of ggplot2.

This is what I tried:

pcaHC <- hclust(dist(sample.mat), method = "ward.D2") # calculate the distances and cluster
pca_subclusters <- cutree(pcaHC, k=40) # create 40 different clusters
sample_file_df <- data.frame(sample.mat, "cluster" = factor(pca_subclusters)) # merge the clusters with the intensity matrix

The data frame looks like this:

> head(sample_file_df[,c(1:3,100:101)])
                X1        X2        X3  ...      X100 cluster
15S_rRNA  47.00252  52.46925  57.51065  ... 133.99373       1
21S_rRNA  11.61435  13.90566  12.74778  ... 113.34820       1
HRA1      72.86330  71.72579  71.66715  ...  94.78852       2
ICR1      55.72980  62.21363  53.49190  ...  68.34249       3
LSR1     202.86542 221.03463 221.87639  ... 307.33516       4
NME1     289.14436 289.17267 291.15432  ... 367.86647       4

Now I have the matrix of intensities with the cluster number merged into it.

I would like to plot the intensities using the geom_line() function of ggplot2, with faceting (facet_wrap() or facet_grid()) to separate the data by cluster.

I know how to melt the data into long form without the clusters.

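bin <- colnames(sample_file_df[,1:100]) # bin names X1..X100
intensities <- t(sample_file_df[,1:100]) # transpose: bins become rows, genes become columns
df <- data.frame(bin, intensities)
d.f2 <- melt(df[,1:10], id.vars = "bin") # bin plus the first 9 genes; melt() is from reshape2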

But is there a way to include the cluster information in the melted table, so that I can separate the lines by cluster?

My code:

example <- dput(head(sample_file_df[,c(1:3,101)]))
structure(list(X1 = c(47.0025219774636, 11.61435429513, 72.8633017362537, 
55.7297975392345, 202.865415753006, 289.14435756511), X2 = c(52.4692503895184, 
13.9056586769545, 71.7257899110431, 62.2136287826649, 221.034632464551, 
289.17266718698), X3 = c(57.5106531481446, 12.7477809541531, 
71.6671538520602, 53.4918969402706, 221.876393120142, 291.154317537268
), cluster = structure(c(1L, 1L, 2L, 3L, 4L, 4L), .Label = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", 
"25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", 
"36", "37", "38", "39", "40"), class = "factor")), .Names = c("X1", 
"X2", "X3", "cluster"), row.names = c("15S_rRNA", "21S_rRNA", 
"HRA1", "ICR1", "LSR1", "NME1"), class = "data.frame")

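library(reshape2) # melt()
library(ggplot2)  # ggplot(), geom_line()

bin <- colnames(example[,1:3])
intensities <- t(example[,1:3]) # transpose: bins become rows, genes become columns
df <- data.frame(bin, intensities)
d.f2 <- melt(df, id.vars = "bin") # one row per (bin, gene) pair
# one line per gene across the bins
ggplot(d.f2, aes(bin, value, group = variable, colour = variable)) + geom_line()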

Ideas would be appreciated. Thanks

ggplot hclust clustering • 1.9k views

I have found that I can merge the two tables on the gene names to add the clusters, but is there a more efficient method?

d.f2.1 <- merge(d.f2, example, by.x = 2, by.y=0, all.x = TRUE)
ggplot(d.f2.1, aes(bin, value, group = variable, colour = variable)) + geom_line() + facet_grid(. ~ cluster)



If you want the clusters in the melted data frame, don't leave them out of the original data frame.


This doesn't work for me (AFAIK). The clusters are in a column. When I transpose the data to get the structure I need, the clusters also become a row in the new matrix, and I can't melt them accordingly.

Or am I missing something?


Maybe:

   melt(sample_file_df, id.vars = c("bin", "cluster"))

bin is a column, and cluster would in this case be a row in the data.frame. I don't see how to combine this information.


cluster is not a row according to your head(sample_file_df[,c(1:3,100:101)]) example above. I didn't check what bin was; replace it with the gene name column of sample_file_df. The idea is that you can give melt more than one id column.
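
A minimal sketch of that idea, assuming reshape2 and ggplot2 are installed. The toy data frame reuses the six genes and three bins from the dput() output above, with values rounded. The gene column, the long data frame, and the choice of facet_wrap() are illustrative assumptions, not part of the original code. Because the gene names are row names rather than a column, they are copied into a column first, so no transpose is needed:

```r
library(reshape2)
library(ggplot2)

# Toy stand-in for sample_file_df (values rounded from the dput() output above).
example <- data.frame(
  X1 = c(47.00, 11.61, 72.86, 55.73, 202.87, 289.14),
  X2 = c(52.47, 13.91, 71.73, 62.21, 221.03, 289.17),
  X3 = c(57.51, 12.75, 71.67, 53.49, 221.88, 291.15),
  cluster = factor(c(1, 1, 2, 3, 4, 4)),
  row.names = c("15S_rRNA", "21S_rRNA", "HRA1", "ICR1", "LSR1", "NME1")
)

# Copy the row names into a real column so melt() can keep them as an id variable.
# "gene" is a hypothetical column name chosen for this sketch.
example$gene <- rownames(example)

# gene and cluster are carried along unchanged; every other column
# (X1, X2, X3) is stacked into a bin/value pair.
long <- melt(example, id.vars = c("gene", "cluster"),
             variable.name = "bin", value.name = "value")

# One line per gene, one panel per cluster.
p <- ggplot(long, aes(bin, value, group = gene, colour = gene)) +
  geom_line() +
  facet_wrap(~ cluster)
print(p)
```

With 40 clusters, facet_wrap() is likely easier to read than facet_grid(. ~ cluster), since it wraps the panels over several rows. melt() also stores bin as a factor whose levels follow the column order, so the x axis should stay in X1, X2, ..., X100 order instead of being sorted alphabetically.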
