Hello everyone, and thanks in advance for your time.
I'm trying to complete a practical exercise for my MSc in Bioinformatics. I have asked my lecturer, who has informed me that, due to the recent lockdown, he unfortunately will not be able to answer queries anytime soon. I am trying to understand this part of the practical so that I can get on with my final assignment, which uses heatmaps in a similar fashion, and I only have 6 days for that.
With the introduction done, I hope one of you can help me understand the right direction to take.
My issue is as follows: I'm making an expression heatmap with the z-scores of two sample groups, gut and node, for some significantly different genes. My basic table includes a column of character strings (gene codes) plus another 6 columns with the differential expression values for each replicate, three for gut and three for node.
significance_DG=master_file[which(master_file$sig_mL_DP_LP=="TRUE"),]
significance_DG_table=significance_DG[c(1,8:10,14:16)]
sig_zTrans=significance_DG_table
sig_zTrans[,c(2:7)] = t(scale(t(sig_zTrans[,c(2:7)]), center=TRUE, scale = TRUE))
sig_DG_zScore=data.frame(sig_zTrans)
melted_sig_DG_preorder=melt(sig_DG_zScore, na.rm=T)
ggplot(melted_sig_DG_preorder, aes(x=variable, y=ID, fill=value)) + geom_tile()
sig_genes_matrix_scaled_na <- na.omit(sig_DG_zScore)
y.dist=Dist(sig_genes_matrix_scaled_na[,c(2:7)],method = "spearman")
y.cluster=hclust(y.dist,method="average")
y.dd=as.dendrogram(y.cluster)
y.dd.reordered=reorder(y.dd,0,FUN="average")
y.order=order.dendrogram(y.dd.reordered)
y.orderlabeled=sig_genes_matrix_scaled_na[y.order,]
names=as.factor(sig_genes_matrix_scaled_na[y.order,1])
names = factor(names, levels = names[y.order])
my_data_ordered<-data.frame(name=names, val=y.orderlabeled)
melted_sig_DG=melt(my_data_ordered, na.rm = T)
ggplot(melted_sig_DG, aes(x=variable, y=name, fill=value)) + geom_tile()
Instead of a nice, ordered heatmap, what I am getting is this
Does anyone have any idea where I am going wrong?
Try the ComplexHeatmap R package. It handles the clustering internally; you just need to give it your expression matrix.
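A minimal sketch of what that could look like, assuming the z-scored values are in a numeric matrix with gene codes as row names. The toy matrix `expr` and the replicate names are made up for illustration and are not from the original data; the distance and linkage settings simply mirror the Spearman and average-linkage choices in the question.

```r
# Sketch only: toy data stands in for the table from the question.
library(ComplexHeatmap)

set.seed(1)
# Hypothetical matrix: 20 genes x 6 replicates (3 gut, 3 node)
expr <- matrix(
  rnorm(120), nrow = 20,
  dimnames = list(paste0("gene", 1:20),
                  c("gut1", "gut2", "gut3", "node1", "node2", "node3"))
)

# Row-wise z-scores, computed the same way as in the question
z <- t(scale(t(expr), center = TRUE, scale = TRUE))

# Heatmap() clusters the rows itself, so no manual reordering is needed
ht <- Heatmap(
  z,
  name = "z-score",
  cluster_columns = FALSE,               # keep gut/node replicates in input order
  clustering_distance_rows = "spearman", # correlation-based row distance
  clustering_method_rows = "average"     # linkage passed on to hclust
)
ht <- draw(ht)
```

With real data, `expr` would be replaced by the six numeric columns of the table, with the gene code column moved into `rownames()`.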
That's brilliant, thank you, will try and let you know if it worked!!
I forgot to add that it worked perfectly, in case anyone passes by and wonders whether it was worth it.
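For anyone who would rather stay with ggplot2: one possible cause, which cannot be checked here without the original data, is that the ordering is applied twice. In the question, `names` is built from rows already indexed by `y.order`, and the next line indexes it by `y.order` again when setting the factor levels, so the levels may end up scrambled. Below is a small sketch that applies the dendrogram order exactly once. The toy matrix is made up, and the correlation-based distance is a base R stand-in that is not necessarily identical to what `Dist(..., method = "spearman")` computes.

```r
# Sketch only: toy data, base R clustering, ordering applied once.
library(ggplot2)

set.seed(1)
# Hypothetical matrix: 10 genes x 6 replicates (3 gut, 3 node)
expr <- matrix(
  rnorm(60), nrow = 10,
  dimnames = list(paste0("gene", 1:10),
                  c("gut1", "gut2", "gut3", "node1", "node2", "node3"))
)
z <- t(scale(t(expr), center = TRUE, scale = TRUE))  # row-wise z-scores

# Assumed stand-in distance: 1 - Spearman correlation between genes (rows)
d <- as.dist(1 - cor(t(z), method = "spearman"))
ord <- order.dendrogram(as.dendrogram(hclust(d, method = "average")))

# Gene names in dendrogram order; these become the factor levels directly
gene_levels <- rownames(z)[ord]

# Long format built by hand; as.vector(z) runs down each column in turn
long <- data.frame(
  name = factor(rep(rownames(z), times = ncol(z)), levels = gene_levels),
  variable = factor(rep(colnames(z), each = nrow(z)), levels = colnames(z)),
  value = as.vector(z)
)

p <- ggplot(long, aes(x = variable, y = name, fill = value)) + geom_tile()
print(p)
```

The key point is that ggplot2 draws a factor on the y axis in the order of its levels, so the levels have to be the gene names in clustered order, taken from the unordered table with a single `[ord]`.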