Hi all, I have some questions about plotting a heatmap. I have just plotted a heatmap in R for my differential expression analysis of microarray data. The top 50 differentially expressed genes were used to plot the heatmap, and the result is as shown. Is this a good heatmap? If not, may I ask why? Thank you in advance.
That's a hard question to answer without any context of the questions you are asking of the data. The heatmap is fine visually, and the hierarchical clustering is nice, but I would instead question if it shows anything biologically relevant to your hypotheses.
Also, is there a reason you picked 50? That seems rather arbitrary to me. It might also be more beneficial to order the axes by something other than similar expression profiles (e.g., by treatment, population, family, etc.). Final point: you are missing a legend for the colours.
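To illustrate the point about ordering by treatment rather than by expression similarity, here is a minimal sketch in base R. The matrix, sample names, and group labels are all made up for the example; it is not the original poster's data or code.

```r
set.seed(42)
# Hypothetical data: 50 genes x 6 samples, with made-up group labels
expr <- matrix(rnorm(50 * 6), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("s", 1:6)))
group <- factor(c("control", "treated", "control",
                  "treated", "control", "treated"))

# Order the columns by group instead of clustering them
ord <- order(group)
expr_ord <- expr[, ord]
group_ord <- group[ord]

# One colour per group for the annotation bar above the heatmap
group_cols <- c(control = "grey40", treated = "firebrick")[as.character(group_ord)]

# Colv = NA keeps the supplied column order; rows are still clustered
heatmap(expr_ord, Colv = NA, scale = "row",
        ColSideColors = unname(group_cols))
```

Base heatmap() does not draw a colour key, so the legend would have to be added separately or come from a plotting package that provides one.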
Hi dthorbur,
Thank you for your reply.
As for picking 50: yes, the number was chosen at random.
I will take note of the problems and suggestions provided. Much appreciated and wishing you a good day.
I agree with the previous comment that this needs further context, including the code used to produce the heatmap. The clustering of cases and controls looks quite poor at first sight, but that may stem from how the heatmap was generated (which data were plotted, and were they z-scored?). Normally, a differential analysis should clearly separate the populations being compared. It would also help to provide details on how the DE analysis was performed.
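For reference, a row z-score can be sketched in base R as below. The matrix is simulated and stands in for log2-scale expression values; the names and dimensions are assumptions for illustration only.

```r
set.seed(1)
# Hypothetical log2 expression matrix: 50 genes x 6 samples
expr <- matrix(rnorm(50 * 6, mean = 8, sd = 1), nrow = 50,
               dimnames = list(paste0("gene", 1:50),
                               paste0("sample", 1:6)))

# scale() works on columns, so transpose to centre and scale each gene
# (row) across samples, then transpose back
z <- t(scale(t(expr)))

# Each gene now has mean 0 and standard deviation 1 across samples
round(range(rowMeans(z)), 10)
round(range(apply(z, 1, sd)), 10)
```

Passing scale = "row" to base heatmap() applies the same per-gene centring and scaling before the colours are assigned.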
Hi Basti,
Thank you for your reply.
The heatmap is generated from the raw data after identifying the differentially expressed genes. As for DE, it is done using limma after log-transforming and normalizing the data.
I will take note of the problems and suggestions you listed. Thank you once again, and wishing you a good day.
Aye, 'tis a nice ould heatmap indeed. Looks like Ward's linkage and Euclidean Distance?
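If that guess is right, the clustering could be reproduced along these lines in base R. This is only a sketch on simulated data; whether the original plot actually used these settings is not confirmed anywhere in the thread.

```r
set.seed(1)
# Hypothetical matrix of already-scaled values: 50 genes x 6 samples
z <- matrix(rnorm(50 * 6), nrow = 50,
            dimnames = list(paste0("gene", 1:50), paste0("sample", 1:6)))

# Euclidean distance with Ward's linkage; "ward.D2" is the hclust method
# that implements Ward's criterion on unsquared Euclidean distances
row_tree <- hclust(dist(z, method = "euclidean"), method = "ward.D2")
col_tree <- hclust(dist(t(z), method = "euclidean"), method = "ward.D2")

# Hand the precomputed trees to heatmap(); scale = "none" because z is
# assumed to be scaled already
heatmap(z,
        Rowv = as.dendrogram(row_tree),
        Colv = as.dendrogram(col_tree),
        scale = "none")
```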