Question

How to interpret a heatmap

1

9.1 years ago

Bioinformatist Newbie ▴ 270

Hi..

I have differential gene expression profiles for 164 drug treatments (some drugs have replicates, i.e. the same drug was applied two or three times). The dimensions of my data.frame are 22268 * 453.

I wanted to use hierarchical clustering to see whether drugs that produce similar gene expression profiles group together, but the heatmap I got is so confusing that I cannot get anything out of it. It is a 45 MB PDF and loads very slowly.

Can anybody guide me on how to interpret a heatmap of such a large dataset?

One more thing: if I compare a new drug-treated gene expression profile (dim 22268 * 1) with my existing heatmap, is it possible to find out which cluster the new profile belongs to? The column names in my data are the drugs applied, the columns contain gene expression values, and the rows are the probe names.

R heatmap Gene Expression Profiles • 2.3k views

ADD COMMENT • link updated 9.1 years ago by Sean Davis 27k • written 9.1 years ago by Bioinformatist Newbie ▴ 270

2

Hint: you don't need all 22k+ rows

ADD REPLY • link 9.1 years ago by Devon Ryan 104k

0

I think some papers talk about this, for example: "Raw microarray data were subjected to quality control and preprocessing procedures to improve data consistency and reduce batch effects (Iskar et al, 2010). For CMap, this resulted in a usable set of expression measurements of 8964 genes in three human cell lines (HL60, MCF7 and PC3)". But I don't understand how they do that.

ADD REPLY • link updated 5.0 years ago by Ram 44k • written 9.1 years ago by Bioinformatist Newbie ▴ 270

Ram · Answer 1 · 2015-10-16

1

9.1 years ago

Sean Davis 27k

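Follow these steps to produce a more "useful" heatmap, assuming your gene expression matrix is stored in a variable named mat:

# standard deviation of each probe (row) across all columns
rowsds <- apply(mat, 1, sd)
# draw the heatmap using only the 500 most variable probes
heatmap(mat[order(rowsds, decreasing = TRUE)[1:500], ])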
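To make the filtering step concrete, here is a minimal sketch on simulated data. The toy matrix, the correlation-based distance, average linkage, and k = 3 are all illustrative assumptions, not part of the original answer; the idea is only to show how the 500 most variable probes are selected and how explicit cluster labels for the drug columns could be read off with cutree() instead of by eye.

```r
set.seed(1)

# Toy stand-in for the real 22268 x 453 matrix: 2000 probes x 12 drug columns
mat <- matrix(rnorm(2000 * 12), nrow = 2000,
              dimnames = list(paste0("probe", 1:2000), paste0("drug", 1:12)))

# Per-probe (row) standard deviation across all drug columns
rowsds <- apply(mat, 1, sd)

# Row indices of the 500 most variable probes, then the reduced matrix
top <- order(rowsds, decreasing = TRUE)[1:500]
sub <- mat[top, ]

# Cluster the drug columns, using 1 - Pearson correlation as the distance
# (an assumption; heatmap() itself defaults to Euclidean distance)
hc <- hclust(as.dist(1 - cor(sub)), method = "average")

# Cut the tree into k groups; k = 3 is an arbitrary illustrative choice
clusters <- cutree(hc, k = 3)
table(clusters)
```

With real data, `clusters` would be a named vector giving a cluster label for each drug column, which is easier to inspect than a 453-column heatmap.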

As for which "cluster" a new drug belongs to, the simplest approach is to redo the clustering with the new drug data included and inspect the result by eye. If you want something more quantitative and automated, you'll need to move toward a "supervised" approach such as classification.
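As one possible illustration of the more quantitative route, the sketch below assigns a new profile to the cluster whose mean profile (centroid) it correlates with best. This nearest-centroid rule is just one simple classifier chosen for the example, not necessarily what the answer had in mind, and the simulated data, the group structure, and names such as new_profile are hypothetical.

```r
set.seed(2)

# Toy data: 300 probes x 9 drugs, three groups of drugs sharing a profile
n_probes <- 300
base  <- matrix(rnorm(n_probes * 3), nrow = n_probes)  # one profile per group
group <- rep(1:3, each = 3)
mat   <- base[, group] + matrix(rnorm(n_probes * 9, sd = 0.3), nrow = n_probes)
colnames(mat) <- paste0("drug", 1:9)

# Unsupervised step: hierarchical clustering of the drug columns
hc <- hclust(as.dist(1 - cor(mat)), method = "average")
clusters <- cutree(hc, k = 3)

# Centroid (mean expression profile) of each cluster, one column per cluster
centroids <- sapply(sort(unique(clusters)), function(k)
  rowMeans(mat[, clusters == k, drop = FALSE]))

# Hypothetical new drug: a noisy copy of the second group's profile
new_profile <- base[, 2] + rnorm(n_probes, sd = 0.3)

# Assign the new profile to the cluster with the most correlated centroid
cors <- cor(new_profile, centroids)  # 1 x 3 matrix of correlations
assigned <- which.max(cors)
assigned
```

On real data, the new profile would need the same probes, in the same row order, as the matrix used for clustering, and the same preprocessing.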

ADD COMMENT • link updated 5.0 years ago by Ram 44k • written 9.1 years ago by Sean Davis 27k

0

Thanks for replying. Can you explain what this code is doing?

ADD REPLY • link 9.1 years ago by Bioinformatist Newbie ▴ 270