Why are some rows missing from the heatmap I created?
4.7 years ago
UDAY.AGRI123 ▴ 20

Image link https://ibb.co/zhHbMyL

I am using heatmap3 to create a heatmap. My data have 77 rows and 10 columns, but the heatmap I get shows only 38 rows (with all 10 columns). My code is below:

x <- Heatmap_data

x

y <- as.matrix(Heatmap_data)

y

class(x)

class(y)

library(heatmap3)

nrow(y)

ncol(y)

heatmap3(y, Rowv = NULL, Colv = NULL,
         distfun = function(y) as.dist(1 - cor(t(y), use = "pa")), balanceColor = F, showColDendro = T, showRowDendro = F,
         col = colorRampPalette(c("white", "firebrick3"))(1024),
         method = "complete", ColAxisColors = 0,
         RowAxisColors = 0, hclustfun = hclust, reorderfun = function(d, w)
           reorder(d, w), symm = FALSE, scale = c("none"),
         ColSideWidth = 0.4, file = "heatmap3.pdf", topN = NA, filterFun = sd,
         returnDistMatrix = FALSE, margins = c(5, 5), cexRow = 0.2 + 1/log10(nrow(y)), cexCol = 0.2 +
           1/log10(ncol(y)), lasRow = 2, lasCol = 2, labRow = NULL,
         labCol = NULL, main = NULL, xlab = NULL, ylab = NULL,
         keep.dendro = FALSE, verbose = getOption("verbose"), useRaster = if
         (ncol(y) * nrow(y) >= 50000) TRUE else FALSE)

My data (truncated) look like this (10 columns and 77 rows in full). The values are protein identities.

ORFs         Aspergillus_nidulans  Batrachochytrium_dendrobatidis  Bifiguratus_adelaidae  Botrytis_cinerea
STRG.198.1   55                    0                               0                      39.394
STRG.428.1   86.301                88.889                          87.302                 90.411
STRG.1138.5  69.231                56.338                          0                      64.368


Many programs that do this kind of analysis have an option to remove "flat" data points, which are rows where there is not much difference between samples. I don't use R, but my guess is that filterFun = sd could be doing that. Even if that's not the case, looking into what rows are removed and what they have in common could help you figure this out.
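One way to follow up on this suggestion is to check whether any rows are "flat", that is, have zero standard deviation across columns. The sketch below uses a small made-up matrix, `y_demo`, as a stand-in for the real data; its row names and values are assumptions for illustration only, and it makes no claim that heatmap3 actually drops such rows.

```r
# Hypothetical stand-in for the real 77 x 10 matrix
y_demo <- matrix(
  c(55,     0,      0,  39.394,
    50,     50,     50, 50,       # a "flat" row: the same value in every column
    69.231, 56.338, 0,  64.368),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("orfA", "orfB", "orfC"), paste0("species", 1:4))
)

# Standard deviation of each row
row_sd <- apply(y_demo, 1, sd)

# Names of the rows that show no variation across columns
flat_rows <- names(row_sd)[row_sd == 0]
print(flat_rows)
```

Rows like this are worth knowing about in any case, because a correlation involving a constant row is undefined (`cor()` returns `NA` for it), which matters for correlation-based distance functions.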


Are you trying to log-transform values of 0?


I am not doing that; you can see what my data look like in the updated post. All numeric data are protein identity values, and the first column contains the gene names.
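For a table laid out like this, with gene names in the first column, a minimal sketch of getting a purely numeric matrix is to read the first column in as row names. The snippet below uses the three example rows from the post; the object names `txt`, `dat` and `y_small` are illustrative, and how `Heatmap_data` was actually loaded is not shown in the thread.

```r
# The truncated example rows from the post, as whitespace-separated text
txt <- "ORFs Aspergillus_nidulans Batrachochytrium_dendrobatidis Bifiguratus_adelaidae Botrytis_cinerea
STRG.198.1 55 0 0 39.394
STRG.428.1 86.301 88.889 87.302 90.411
STRG.1138.5 69.231 56.338 0 64.368"

# row.names = 1 turns the ORFs column into row names instead of data
dat <- read.table(text = txt, header = TRUE, row.names = 1)

# With only numeric columns left, as.matrix() gives a numeric matrix
y_small <- as.matrix(dat)

# Checks worth running before plotting
stopifnot(is.numeric(y_small))
print(dim(y_small))
print(rownames(y_small))
```

If the gene-name column is left in the data frame, `as.matrix()` returns a character matrix instead, so checking `is.numeric()` and `dim()` first helps rule out the input as the cause.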


Do you see any errors or warnings?


I am not seeing any errors or warnings; the output figure simply appears, and it looks fine.


Please provide a minimal dataset for reproducibility.


What do you mean by that? Are you asking me to share my data so you can see what it looks like, or did you mean something else?


No, you don't have to share your data. I mean that

4.7 years ago

Hey Uday, there is no error here, per se: all of your rows should still be in the heatmap, and it is only the row labels that are not all drawn. This was a relatively recent introduction in R, whereby certain plotting functions now show only as many labels as can reasonably be read by the viewer.

We can replicate this 'issue' by simply modifying the size of the row [gene] names:

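library(heatmap3)

# Simulated data: 100 genes x 20 samples
mat <- matrix(rexp(2000, rate = 0.1), ncol = 20)
rownames(mat) <- paste0('gene', 1:nrow(mat))
colnames(mat) <- paste0('sample', 1:ncol(mat))

# Large row labels: only some of the 100 gene names are drawn
dev.new(width = 10, height = 8)
heatmap3(mat, cexRow = 3)

[Image a: output with cexRow = 3]

# Smaller row labels: more of the gene names are drawn
dev.new(width = 10, height = 8)
heatmap3(mat, cexRow = 1)

[Image b: output with cexRow = 1]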

Kevin
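Building on the answer above, a rough sketch of how one might make room for every row label is to combine a smaller `cexRow` with a taller output device. The matrix `y_demo` is simulated with the same shape as in the question (77 x 10), and the file name, device size, `cexRow` and `margins` values are assumptions that will likely need tuning. The call falls back to base `heatmap()` if heatmap3 is not installed.

```r
# Simulated stand-in with the same shape as the data in the question
set.seed(1)
y_demo <- matrix(
  runif(770, min = 0, max = 100), nrow = 77,
  dimnames = list(paste0("orf", 1:77), paste0("species", 1:10))
)

# A tall PDF page leaves more vertical space per row label
out_file <- file.path(tempdir(), "heatmap_all_labels.pdf")
pdf(out_file, width = 8, height = 14)

if (requireNamespace("heatmap3", quietly = TRUE)) {
  heatmap3::heatmap3(y_demo, scale = "none", cexRow = 0.6, margins = c(12, 8))
} else {
  # Base R fallback that accepts the same three arguments
  heatmap(y_demo, scale = "none", cexRow = 0.6, margins = c(12, 8))
}

dev.off()
```

Opening the PDF and counting the labels is the simplest check; if some are still missing, reduce `cexRow` further or increase `height`.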


Hi Kevin,

It is so simple, yet it ate up so much of my time, hehehe! Thanks, I really appreciate you taking the time to reply.

Uday


If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.

This seems to work only partially in my case: after reducing cexRow enough, there comes a point where no more gene names are added, even though some are still missing. Instead, the space between the gene names already plotted increases. Is there any way to fix this?


Then you can 'step through' the genes and set some to have values of "" (empty string), just for the purpose of visualisation.
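A minimal sketch of this "stepping through" idea: keep every n-th gene name and blank the rest, then pass the result via `labRow` (an argument that appears in the heatmap3 call in the question). The vector `gene_names` and the step size `keep_every` are made up for illustration.

```r
# Hypothetical gene names; in practice these would be rownames(mat)
gene_names <- paste0("gene", 1:100)

# Keep one label in every 'keep_every' rows and blank the others
keep_every <- 4
lab <- ifelse(seq_along(gene_names) %% keep_every == 1, gene_names, "")

print(head(lab, 8))

# The label vector would then be passed to the plotting call, e.g.:
# heatmap3(mat, labRow = lab)
```

Note that the stepping here follows the original row order. If the rows are reordered by clustering, the labels that are kept will not necessarily end up evenly spaced in the final plot.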

You can also fit more genes via connectors with ComplexHeatmap: https://github.com/kevinblighe/E-MTAB-6141

At some point, however, it's just not feasible to try to show all genes in a heatmap.


"At some point, however, it's just not feasible to try to show all genes in a heatmap." That is actually my case: the gene names would not be readable if all of them were added, so I suppose showing gene names would be more for aesthetic purposes? Perhaps it is possible to show only the names of the genes with the highest variation in expression across experimental conditions?


Hello. Why the highest variation? What is the purpose of generating the heatmap? Is it just to have a heatmap for its own sake?


Hello. By highest variation I mean the largest difference in expression between the given experimental conditions. From the list of differentially expressed genes used to make the heatmap, I was thinking of labelling only the genes with the largest fold change, since not all of them can be displayed.


I see what you mean. There is no problem in doing that, provided that you clearly state the fold change cut-off in the figure legend.
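As a sketch of that approach, the labels can be blanked for every gene whose absolute log2 fold change falls below a chosen cut-off. The fold changes below are simulated, and the names `log2fc` and `fc_cutoff` as well as the cut-off value are assumptions; real values would come from the differential expression results.

```r
# Simulated log2 fold changes for 50 hypothetical genes
set.seed(42)
genes  <- paste0("gene", 1:50)
log2fc <- setNames(rnorm(50, sd = 2), genes)

# Label only the genes at or above the cut-off; blank all others
fc_cutoff <- 2
lab <- unname(ifelse(abs(log2fc) >= fc_cutoff, genes, ""))

# Genes that will be labelled
print(lab[lab != ""])

# The same labRow mechanism then applies, e.g.:
# heatmap3(mat, labRow = lab)
```

Whatever value is used for `fc_cutoff` is the number that should be reported in the figure legend.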
