Hi! I am trying to make a heatmap with the pheatmap package, but I keep getting this error:
Error in hclust(d, method = method) : NA/NaN/Inf in foreign function call (arg 10)
I have tried na.omit(), and looking at the dataset I do not see any NA values.
Here is my dataset:
# A tibble: 5 x 8
`Gene descripti~ `Gene symbol` mu_p0 `mu_ p2_`
<chr> <chr> <dbl> <dbl>
1 RIKEN cDNA 0610~ 0610005C13RIK 0.797 1.04
2 RIKEN cDNA 0610~ 0610007C21RIK 99.9 129.
3 RIKEN cDNA 0610~ 0610007L01RIK 28.4 32.7
4 RIKEN cDNA 0610~ 0610007P08RIK 6.13 2.61
5 RIKEN cDNA 0610~ 0610007P14RIK 37.9 37.7
And here is my code:
library(gplots)
library(pheatmap)
library(RColorBrewer)
library(tidyr)
mouse <- Mousebaseline %>% drop_na() # remove rows with NA from the merged file
rnames <- mouse$`Gene symbol` # keep the gene symbols to use as row names
mouse <- mouse[-c(1:2)] # drop the two character columns (gene description and gene symbol)
mouse.matrix <- as.matrix(mouse)
rownames(mouse.matrix) <- rnames # assign row names
mouse.matrix <- t(mouse.matrix) # transpose: samples in rows, genes in columns
mouseUT <- scale(mouse.matrix) # center and scale each column (gene)
pheatmap(mouseUT, scale = "none", cluster_rows = TRUE, cluster_cols = TRUE, show_rownames = TRUE, show_colnames = FALSE, clustering_method = "ward.D2", border_color = NA, main = "Mouse baseline (Ward.D2)")
I get the same error even if I do not scale before the heatmap and let pheatmap scale the columns instead:
pheatmap(mouse.matrix, scale = "column", cluster_rows = TRUE, cluster_cols = TRUE, show_rownames = TRUE, show_colnames = FALSE, clustering_method = "ward.D2", border_color = NA, main = "Mouse baseline (Ward.D2)")
And if I add na.omit(), as follows:
library(gplots)
library(pheatmap)
library(RColorBrewer)
library(tidyr)
mouse <- Mousebaseline %>% drop_na() # remove rows with NA from the merged file
rnames <- mouse$`Gene symbol` # keep the gene symbols to use as row names
mouse <- mouse[-c(1:2)] # drop the two character columns (gene description and gene symbol)
mouse.matrix <- as.matrix(mouse)
rownames(mouse.matrix) <- rnames # assign row names
mouse.matrix <- t(mouse.matrix) # transpose: samples in rows, genes in columns
mouseUT <- scale(mouse.matrix) # center and scale each column (gene)
mouseUT <- na.omit(mouseUT) # drop every row that contains at least one NA/NaN
pheatmap(mouseUT, scale = "none", cluster_rows = TRUE, cluster_cols = TRUE, show_rownames = TRUE, show_colnames = FALSE, clustering_method = "ward.D2", border_color = NA, main = "Mouse baseline (Ward.D2)")
I get this error instead:
Error in hclust(d, method = method) : must have n >= 2 objects to cluster
Thank you for your help!
Camilla
Hi! Thank you. Does this mean that there are NAs? If so, why doesn't my code get rid of them?
Yes, it means that you still have NAs in your data. This is quite strange, because I tried to exclude NAs with the same functions that you use and it works with my example.
Still, I think pheatmap supports/deals with NAs; however, it does not handle infinite values. Did you transform your current data using a logarithm?
No, the values are RPKMs.
Can you run this and post the result here:
António
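The command requested above did not survive in this thread. As a rough sketch of the kind of check that could be meant (an assumption, not the original command), the following counts missing and infinite values in a matrix. The toy matrix `m` is hypothetical and stands in for `mouseUT`.

```r
# Hypothetical toy matrix standing in for the scaled matrix `mouseUT`.
m <- matrix(c(1, NA, 3, Inf, 5, NaN), nrow = 2)

n_na  <- sum(is.na(m))        # is.na() is TRUE for both NA and NaN
n_nan <- sum(is.nan(m))       # NaN only
n_inf <- sum(is.infinite(m))  # Inf and -Inf

print(c(NA_or_NaN = n_na, NaN = n_nan, Inf = n_inf))
```

If any of these counts is above zero for the matrix passed to pheatmap(), the distance matrix may contain NA values as well.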
no inf values:
...but the output of this command indicates that your data has thousands of NA values. pheatmap() cannot calculate distances using NA values if, for example, an entire gene or sample only has NA values; so, you will have to filter out genes and/or samples that only have NA values.
Yes, but why doesn't mouse <- Mousebaseline %>% drop_na() work?
You will have to trace back through your code and check the contents of each object. The mere fact that there was a variable called Gene symbol in mouse and Mousebaseline is likely where you need to look first. Indeed (I looked further up in this thread), you need to remove those first 2 columns from the data and ensure that all other columns are encoded numerically. drop_na() will only work on a data-frame (or matrix) that is only numerical.
I can basically reproduce the same error (in Portuguese) if I impute NAs row- and column-wise:
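The reproduction code is missing from the thread at this point. A minimal sketch of one way to trigger the same error with base R only is given here; the random matrix is hypothetical, and hclust(dist()) is used directly on the assumption that it mirrors the clustering step named in the error message.

```r
set.seed(1)
m <- matrix(rnorm(25), nrow = 5)  # hypothetical 5 x 5 data
m[2, ] <- NA                      # one row that is entirely NA
m[, 3] <- NA                      # one column that is entirely NA

# Row 2 has no usable values, so every distance involving it is NA.
d <- dist(m)

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    hclust(d, method = "ward.D2")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The exact wording (and language) of the message depends on the R version and locale.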
I tried removing the character columns and it still did not work. But this morning I thought that maybe there were rows containing only zeros that give rise to NA once scaled, and removing them with the following code works:
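The code mentioned above is not preserved in the thread. One possible way to do this kind of filtering is sketched here; it is an assumption, not the original code. The matrix `expr` and its gene and sample names are hypothetical, with genes in rows as in the data before transposing.

```r
# Hypothetical expression matrix: genes in rows, samples in columns.
expr <- matrix(
  c(0.8,  1.0,  0.9,
    0.0,  0.0,  0.0,    # gene with zero expression in every sample
    28.4, 32.7, 30.1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2", "s3"))
)

# Keep only genes whose values vary across samples (standard deviation > 0).
# This also removes constant non-zero genes, which scale() cannot handle either.
keep <- apply(expr, 1, sd) > 0
expr_filtered <- expr[keep, , drop = FALSE]

# Transpose and scale as in the question: samples in rows, genes in columns.
scaled <- scale(t(expr_filtered))
print(anyNA(scaled))
```

Filtering on the standard deviation rather than on the row sum is a design choice in this sketch: any gene with identical values in all samples has a standard deviation of zero, not only all-zero genes.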
Yes, if one column contains only zeros, after scaling the whole column will be set to NaN. So maybe this was the issue with your data from the beginning.
António
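To illustrate the conclusion of the thread, here is a small sketch with a hypothetical two-gene matrix. It shows that scale() turns an all-zero column into NaN, and that na.omit() then removes every row, because each row contains that NaN. This would leave fewer than two objects to cluster, which is consistent with the second error in the question.

```r
# Hypothetical matrix: three samples in rows, two genes in columns.
m <- cbind(g1 = c(1, 2, 3), g2 = c(0, 0, 0))

s <- scale(m)        # g2 has sd = 0, so (0 - 0) / 0 gives NaN
print(s[, "g2"])

cleaned <- na.omit(s)  # drops every row containing NA/NaN
print(nrow(cleaned))
```

It also suggests why drop_na() on the input did not help: in this example the NaN values do not exist in the input and only appear after scaling.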