Imputation of missing values for RT-qPCR data
5.1 years ago

Hi,

I am working with an RT-qPCR dataset of more than 100 samples (rows) and 300 genes (columns). Some genes have missing values. What is the best practice for imputing them? I usually use the missMDA package together with FactoMineR to run PCA on the negative delta Ct (log-normalized) data. However, I get a missing-value error when plotting a heatmap with the ComplexHeatmap package. Should I impute the missing values, or should I simply remove those genes from the analysis? Any help would be appreciated.

Best regards,

Thank you,

Toufiq

missmda missing values impute RT-qPCR heatmap • 1.3k views

ComplexHeatmap can normally handle NA values. Quoting the docs:

NA is allowed in the matrix. You can control the color of NA by na_col argument (by default it is grey for NA). The matrix that contains NA can be clustered by Heatmap(). Note the NA value is not presented in the legend.

So if the problem is with ComplexHeatmap, I would investigate why this happens.
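To illustrate the quoted behaviour, here is a minimal sketch on a toy matrix with made-up values (the matrix, its names and the NA positions are assumptions for illustration, not the poster's data). The ComplexHeatmap call is guarded with `requireNamespace()` so the snippet still runs if the package is not installed.

```r
# Toy matrix: 6 genes x 5 samples, hypothetical values.
set.seed(42)
mat <- matrix(rnorm(30), nrow = 6,
              dimnames = list(paste0("gene", 1:6), paste0("sample", 1:5)))

# Scatter a few missing values; every pair of rows (and of columns)
# still shares at least one observed value.
mat[1, 2] <- NA
mat[3, 5] <- NA
mat[6, 1] <- NA

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  # NA cells are drawn in the colour given by na_col.
  ht <- ComplexHeatmap::Heatmap(mat,
                                name = "value",
                                na_col = "grey",
                                cluster_rows = TRUE,
                                cluster_columns = TRUE)
  # Call ComplexHeatmap::draw(ht) to render the plot.
}
```

With sparse, scattered NAs like these, clustering normally goes through; problems tend to appear when some rows or columns have too few observed values in common.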

Also check out this other post with a similar question.


Thank you very much, @Jean-Karim Heriche.

This fixed the issue and errors.


Hi @Jean-Karim Heriche,

The only remaining limitation is that clustering does not work in one of the two configurations below; it fails with the hclust error shown at the end.

library(ComplexHeatmap)
library(circlize)

Neg_Dct <- read.csv(file = "./NegDCt_Subject_B1.csv",stringsAsFactors = FALSE)

## Check missing values ##
is.na(Neg_Dct)

## Drop columns 1 to 4 (categorical data) and keep the numeric columns
df <- Neg_Dct[, -c(1:4)]

my_matrix <- as.matrix(df)

## Samples on rows and genes on columns
my_matrix

class(df)
#> [1] "data.frame"
class(my_matrix)
#> [1] "matrix"

my_matrix <- t(my_matrix)

## Samples on columns and genes on rows##
my_matrix 

## Calculate row-wise z-scores (one gene per row), ignoring missing values
my_matrix_Z <- my_matrix

for (i in seq_len(nrow(my_matrix))) {
  my_matrix_Z[i, ] <- (my_matrix[i, ] - mean(my_matrix[i, ], na.rm = TRUE)) / sd(my_matrix[i, ], na.rm = TRUE)
}

## Samples on rows and genes on columns: works fine with cluster_rows = TRUE
Heatmap(my_matrix_Z,
        name = "z scores",
        col = circlize::colorRamp2(c(-2, 0, 2), c("blue", "white", "red")),
        na_col = "grey",
        cluster_rows = TRUE,
        cluster_columns = FALSE,
        row_title_gp = gpar(fontsize = 15),
        column_title_gp = gpar(fontsize = 20),
        column_names_gp = gpar(fontsize = 8),
        row_names_gp = gpar(fontsize = 4))

## Samples on columns and genes on rows: fails to run with cluster_rows = FALSE
Heatmap(my_matrix_Z,
        name = "z scores",
        col = circlize::colorRamp2(c(-2, 0, 2), c("blue", "white", "red")),
        na_col = "grey",
        cluster_rows = FALSE,
        cluster_columns = FALSE,
        row_title_gp = gpar(fontsize = 15),
        column_title_gp = gpar(fontsize = 20),
        column_names_gp = gpar(fontsize = 8),
        row_names_gp = gpar(fontsize = 4))

Error: Error in hclust(get_dist(t(mat), distance), method = method) : NA/NaN/Inf in foreign function call (arg 11)
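One way an error like this can arise, sketched in base R on a toy matrix (all values and names below are made up, and this is only a possible cause, not a confirmed diagnosis of the data above): `dist()` skips missing values pairwise, but if two rows have no samples observed in common, their distance is itself NA, and `hclust()` then refuses to run. The same check applies to columns via `dist(t(z))`. The `min_obs` threshold is an arbitrary choice for illustration.

```r
# Toy z-score matrix: 4 genes (rows) x 5 samples (columns), hypothetical values.
set.seed(1)
z <- matrix(rnorm(20), nrow = 4,
            dimnames = list(paste0("gene", 1:4), paste0("s", 1:5)))
z["gene1", 1:3] <- NA   # gene1 observed only in s4 and s5
z["gene2", 4:5] <- NA   # gene2 observed only in s1 to s3

# gene1 and gene2 share no observed sample, so their distance is NA.
d <- as.matrix(dist(z))
bad_pairs <- which(is.na(d) & upper.tri(d), arr.ind = TRUE)
bad_genes <- unique(c(rownames(d)[bad_pairs[, "row"]],
                      colnames(d)[bad_pairs[, "col"]]))
print(bad_genes)

# hclust() fails on a distance object that contains NA; capture the message.
res <- tryCatch(hclust(dist(z)), error = function(e) conditionMessage(e))
print(res)

# One possible remedy: keep only genes observed in at least min_obs samples.
min_obs <- 3
n_obs <- rowSums(!is.na(z))
z_ok <- z[n_obs >= min_obs, , drop = FALSE]
hc <- hclust(dist(z_ok))   # runs once no pairwise distance is NA
```

Filtering on the number of observed values is only one option; imputing the matrix first (for example with missMDA, as mentioned in the question) would also give the clustering step a complete matrix to work on. It is also worth checking for rows that became entirely NA or NaN during z-scoring, which happens when a gene has fewer than two observed values or zero variance.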
