Heatmap.2 R for differentially expressed genes DeSEQ2
1
3
Entering edit mode
8.8 years ago
User6891 ▴ 330

Hi,

I'm trying to make a heatmap for a set of differentially expressed genes. I'm using deseq2 in R together with the heatmap.2 function

This is the command I've used

dds <- DESeq(dds)
sig <- resOrdered[!is.na(resOrdered$padj) & resOrdered$padj<0.10  abs(resOrdered$log2FoldChange)>=1,]

selected <- rownames(sig)
hmcol <- colorRampPalette(brewer.pal(9, "GnBu"))(100)
heatmap.2(log2(counts(dds,normalized=TRUE)[rownames(dds) %in% selected,]), col = hmcol, scale="row", rowv = TRUE, colv = FALSE, dendrogram="row", trace="none", margin=c(4,6), cexRow=0.5, cexCol=1, keysize=1)

when running this I get the error

Error in hclustfun(distr) : NA/NaN/Inf in foreign function call (arg 11)

Any ideas how to solve this?

R • 25k views
ADD COMMENT
0
Entering edit mode

@User6891 do you want to receive help ? if yes, you should give an example input so people can reproduce the error you have and can then help you with it.

ADD REPLY
0
Entering edit mode

I'm aware of this. The problem is that 'dds' is a 'DeSeqDataSet', so it's not that easy to just reproduce this here. I suppose that some columns contains N/A values which heatmap.2 doesn't like? But it's quite normal that an expression dataset contains N/A values, so I was wondering if somebody had encountered the same problem

ADD REPLY
0
Entering edit mode

I checked & the problem is that there are some '-Inf' values in the dds matrix. Any idea how to deal with those?

ADD REPLY
0
Entering edit mode

@User6891 There are few options, for example you can replace zero instead NA, so you won't face any problem. Do this

YourDataName[is.na(YourdataName)] <- 0

and check if it works , also a similar problem was asked on stack overflow

ADD REPLY
0
Entering edit mode

Inf comes because you have zeroes into your data. Log of 0 is infinite

A general function to avoid this is yomake the zeroes = 1

Log2(counts(dds... contains some zeroes...

ADD REPLY
0
Entering edit mode

Hi @mbio.kyle,

when I try to follow you first lot of code:

> matrix<-matrix-rowMeans(matrix)

I get:

Error in rowMeans(matrix) : 
  'x' must be an array of at least two dimensions.

Please help??

ADD REPLY
0
Entering edit mode

On the previous line you're missing a comma to index rld by topVarGenes in two dimension way:

# use the log transform on the data set
matrix <- assay(rld)[ topVarGenes, ]
matrix <- matrix - rowMeans(matrix)
ADD REPLY
12
Entering edit mode
8.8 years ago
mbio.kyle ▴ 380

You do not want to make heatmaps from the DeSeqDataSet object. Here is an example workflow. Note it uses the pheatmap R library

# make the DeSeqDataSet
dds <- DESeq(dds)

# use the log transform on the data set
rld <- rlog(dds, blind=F)
topVarianceGenes <- head(order(rowVars(assay(rld)), decreasing=T),100)
matrix <- assay(rld)[ topVarGenes ]
matrix <- matrix - rowMeans(matrix)

# select the 'contrast' you want
annotation_data <- as.data.frame(colData(rld)[c("ConditionA","ConditionB")])
pheatmap(matrix, annotation_col=annotation_data)

The above will plot the most varied genes across all conditions. You can substitute the 100 for whichever number you like to control how many you get. If you have alot of different conditions/contrasts and want to make some more specific heatmaps, here is a function which will plot the top 20 genes for the up and down direction (i.e. largest log2FoldChange between the two conditions up and down)

plotUpDownSigGenes <- function(results, colNums, rld, title) {

    # make the lists
    upgenes <- rownames(head(results[ order( results$log2FoldChange ), ], n=20))
    downgenes <- rownames(head(results[ order( -results$log2FoldChange ), ], n=20))

    # this gives us the rows we want
    rows <- match(upgenes, row.names(rld))
    mat <- assay(rld)[rows,colNums]
    mat <- mat - rowMeans(mat)

    # the labels are hard coded at the moment :(
    df <- as.data.frame(colData(rld)[c("labelA","labelB")])
    pheatmap(mat, fontsize=5, annotation_col=df, main=paste(title,"top 20 up genes"))

    # this gives us the rows we want
    rows <- match(downgenes, row.names(rld))
    mat <- assay(rld)[rows,colNums]
    mat <- mat - rowMeans(mat)

    df <- as.data.frame(colData(rld)[c("labelA","labelB")])
    pheatmap(mat, fontsize=5, annotation_col=df, main=paste(title,"top 20 down genes"))
}

Here is an example call for the function:

contrastDEGenes <- subset(results(dds, contrast=c("A","B")), padj < 0.05)

# this part is kind of funky
# the function needs to know which columns
# correspond to the samples (to pull from rld)
aCols <- c(1,2,3)
bCols <- c(4,5,6)

# get the log transforms again
rld <- rlog(dds, blind=F)

# call
plotUpDownSigGenes(
    contrastDEGenes,
    c(aCols, bCols),
    rld,
    "Title for the plot"
)

Hope it helps!

ADD COMMENT
0
Entering edit mode

Thanks for the code.

I am trying to run your function and I am getting the following error:

Error in checkContrast(contrast, resNames) : 
  'contrast', as a character vector of length 3, should have the form:
contrast = c('factorName','numeratorLevel','denominatorLevel')

I am not sure how to proceed. Could you please help me?

ADD REPLY
0
Entering edit mode

The code has a error, there should be 3 values for constrat in the results funcition.

You have to replace 'factorName' with a column name, then replace 'numeratorLevel' and 'denominatorLevel' with the two values of the column you want to compare. Using the previous example:

contrastDEGenes <- subset(results(dds, contrast=c("Condition", "A", "B")), padj < 0.05)
ADD REPLY

Login before adding your answer.

Traffic: 2070 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6