You do not want to make heatmaps from the DeSeqDataSet object. Here is an example workflow. Note it uses the pheatmap
R library
# make the DeSeqDataSet
dds <- DESeq(dds)
# use the log transform on the data set
rld <- rlog(dds, blind=F)
topVarianceGenes <- head(order(rowVars(assay(rld)), decreasing=T),100)
matrix <- assay(rld)[ topVarGenes ]
matrix <- matrix - rowMeans(matrix)
# select the 'contrast' you want
annotation_data <- as.data.frame(colData(rld)[c("ConditionA","ConditionB")])
pheatmap(matrix, annotation_col=annotation_data)
The above will plot the most varied genes across all conditions. You can substitute the 100 for whichever number you like to control how many you get. If you have alot of different conditions/contrasts and want to make some more specific heatmaps, here is a function which will plot the top 20 genes for the up and down direction (i.e. largest log2FoldChange between the two conditions up and down)
plotUpDownSigGenes <- function(results, colNums, rld, title) {
# make the lists
upgenes <- rownames(head(results[ order( results$log2FoldChange ), ], n=20))
downgenes <- rownames(head(results[ order( -results$log2FoldChange ), ], n=20))
# this gives us the rows we want
rows <- match(upgenes, row.names(rld))
mat <- assay(rld)[rows,colNums]
mat <- mat - rowMeans(mat)
# the labels are hard coded at the moment :(
df <- as.data.frame(colData(rld)[c("labelA","labelB")])
pheatmap(mat, fontsize=5, annotation_col=df, main=paste(title,"top 20 up genes"))
# this gives us the rows we want
rows <- match(downgenes, row.names(rld))
mat <- assay(rld)[rows,colNums]
mat <- mat - rowMeans(mat)
df <- as.data.frame(colData(rld)[c("labelA","labelB")])
pheatmap(mat, fontsize=5, annotation_col=df, main=paste(title,"top 20 down genes"))
}
Here is an example call for the function:
contrastDEGenes <- subset(results(dds, contrast=c("A","B")), padj < 0.05)
# this part is kind of funky
# the function needs to know which columns
# correspond to the samples (to pull from rld)
aCols <- c(1,2,3)
bCols <- c(4,5,6)
# get the log transforms again
rld <- rlog(dds, blind=F)
# call
plotUpDownSigGenes(
contrastDEGenes,
c(aCols, bCols),
rld,
"Title for the plot"
)
Hope it helps!
@User6891 do you want to receive help ? if yes, you should give an example input so people can reproduce the error you have and can then help you with it.
I'm aware of this. The problem is that 'dds' is a 'DeSeqDataSet', so it's not that easy to just reproduce this here. I suppose that some columns contains N/A values which heatmap.2 doesn't like? But it's quite normal that an expression dataset contains N/A values, so I was wondering if somebody had encountered the same problem
I checked & the problem is that there are some '-Inf' values in the dds matrix. Any idea how to deal with those?
@User6891 There are few options, for example you can replace zero instead NA, so you won't face any problem. Do this
and check if it works , also a similar problem was asked on stack overflow
Inf comes because you have zeroes into your data. Log of 0 is infinite
A general function to avoid this is yomake the zeroes = 1
Log2(counts(dds... contains some zeroes...
Hi @mbio.kyle,
when I try to follow you first lot of code:
I get:
Please help??
On the previous line you're missing a comma to index rld by topVarGenes in two dimension way: