You do not want to make heatmaps from the DeSeqDataSet object. Here is an example workflow. Note it uses the pheatmap
R library
dds <- DESeq(dds)
rld <- rlog(dds, blind=F)
topVarianceGenes <- head(order(rowVars(assay(rld)), decreasing=T),100)
matrix <- assay(rld)[ topVarGenes ]
matrix <- matrix - rowMeans(matrix)
annotation_data <- as.data.frame(colData(rld)[c("ConditionA","ConditionB")])
pheatmap(matrix, annotation_col=annotation_data)
The above will plot the most varied genes across all conditions. You can substitute the 100 for whichever number you like to control how many you get. If you have alot of different conditions/contrasts and want to make some more specific heatmaps, here is a function which will plot the top 20 genes for the up and down direction (i.e. largest log2FoldChange between the two conditions up and down)
plotUpDownSigGenes <- function(results, colNums, rld, title) {
upgenes <- rownames(head(results[ order( results$log2FoldChange ), ], n=20))
downgenes <- rownames(head(results[ order( -results$log2FoldChange ), ], n=20))
rows <- match(upgenes, row.names(rld))
mat <- assay(rld)[rows,colNums]
mat <- mat - rowMeans(mat)
df <- as.data.frame(colData(rld)[c("labelA","labelB")])
pheatmap(mat, fontsize=5, annotation_col=df, main=paste(title,"top 20 up genes"))
rows <- match(downgenes, row.names(rld))
mat <- assay(rld)[rows,colNums]
mat <- mat - rowMeans(mat)
df <- as.data.frame(colData(rld)[c("labelA","labelB")])
pheatmap(mat, fontsize=5, annotation_col=df, main=paste(title,"top 20 down genes"))
}
Here is an example call for the function:
contrastDEGenes <- subset(results(dds, contrast=c("A","B")), padj < 0.05)
aCols <- c(1,2,3)
bCols <- c(4,5,6)
rld <- rlog(dds, blind=F)
plotUpDownSigGenes(
contrastDEGenes,
c(aCols, bCols),
rld,
"Title for the plot"
)
Hope it helps!
@User6891 do you want to receive help ? if yes, you should give an example input so people can reproduce the error you have and can then help you with it.
I'm aware of this. The problem is that 'dds' is a 'DeSeqDataSet', so it's not that easy to just reproduce this here. I suppose that some columns contains N/A values which heatmap.2 doesn't like? But it's quite normal that an expression dataset contains N/A values, so I was wondering if somebody had encountered the same problem
I checked & the problem is that there are some '-Inf' values in the dds matrix. Any idea how to deal with those?
@User6891 There are few options, for example you can replace zero instead NA, so you won't face any problem. Do this
and check if it works , also a similar problem was asked on stack overflow
Inf comes because you have zeroes into your data. Log of 0 is infinite
A general function to avoid this is yomake the zeroes = 1
Log2(counts(dds... contains some zeroes...
Hi @mbio.kyle,
when I try to follow you first lot of code:
I get:
Please help??
On the previous line you're missing a comma to index rld by topVarGenes in two dimension way: