Question

heatmap in R

0

Entering edit mode

4.8 years ago

Rob ▴ 180

hi friends I did differential expression analysis for HT-Seq data and I want to plot heatmap for significant genes. But my heatmap is not good. I compared two groups and my workflow is as below:

rdata <- read.table("myData.txt", header = TRUE, row.names = 1, stringsAsFactors = FALSE)

library(pheatmap)
library(DESeq2)
library(ggplot2)

## Differential abundance
alpha <- 0.05 #set the cutoff value
##fold change cut-off
beta <- 2 
## Create metadata 
sample_org <- data.frame(row.names = colnames(rdata), c(rep("0", 22), rep("1", 22)))
colnames(sample_org) <- c("Group")

dds <- DESeqDataSetFromMatrix(countData = rdata,
                              colData = sample_org,
                              design = ~Group)

dd <- DESeq(dds)


res <- results(dd)
#Saving results
write.csv(res,"res.csv")
## remove NAs
res2 <- res[!is.na(res$padj),]
###########################################################

## subset only significant genes
sig <- res2[res2$padj < alpha,]
sig2 <- sig[sig$log2FoldChange > beta,]
sig_genes <- rownames(sig2)
subset <- rdata[sig_genes,]
## log transform data for visualization
tdata <- log2(subset + 0.5)
mat <- as(tdata, "matrix")

## Set colours
my_colour = list(Group = c("0" = "blue", "1" = "yellow"))

#################
pheatmap(symbreaks = FALSE, cluster_cols = FALSE,
         cluster_rows = TRUE,color = colorRampPalette(c("#f71616","#f71616","white",
                                                        "#1919d4", "#1919d4"))(100),annotation_col = sample_org, annotation_colors = my_colour, mat, scale = "row")

I want a heatmap like this

enter image description here

that clearly show difference between two groups, over represented genes together and under represented genes together:

My heatmap is here : enter image description here

with no clear pattern

Any body can help me with this plz?

rna-seq DESeq2 R • 1.9k views

ADD COMMENT • link updated 4.8 years ago by luiz.rocha ▴ 30 • written 4.8 years ago by Rob ▴ 180

score 4 · Accepted Answer · 2020-09-25

4

Entering edit mode

4.8 years ago

Jean-Karim Heriche 27k

The images you want to show are not linked. Also you seem to be plotting the log of the raw data. Try using the normalized counts instead or the log fold change since this is usually what we're interested in. Check the DESeq2 vignette section on heatmaps.

ADD COMMENT • link 4.8 years ago by Jean-Karim Heriche 27k

0

Entering edit mode

Hello Jean-Karim Can you explain how should I have normalized count? I usually use: log2(data + 0.5) to normalize my data. I dont know how to do normalized count?

Also, what do you mean by "use log fold change"? do you mean instead of FDR cut off for selection significant genes I use fold change cut off?

ADD REPLY • link 4.8 years ago by Rob ▴ 180

0

Entering edit mode

DESeq2 does normalization by dividing the counts by a sample-specific factor (details can be found in the DESeq paper). You can access normalized counts with the counts() function using the argument normalized=TRUE. Check the package documentation and the vignette. My suggestion was to also/alternatively visualize the log fold changes as heatmap.

ADD REPLY • link 4.8 years ago by Jean-Karim Heriche 27k

0

Entering edit mode

Thanks Karim Is there any example about visualizing log fold change as heatmap? because I still dont get this part. at the moment I have significant genes as rows and patient IDs as columns. I dont know how to replace these with log fold change?

ADD REPLY • link 4.8 years ago by Rob ▴ 180

score 3 · Accepted Answer · 2020-09-25

3

Entering edit mode

4.8 years ago

luiz.rocha ▴ 30

You can try to use scale() instead of log or change your clustering strategy for rows.

ADD COMMENT • link 4.8 years ago by luiz.rocha ▴ 30