4.3 years ago
Rob
Hello, does anyone know which package or method the GSEA software uses to plot its heatmaps?
You have to give expression values (normalized read values obtained from DESeq2, edgeR, etc.) as input to GSEA to plot the heatmap. Based on the expression values of each gene across all samples, it assigns shades of color from red to blue for high to low expression, respectively.
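As a rough illustration of handing expression values to GSEA, the sketch below writes a small matrix to the tab-delimited GCT layout (a `#1.2` line, a dimensions line, then a `NAME`/`Description` header followed by one row per gene). The function name `write_gct`, the sample names, and the values are made up for the example, and GSEA also needs a phenotype (CLS) file that is not shown here.

```r
# Hypothetical helper: write a genes x samples matrix in GCT layout.
write_gct <- function(mat, path) {
  stopifnot(is.matrix(mat), !is.null(rownames(mat)), !is.null(colnames(mat)))
  con <- file(path, open = "w")
  on.exit(close(con))
  writeLines("#1.2", con)                                   # format version line
  writeLines(paste(nrow(mat), ncol(mat), sep = "\t"), con)  # genes, samples
  writeLines(paste(c("NAME", "Description", colnames(mat)), collapse = "\t"), con)
  body <- data.frame(NAME = rownames(mat), Description = "na", mat,
                     check.names = FALSE)
  write.table(body, con, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
}

# Toy normalized expression values (made up for illustration).
expr <- matrix(c(5.5, 7.1,
                 2.0, 2.3,
                 9.4, 4.2),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               c("ctrl_1", "treat_1")))

gct_path <- tempfile(fileext = ".gct")
write_gct(expr, gct_path)
```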
Hello, thank you. Should I give only the differentially expressed genes to GSEA, or all genes? Also, I saw that the GSEA heatmap shows only the top 100 genes. How can I do this for all of my genes, or for all differentially expressed genes? What do you mean by "normalized read counts from edgeR or DESeq2"? Is that RSEM data or log-transformed data? Is it possible to use this software for the heatmap and change the heatmap settings?
Why use all genes for a heatmap? It would no longer be informative. The usual practice is to plot a heatmap using one of the following:
1) The 50-100 most significant differentially expressed genes, based on p-value or q-value/adjusted p-value
2) The most up-regulated (25-50) and most down-regulated (25-50) genes, based on log fold change
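The two selection strategies above can be sketched in base R roughly as follows. The results table here is simulated; the column names `padj` and `log2FoldChange` follow DESeq2's `results()` output, while the helper names `top_by_padj` and `top_up_down` and the extra `alpha` filter in the second helper are assumptions made for the example.

```r
# Toy stand-in for a differential expression results table (simulated values).
set.seed(1)
res <- data.frame(
  log2FoldChange = rnorm(500),
  padj = runif(500),
  row.names = paste0("gene", 1:500)
)
res$padj[1:10] <- NA  # adjusted p-values can be NA for some genes

# Option 1: the n most significant genes by adjusted p-value.
top_by_padj <- function(res, n = 50) {
  res <- res[!is.na(res$padj), ]
  rownames(res)[order(res$padj)][seq_len(min(n, nrow(res)))]
}

# Option 2: the n most up-regulated and n most down-regulated genes by
# log2 fold change, among genes passing a padj cutoff (an assumption here).
top_up_down <- function(res, n = 25, alpha = 0.05) {
  keep <- res[!is.na(res$padj) & res$padj < alpha, ]
  ord <- order(keep$log2FoldChange, decreasing = TRUE)
  up <- rownames(keep)[head(ord, n)]
  down <- rownames(keep)[tail(ord, n)]
  unique(c(up, down))
}

genes_sig <- top_by_padj(res, n = 50)
genes_fc <- top_up_down(res, n = 25, alpha = 0.5)  # loose cutoff for toy data
```

Either gene list can then be used to subset the normalized expression matrix before plotting.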
If you are familiar with R, you can use the pheatmap package, where you can customize the parameters to suit your needs; if not, you can try the online web servers Heatmapper or ClustVis for plotting heatmaps.
To ensure that samples are comparable, the mapped read counts obtained for each gene/feature need to be normalized as part of differential expression analysis (DESeq2 and edgeR take raw counts and normalize them internally). So, if you have a list of differentially expressed genes, you should also have their normalized expression values, in the form of baseMean, FPKM/RPKM, TPM, etc.
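To make the idea of normalization concrete, here is a minimal sketch of counts-per-million (CPM) scaling on made-up counts. This is only the simplest library-size correction, not the median-of-ratios method used by DESeq2 or the TMM method used by edgeR; the function name `simple_cpm` is hypothetical.

```r
# Simulated raw counts: 4 genes x 3 samples that differ only in sequencing depth.
counts <- matrix(
  c(10, 20, 40,
    100, 200, 400,
    0, 0, 0,
    890, 1780, 3560),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Counts per million: divide each column by its library size, then scale by 1e6.
simple_cpm <- function(counts) {
  lib_size <- colSums(counts)
  sweep(counts, 2, lib_size, "/") * 1e6
}

norm <- simple_cpm(counts)
log_norm <- log2(norm + 1)  # log scale is usually easier to read in a heatmap
```

After scaling, the three samples have identical profiles, which is the point: the depth difference is removed so that only biological differences remain.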
Can you tell us more about the method you used to get the differentially expressed genes?
Hello, Thanks
I used the edgeR and DESeq2 packages in R for differential expression analysis. I think the result from DESeq2 is more reliable. What do you think?
It would not be fair to say that DESeq2 is more reliable; it depends on the type of samples and the number of replicates you are working with. This paper can give you more insight.
Thank you so much. I have RSEM data, raw count data, and HT-Seq data, and I do not know which one I should use. My supervisor asked me to use HT-Seq, but with both edgeR and DESeq2 I did not get any significant FDR values for 20,000 genes. (I rounded the HT-Seq values before importing them into R to be compatible with DESeq2 and edgeR.) With the raw data and RSEM, I get some significantly differentially expressed genes but not very good heatmaps. He also told me not to use TPM or RPKM data; I don't know why. I have two groups of 22 patients each, 44 samples overall. What do you think about the datasets? Why do my heatmaps not show a clear pattern even though I have differentially expressed genes?
This is the code I used for differential expression and the heatmap. I tried with and without z-scores and log transformation, but got no good pattern.
# Reading in raw data
rdata <- read.table("mydata.txt", header = TRUE, row.names = 1)

library(pheatmap)
library(DESeq2)

# Differential abundance
alpha <- 0.01 # set the cutoff value

# Create metadata
sample_org <- data.frame(row.names = colnames(rdata), c(rep("0", 22), rep("1", 22)))
colnames(sample_org) <- c("Group")

dds <- DESeqDataSetFromMatrix(countData = rdata, colData = sample_org, design = ~Group)

dd <- DESeq(dds)
res <- results(dd)

# Subset only significant genes
sig <- res[res$padj < alpha, ]
sig_genes <- rownames(sig)
subset <- rdata[sig_genes, ]

# Log transform data for visualization
tdata <- log2(subset + 0.5)
mat <- as(rdata, "matrix")

# Row Z-score
m_tr_z_score <- t(scale(t(mat)))

# Set colours
my_colour = list(Group = c("0" = "blue", "1" = "yellow"))

# Plot
pheatmap(symbreaks = FALSE, cluster_cols = FALSE, cluster_rows = TRUE,
         color = colorRampPalette(c("#f71616", "#f71616", "white", "#1919d4", "#1919d4"))(100),
         annotation_col = sample_org, annotation_colors = my_colour, mat, scale = "row")
In this line of code I get a warning message about converting data to a factor, but I don't think it affects the results: dds <- DESeqDataSetFromMatrix(countData = rdata, colData = sample_org, design = ~Group)
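A few things in the posted code may explain the unclear heatmap. `mat` is built from `rdata` rather than from the log-transformed subset, so the plot covers all genes on the raw scale. `res$padj < alpha` yields `NA` rows wherever `padj` is `NA`. The row z-score is computed but never plotted. The sketch below shows one possible way to arrange those steps, under loud assumptions: the counts are simulated, `res` is a plain data frame standing in for the output of `results(dd)`, and the palette runs from blue (low) to red (high). Declaring `Group` as a factor up front should also avoid the conversion message. DESeq2's `vst()` is a common alternative to `log2(count + 0.5)` for visualization.

```r
set.seed(42)
n_genes <- 200
groups <- factor(c(rep("0", 22), rep("1", 22)))  # factor, not character

# Simulated counts standing in for rdata; the first 20 genes are higher in group "1".
rdata <- matrix(rpois(n_genes * 44, lambda = 50), nrow = n_genes,
                dimnames = list(paste0("gene", 1:n_genes), paste0("s", 1:44)))
rdata[1:20, groups == "1"] <- rdata[1:20, groups == "1"] * 4
rdata <- as.data.frame(rdata)
sample_org <- data.frame(Group = groups, row.names = colnames(rdata))

# Stand-in for results(dd): only the padj column matters for this sketch.
res <- data.frame(padj = c(rep(0.001, 20), runif(n_genes - 21, 0.05, 1), NA),
                  row.names = rownames(rdata))
alpha <- 0.01

# which() drops NA adjusted p-values instead of producing NA rows.
sig_genes <- rownames(res)[which(res$padj < alpha)]

# Log-transform the significant subset and turn that (not rdata) into a matrix.
mat <- as.matrix(log2(rdata[sig_genes, ] + 0.5))

# Row z-score, applied once here, so pheatmap is told not to scale again.
z <- t(scale(t(mat)))

if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(z, scale = "none", cluster_cols = FALSE, cluster_rows = TRUE,
                     color = colorRampPalette(c("#1919d4", "white", "#f71616"))(100),
                     annotation_col = sample_org,
                     annotation_colors = list(Group = c("0" = "blue", "1" = "yellow")))
}
```

With real data, `rdata`, `sample_org`, and `res` would come from the count file and from DESeq2 instead of being simulated.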