Heatmap deseq2
3.0 years ago
bart ▴ 50

I'm using DESeq2 for differential expression analysis (DEA), but when I create a heatmap with only the DEGs, it looks very strange: I can't tell whether there are only overexpressed genes or whether the dataset is not normalized properly. I probably made a mistake somewhere in my code, but I don't know where to look. Help would be appreciated!

library(DESeq2)
library(dplyr)
library(pheatmap)

# Create the DESeq2 object, estimate size factors, and pre-filter:
# drop genes with <5 counts in >90% of samples
dds <- DESeqDataSetFromMatrix(df, colData = group, design = ~ group)
dds <- estimateSizeFactors(dds)
badgenes <- names(which(apply(counts(dds), 1, function(x) sum(x < 5)) > 0.9 * ncol(dds)))
ddsFiltered <- dds[!rownames(dds) %in% badgenes, ]

# Run DESeq2; turn off the Cook's cutoff and independent filtering so padj is not set to NA
ddsFiltered <- DESeq(ddsFiltered)
res <- results(ddsFiltered, cooksCutoff = FALSE, independentFiltering = FALSE)

# Attach padj to the raw counts and subset all DEGs
filtered <- as.data.frame(counts(ddsFiltered))
filtered <- filtered %>% mutate(padj = res$padj)
all_diff_genes <- subset(filtered, padj < 0.05)

# Create a heatmap with only the DEGs (which() drops any remaining NA padj values)
rld <- vst(ddsFiltered, blind = FALSE)
de <- rownames(res)[which(res$padj < 0.05)]
de_mat <- assay(rld)[de, ]
pheatmap(de_mat, show_rownames = FALSE, show_colnames = FALSE, annotation_col = group)

[Image: heatmap]

3.0 years ago
ATpoint 86k

This is pretty much the same as explained here: Scaling RNA-Seq data before clustering?

You did not scale the heatmap, so the rows cluster by absolute expression level rather than by relative differences between samples.
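
To illustrate what row scaling does, here is a minimal base-R sketch. The matrix `mat` is simulated and purely hypothetical; it only stands in for a variance-stabilized genes-by-samples matrix such as `assay(rld)`. Each gene is converted to a z-score across samples, so genes with very different baseline levels become comparable.

```r
set.seed(42)

# Hypothetical stand-in for a variance-stabilized matrix (genes in rows, samples in columns)
mat <- matrix(rnorm(6 * 4), nrow = 6,
              dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4)))

# Give every gene a different baseline expression level
# (the length-6 vector is recycled down each column, so it is added per row)
mat <- mat + c(2, 4, 6, 8, 10, 12)

# Row-wise z-score: scale() works on columns, so transpose, scale, transpose back
z <- t(scale(t(mat)))

# After scaling, every gene has mean 0 and standard deviation 1 across samples
round(rowMeans(z), 10)
round(apply(z, 1, sd), 10)
```

The scaled matrix `z` can then be passed to `pheatmap()`; as far as I know, `pheatmap(mat, scale = "row")` applies the same kind of row z-scoring internally.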


Thanks for the response! I scaled the heatmap like so:

rld <- vst(ddsFiltered, blind = FALSE)
de <- rownames(res[res$padj < 0.05, ])
de_mat <- assay(rld)[de, ]
pheatmap(t(scale(t(de_mat))), show_rownames = FALSE, show_colnames = FALSE, annotation_col = group)

However, the clustering is still very poor in both the heatmap and PCA, and I don't really know why. There could be confounding factors such as age and gender, but when I add these to the design formula:

DESeqDataSetFromMatrix(df[, -175], colData = newgroup, design = ~ group + sex + age)

and use them as annotations in the heatmap:

pheatmap(t(scale(t(de_mat))), show_rownames = FALSE, show_colnames = FALSE, annotation_col = newgroup)

the number of DEGs drops drastically from 2000 to 32 and the clustering does not improve. NB: group means cancer (number 1 in the heatmap) or no cancer (number 2 in the heatmap). I am out of ideas about what could cause the clustering problems. Do you have any idea what the problem might be?
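
A note on the covariates: adding `sex` and `age` to the design changes the test, but not the transformed matrix that goes into the heatmap, so the plot itself would not be expected to change for that reason alone. One option that is sometimes used, purely for visualization, is to regress the covariates out of the transformed values while keeping the group term in the model. Below is a minimal base-R sketch on simulated data; `meta`, `mat`, and `mat_adj` are hypothetical names, not objects from this thread, and this is not a substitute for the DESeq2 model itself. The limma function `removeBatchEffect()` is a packaged alternative for the same purpose.

```r
set.seed(1)
n <- 12

# Hypothetical sample annotation (stand-in for the colData)
meta <- data.frame(
  group = factor(rep(c("cancer", "control"), each = n / 2)),
  sex   = factor(rep(c("F", "M"), times = n / 2)),
  age   = round(runif(n, 30, 80))
)

# Hypothetical transformed expression matrix (genes in rows, samples in columns)
mat <- matrix(rnorm(50 * n), nrow = 50,
              dimnames = list(paste0("gene", 1:50), paste0("s", 1:n)))
# Simulate a sex effect that shifts every gene in male samples
mat <- mat + outer(rep(2, 50), as.numeric(meta$sex == "M"))

# Design columns: group terms to protect, and centered covariates to remove
X_grp <- model.matrix(~ group, data = meta)
X_cov <- model.matrix(~ sex + age, data = meta)[, -1, drop = FALSE]
X_cov <- scale(X_cov, center = TRUE, scale = FALSE)

# Fit all genes at once (samples in rows) and subtract only the covariate part
fit      <- lm.fit(cbind(X_grp, X_cov), t(mat))
beta_cov <- fit$coefficients[colnames(X_cov), , drop = FALSE]
mat_adj  <- mat - t(X_cov %*% beta_cov)
```

`mat_adj` has the same shape as `mat` and could be row-scaled and plotted in its place. The adjusted values are meant for plots only; the differential expression test should still be run on the raw counts with the covariates in the design.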

[Two images]
