Can't find a pattern in heatmap, is it okay?
0
0
Entering edit mode
6.8 years ago
Tania ▴ 180

Hi all

Does this heatmap look okay to you? Why I can't see any pattern ? https://ibb.co/m8qaQx Here is a snapshot of my code:

cts <- txi$counts
group = factor(c(rep("Control", 10), rep("Tumor",10)) )
dge = DGEList(counts=cts, genes= rownames(cts), group=group)
countsPerMillion <- cpm(dge)  
summary(countsPerMillion)
countCheck <- countsPerMillion > 1
summary(countCheck)
keep <- which(rowSums(countCheck) >= 2)
dge <- dge[keep,]
summary(cpm(dge))
dge <- calcNormFactors(dge, method="TMM")
dge <- estimateCommonDisp(dge)
dge <- estimateTagwiseDisp(dge)
dge <- estimateTrendedDisp(dge)
et <- exactTest(dge, pair=c("Control", "Tumor"))
etp <- topTags(et, n= 100000, adjust.method="BH", sort.by="PValue")
#################################################################
###PLOTTING STARTS HERE
#################################################################

logCPM = countsPerMillion
o = rownames(etp$table[abs(etp$table$logFC)>1 & etp$table$PValue<0.05, ])
logCPM <- logCPM[o[1:100],]
colnames(logCPM) = labels
logCPM <- t(scale(t(logCPM)))
write.csv(logCPM, "ControlTumorCPM.csv")
require("RColorBrewer")
require("gplots")
myCol <- colorRampPalette(c("white", "darkgreen", "red"))(100)
myBreaks <- seq(-3, 3, length.out=101)
heatmap.2(logCPM, col=myCol, breaks=myBreaks, Rowv=TRUE,Colv=TRUE, main="Controls vs Tumors Heatmap", key=T, keysize=0.7,scale="none",trace="none", dendrogram="both", cexRow=0.2, cexCol=0.9, density.info="none",margin=c(10,9), lhei=c(2,10), lwid=c(2,6),reorderfun=function(d,w) reorder(d, w, agglo.FUN=mean),  distfun=function(x) as.dist(1-cor(t(x))), hclustfun=function(x) hclust(x, method="ward.D2"))
dev.off()

Thanks

RNA-Seq heatmap • 2.9k views
ADD COMMENT
2
Entering edit mode

Your changes are imbalanced (there's almost no downregulated genes). Could you confirm that the data have been log-transformed and/or row-scaled. I'd strongly recommend you pick a colourscheme that will highlight differences between samples: I tend to avoid green-red anyway, but it seems a bit perverse to use green for unchanged, red for upregd and white for downregd - I'd have suggested using white as the midpoint

ADD REPLY
0
Entering edit mode

Thanks so much. I think I did the logscale but not sure about the rowscale?, here is a snapshot of the code, we have down regulation, but most of the top significant genes are up regulated not down regulated, I am plotting the top 100?

logCPM = countsPerMillion
o = rownames(etp$table[abs(etp$table$logFC)>1 & etp$table$PValue<0.05, ])
logCPM <- logCPM[o[1:100],]
colnames(logCPM) = labels
logCPM <- t(scale(t(logCPM)))
write.csv(logCPM, "ControlTumorCPM.csv")
require("RColorBrewer")
require("gplots")
myCol <- colorRampPalette(c("white", "darkgreen", "red"))(100)
myBreaks <- seq(-3, 3, length.out=101)
heatmap.2(logCPM, col=myCol, breaks=myBreaks, Rowv=TRUE,Colv=TRUE, main="Controls vs Tumors Heatmap", key=T, keysize=0.7,scale="none",trace="none", dendrogram="both", cexRow=0.2, cexCol=0.9, density.info="none",margin=c(10,9), lhei=c(2,10), lwid=c(2,6),reorderfun=function(d,w) reorder(d, w, agglo.FUN=mean),  distfun=function(x) as.dist(1-cor(t(x))), hclustfun=function(x) hclust(x, method="ward.D2"))
dev.off()
ADD REPLY
1
Entering edit mode

Could you put your code into your original question please, so it's a bit more readable

ADD REPLY
0
Entering edit mode

I changed it, thanks

ADD REPLY
0
Entering edit mode

rerun it with the line logCPM <- log2(countsPerMillion)

ADD REPLY
0
Entering edit mode

The above line throws the following error,

Error in hclust(x, method = "ward.D2") : NA/NaN/Inf in foreign function call (arg 11)

So I changed this line instead:

countsPerMillion <- cpm(dge, prior.count=2, log=TRUE)

Then it becomes like this: https://ibb.co/nCNsoH

Do you think it is fixed now? Thanks

ADD REPLY
1
Entering edit mode

Yes it looks ok.

I'd rewrite your script now that it works, so that your variable names reflect what they actually hold; so use logCPM instead of countsPerMillion if you're storing the logged values etc

Please note that you are filtering on p-values rather than FDR, so your results aren't very stringent

ADD REPLY
0
Entering edit mode

Thanks russhh so much :)

ADD REPLY
1
Entering edit mode

Are you expecting to see one? Why?

Heatmaps are just a visualisation tool, not really an analysis tool.

Besides, there's a marked difference between your tumor and WT control as far as I can tell. YOu should probably do an ontology exploration to figure out the story behind your data.

ADD REPLY
0
Entering edit mode

Thanks so much. In the data most of the significant genes, are upregulated not down, I plot only the top 100 in the heatmap.

ADD REPLY

Login before adding your answer.

Traffic: 1904 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6