Help fixing my DGE heatmap?
1
0
Entering edit mode
7.0 years ago
Tania ▴ 180

Hi Everyone

I need your help fixing my heat map. My top 10 differential expressed genes are not the ones that show up in the heatmap. The ones that show in the heat map are the first 10 ordered in the samples counts, although they are not significant at all.
Below is a snapshot of my code and I added my heatmap too: https://ibb.co/gPKepm

labels=c("C1", "C2", "C3", "C4", "C5", "C6","C7", "C8", "C9", "C10", "T1","T2","T3","T4","T5","T6","T7","T8","T9","T10")
dge = DGEList(counts=cts, genes= rownames(cts), group=group)
countsPerMillion <- cpm(dge, prior.count=2, log=TRUE)
CPM = countsPerMillion
colnames(CPM) = labels
o <- order(etp$table$PValue)
CPM <- CPM[o[1:10],]
CPM <- t(scale(t(CPM)))
col.pan <- colorpanel(50, "blue", "white", "red")
heatmap.2(CPM, col=col.pan, Rowv=TRUE, scale="none",trace="none", dendrogram="none", cexRow=0.2, cexCol=0.9, density.info="none",margin=c(10,9), lhei=c(2,10), lwid=c(2,6))

Thanks

RNA-Seq Heatmap • 2.6k views
ADD COMMENT
1
Entering edit mode

Which numbers are the first ten of "o"? Did you try to associate them to gene names? Which gene names were associated to them? Are those the same that you see in the heatmap? Since I am not an expert, it is really hard to help more by only looking at the code.

ADD REPLY
0
Entering edit mode

The o contains 1 to 10 and those are the genes which show up in the heatmap. The first 10 in etp before order which are not significant. This is what I can't fix. So the order in o is wrong I guess.

ADD REPLY
0
Entering edit mode

When I changed O to:

o = order(etp$table$PValue, na.last=TRUE,decreasing= TRUE, method="radix")

I got another list of genes that are significant but not the top either.

ADD REPLY
1
Entering edit mode

Try without decreasing=TRUE (you want the smaller p being the first one, not the last one).

ADD REPLY
0
Entering edit mode

I tried it and still picks the wrong order. It doesn't even pick from top significant or least significant so I think of reversing, it picks some genes from the middle.

ADD REPLY
4
Entering edit mode
7.0 years ago

This doesn't work because the gene/transcript indices in the etp object won't match those in your CPM object. You should instead subset CPM based on gene/transcript names.

Wherever you ran the topTags function, the genes should automatically be ordered by PValues there (lowest to highest). You can then just get the rownames of the resulting object and use those to subset CPM.

For example:

topGenes <- rownames(topTags(et, n=10))

CPM[topGenes,]

Take a look at the tutorial here: RNA Sequence Analysis in R: edgeR

ADD COMMENT
0
Entering edit mode

Thanks Kevin and Fabio. I think it works now, although it is kinda sparse (not sure if this is okay). [https://ibb.co/b0kbn6]

ADD REPLY
1
Entering edit mode

It looks like better scaling of the data would help. You may want to take some ideas from my various postings on heatmaps:

You already scale your data to Z scores but perhaps specifying breaks (like -1, +1) in addition would help.

ADD REPLY
0
Entering edit mode

Great, thanks so much Kevin, very helpful :)

ADD REPLY
1
Entering edit mode

Hi Tania. You are welcome

ADD REPLY
0
Entering edit mode

Great, thanks so much Kevin, very helpful :)

ADD REPLY
0
Entering edit mode

Great, thanks so much Kevin, very helpful :)

ADD REPLY

Login before adding your answer.

Traffic: 2279 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6