Extract gene names from a particular cluster from pheatmap
6.7 years ago

Hi Biostars,

My question might seem redundant, but I haven't come across a solution for my particular case (I guess I'm missing a small detail).

With pheatmap I have generated a heatmap of about 7.5k genes. The code is:

library(pheatmap)
library(gplots)  # provides greenred()
heat <- pheatmap(LFC_human, annotation_col = human_conditions,
                 cluster_rows = TRUE, cluster_cols = TRUE, show_rownames = FALSE,
                 border_color = NA, scale = "row",
                 color = greenred(75), main = "Title")

Here is the heatmap

[Image: heatmap of the human LFC overlap genes, reordered]

My question is: how can I get the gene names of a specific cluster (highlighted in blue)?

I know I can use

hc <-heat$tree_row
lbl <- cutree(hc, 5) # split gene dendrogram in 5 groups
which(lbl==1) # grab genes of first group

but how do I know which clusters on my dendrogram these 5 groups correspond to? Since there are a lot of genes, any visual inspection is problematic. Maybe I should use a different package for heatmaps?

Thanks


I believe we do have clustering gurus in Biostars:)


I hope @Kevin_Blighe will hear me :)


Not unless you tag him: Kevin Blighe


Ah, thanks @genomax, I wasn't aware of this functionality.


I embedded the image you had linked in original post. For future reference: How to add images to a post


Thanks! For some reason only cubeupload works for me.


A very dirty way is to print the graph as a PDF with the gene IDs on the right of the heatmap. You can then highlight the gene IDs of your choice and paste them into your favorite text editor.

6.7 years ago

Did you look at this previous answer? - A: extract dendrogram cluster from pheatmap

Specifically search for the text "#Re-order original data (genes) to match ordering in heatmap (top-to-bottom)" in that thread.

By knowing the exact order of the genes in your dendrogram/heatmap, you should be able to combine that information with the output of cutree and thus create a 2-column data frame of the gene-to-cluster assignment.
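As a minimal sketch of that idea, the snippet below builds such a 2-column data frame with base R only. The matrix `mat` is made-up toy data standing in for LFC_human, and `hc` stands in for heat$tree_row; it assumes pheatmap's default settings (Euclidean distance, complete linkage, clustering done on the row-scaled values when scale = "row").

```r
# Toy stand-in for LFC_human: 20 genes x 6 samples (hypothetical data)
set.seed(1)
mat <- matrix(rnorm(120), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("s", 1:6)))

# Mimic scale = "row": z-score each gene before clustering
scaled <- t(scale(t(mat)))
hc <- hclust(dist(scaled))      # stand-in for heat$tree_row

lbl <- cutree(hc, k = 5)        # named vector: gene -> cluster id

# Genes in dendrogram order (assumed to be top-to-bottom in the heatmap),
# each paired with its cutree cluster id
genes_in_order <- hc$labels[hc$order]
gene_clusters <- data.frame(gene = genes_in_order,
                            cluster = lbl[genes_in_order],
                            row.names = NULL)
head(gene_clusters)
```

With the real object, replacing `hc` by heat$tree_row should give the same kind of table, which can then be filtered with something like gene_clusters$gene[gene_clusters$cluster == 3].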

I do prefer ComplexHeatmap, as it gives greater flexibility all round!


Hi Kevin, thanks for the reply! Yes, I have seen that post before, but as far as I understand, at some point you have to do some manual inspection to see where cutree cuts the tree and whether it corresponds to the cluster I need (and since there are a lot of genes, I thought it might be problematic). But I will try anyway. Ideally I wanted something similar to the picture below (with assigned clusters on the left).
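One way to get a cluster strip on the left with pheatmap itself is to pass the cutree labels back in through annotation_row. This is only a sketch on made-up toy data (`mat` is hypothetical, and k = 5 is an arbitrary choice); it relies on the second call clustering the same data in the same way as the first.

```r
library(pheatmap)

# Toy stand-in for LFC_human: 30 genes x 6 samples (hypothetical data)
set.seed(3)
mat <- matrix(rnorm(180), nrow = 30,
              dimnames = list(paste0("gene", 1:30), paste0("s", 1:6)))

# First pass: cluster only, without drawing
heat <- pheatmap(mat, scale = "row", silent = TRUE)
lbl <- cutree(heat$tree_row, k = 5)

# One-column annotation; row names must match the matrix row names
row_anno <- data.frame(cluster = factor(lbl), row.names = names(lbl))

# Second pass: same clustering, now with the cluster strip and gaps between groups
pheatmap(mat, scale = "row", annotation_row = row_anno,
         cutree_rows = 5, show_rownames = FALSE)
```

The colours in the strip then identify each cutree group directly, so no visual matching against the dendrogram is needed.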


I believe that cutree assigns the clusters based on the tree merge height, i.e., the height at which two genes or groups of genes merge, which is useful to know. When there are many genes, it can indeed be difficult to inspect by eye.
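To avoid inspecting the tree by eye, the cutree ids can be mapped to their position in the dendrogram. As far as I know, cutree numbers its groups by the order in which they first appear in the input rows, not by where they sit in the tree, so a small relabelling step helps. The sketch below uses toy data, with `hc` standing in for heat$tree_row, and assumes the dendrogram order corresponds to top-to-bottom in the heatmap.

```r
# Toy data: 15 genes x 6 samples (hypothetical)
set.seed(2)
mat <- matrix(rnorm(90), nrow = 15,
              dimnames = list(paste0("gene", 1:15), NULL))
hc <- hclust(dist(mat))         # stand-in for heat$tree_row
lbl <- cutree(hc, k = 4)

# Cluster ids in the order they are encountered along the dendrogram
ids_top_to_bottom <- unique(lbl[hc$order])

# Relabel so that 1 = first block in dendrogram order, 2 = the next, and so on
position <- match(lbl, ids_top_to_bottom)
names(position) <- names(lbl)

# Genes in the first block of the dendrogram
names(position)[position == 1]
```

After relabelling, counting blocks along the heatmap gives the cluster number to extract.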

I still prefer ComplexHeatmap, though. The tutorial on the Bioconductor website is great, and I have also put code on Biostars as answers in various threads. In particular, you may be interested in how the heatmap is segregated into different clusters, as you can see here:

That is done by PAM or k-means clustering, or you can first use cutree and then pass the gene-to-cluster output from cutree to the ComplexHeatmap Heatmap() function (via the split parameter) in order to split the heatmap based on how cutree has identified the groupings.
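A rough sketch of the cutree-then-split route, on made-up toy data (`mat` and k = 3 are hypothetical choices). ComplexHeatmap does not scale rows itself, so the z-scoring is done beforehand.

```r
library(ComplexHeatmap)

# Toy stand-in for LFC_human: 24 genes x 6 samples (hypothetical data)
set.seed(4)
mat <- matrix(rnorm(144), nrow = 24,
              dimnames = list(paste0("gene", 1:24), paste0("s", 1:6)))
scaled <- t(scale(t(mat)))      # row z-scores

# Cluster the genes and cut the tree into 3 groups
hc <- hclust(dist(scaled))
lbl <- cutree(hc, k = 3)

# Split the heatmap rows by the cutree assignment
ht <- Heatmap(scaled, split = lbl, show_row_names = FALSE, name = "z-score")
ht <- draw(ht)

# row_order() on the drawn heatmap returns one vector of row indices per slice
slices <- row_order(ht)
genes_by_slice <- lapply(slices, function(i) rownames(scaled)[i])
genes_by_slice
```

Each element of genes_by_slice then holds the gene names of one block of the split heatmap.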
