Question

How to annotate only selected genes on a heatmap


7.5 years ago

h.fushimi.x689 ▴ 40

Hello, all. Does anyone happen to know how to annotate only selected genes in a heatmap?

I use heatmap.2 from the gplots package and can annotate all genes in the data. That is fine as long as there are not too many genes. But when I plot, say, 100 genes, the annotations become unreadable. So I want to annotate only selected genes, as in Fig. 3A of this article.

Please let me know how to do that. Thanks in advance.

R • 14k views

Updated 15 months ago by GenoMax 147k • written 7.5 years ago by h.fushimi.x689 ▴ 40


My guess is that these annotations are added manually. But there might be a package in R for it, which I am not aware of yet.

Reply • 7.5 years ago by Benn 8.3k


Can you try this one (with links):

    library(ComplexHeatmap)
    data(mtcars)
    x <- as.matrix(mtcars)
    labels <- rownames(x)[c(1,4,5)]
    Heatmap(x, show_row_names = FALSE, show_row_dend = FALSE, show_column_dend = FALSE) +
        rowAnnotation(link = row_anno_link(at = c(1,4,5), labels = labels),
                      width = unit(1, "cm") + max_text_width(labels))

Reply • updated 15 months ago by GenoMax 147k • written 7.5 years ago by cpad0112 21k

score 2 · Answer 1 · 2017-06-19


7.5 years ago

poisonAlien ★ 3.2k

ComplexHeatmap can do this. I believe the article uses it as well.



library(ComplexHeatmap)
data(mtcars)
x <- as.matrix(mtcars)
labels <- rownames(x)[c(1,4,5)]
# Plot only columns 9 to 11 and attach labelled links for rows 1, 4 and 5
Heatmap(x[,c(9:11)], show_row_names = FALSE, show_row_dend = FALSE,
        show_column_dend = FALSE, show_heatmap_legend = FALSE) +
    rowAnnotation(link = row_anno_link(at = c(1,4,5), labels = labels),
                  width = unit(1, "cm") + max_text_width(labels))
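
In more recent ComplexHeatmap releases, row_anno_link() has, as far as I know, been superseded by anno_mark(), which takes the same at and labels arguments. Below is a minimal sketch of the same plot under that assumption. The object names at and ht and the requireNamespace() guard are only there for illustration.

```r
data(mtcars)
x <- as.matrix(mtcars)

# Rows to label (same indices as in the snippet above)
at <- c(1, 4, 5)
labels <- rownames(x)[at]

# Guard so the sketch does nothing if ComplexHeatmap is not installed
if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  library(ComplexHeatmap)
  ht <- Heatmap(x[, 9:11],
                show_row_names = FALSE,
                show_row_dend = FALSE,
                show_column_dend = FALSE,
                show_heatmap_legend = FALSE,
                right_annotation = rowAnnotation(
                  mark = anno_mark(at = at, labels = labels)))
  draw(ht)  # draw explicitly so the plot also appears when run as a script
}
```

anno_mark() draws the connecting lines itself and should spread the labels out when neighbouring rows are selected, which is the case where plain row names tend to overlap.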

Reply • 7.5 years ago by cpad0112 21k

score 1 · Answer 2 · 2017-06-19

Hi h.fushimi.x689,

When you write "annotate genes", do you mean transcripts? (If so, there is a risk that several transcripts are encoded by the same gene.)

This is a quick and dirty method that I have used to get a quick overview of my transcripts:

1) Extract the IDs for the transcripts represented in your heatmap (in this example the first column holds the IDs) and write them to a text file (in R):

IDs <- as.data.frame(df[,1])
write.table(IDs, file="IDs.txt", row.names=FALSE, col.names=FALSE, quote=FALSE, sep=",")

2) In the terminal (if you have not already done so), remove the "description" from your FASTA file and keep only the IDs (Trinity IDs in this case):

sed -e 's/^\(>[^[:space:]]*\).*/\1/' my.fasta > mymodified.fasta

3) Extract the sequences for the IDs exported from R:

sudo pip install pyfaidx

xargs faidx input.fasta < IDs.txt > output.fasta

4) Load the output fasta into the free version of Blast2Go: https://www.blast2go.com/

5) Copy and paste the transcript IDs and the annotation for each of them into a table (I used Excel) and save it as CSV (for example annotation.csv)

6) Load annotation.csv into R and merge it with the data frame containing your heatmap data (tmp_df here):

annotations <- read.csv("annotation.csv")
annotation <- merge(annotations, tmp_df, by="target_id")

# Use the annotation descriptions obtained from Blast2GO as row names (they must be unique)
rownames(annotation) <- annotation$Description

anno <- annotation[,-c(1:2)] # drop columns 1 and 2 (target IDs and descriptions), which are not plotted

# heatmap.2 (from the gplots package)
library(gplots)
## Row clustering (adjust here distance/linkage methods to what you need!)
hp <- hclust(as.dist(1-cor(t(anno), method="pearson")), method="complete")

## Column clustering (adjust here distance/linkage methods to what you need!)
hs <- hclust(as.dist(1-cor(anno, method="pearson")), method="complete")

# Make a 10x6 inch image at 600 dpi:
ppi <- 600
png("myheatmap.png", width=10*ppi, height=6*ppi, res=ppi)
heatmap.2(as.matrix(anno), 
          Rowv=as.dendrogram(hp), 
          Colv=as.dendrogram(hs), 
          scale="row", 
          density.info="none", 
          trace="none",
          cexRow=0.7, cexCol = 0.8,
          col=bluered(75),
          margins = c(6,27),
          keysize=1,
          key.par = list(cex=0.7),
          dendrogram="column")
dev.off()

Cheers, B

score 1 · Answer 3 · 2017-06-19


7.5 years ago

Michael 55k

The easy way: use the row labels (the labRow argument of heatmap.2) and set all but those you want to show to an empty string:

data(mtcars)
x  <- as.matrix(mtcars)
labRow <- c(row.names(x)[1], rep('', length(row.names(x))-1)) # take just the first name, here you can choose the ones you like
heatmap.2(x, labRow = labRow)

Result: a heatmap labelled only with Mazda RX4 :)
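
To keep several labels instead of just the first one, the same trick can be written with a vector of names. A minimal sketch: the selected vector is a hypothetical choice of rows, and the requireNamespace() guard is only there so the snippet does not fail when gplots is missing.

```r
data(mtcars)
x <- as.matrix(mtcars)

# Hypothetical selection: the rows whose labels should stay visible
selected <- c("Mazda RX4", "Hornet 4 Drive", "Fiat 128")

# Keep the name for selected rows and blank out all the others
labRow <- ifelse(rownames(x) %in% selected, rownames(x), "")

# Plot only if gplots is installed
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(x, labRow = labRow, trace = "none")
}
```

labRow is given in the original row order of x, so there is no need to work out the clustering order first.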



The last line should be:

heatmap.2(x, labRow = labRow)

Reply • 7.5 years ago by cpad0112 21k


Fixed, thank you.

Reply • 7.5 years ago by Michael 55k


Thanks Michael. This is almost what I want, and it works well in many cases. If possible, I would like to draw leader lines with a script, because 1) with many genes, i.e. very narrow rows, it is difficult to identify precisely which row a label belongs to, or to draw a line to it by hand; and 2) in some cases the selected genes lie close together and their labels overlap.

Reply • 7.5 years ago by h.fushimi.x689 ▴ 40


Sorry, drawing lines is a bit more difficult. How about something like this:

 heatmap.2(x, labRow = labRow, rowsep = c(9:10), sepwidth = c(0.05,0.05), sepcolor = 'blue')

You just have to find your gene of interest in the dendrogram order.
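
One way to find the gene in the dendrogram order is to use the value that heatmap.2 returns invisibly, whose rowInd element holds the reordered row indices. A rough sketch, where gene is a hypothetical row of interest. Whether the position counts from the top or the bottom of the plot, and therefore which rowsep value it corresponds to, is something I would check against the figure.

```r
data(mtcars)
x <- as.matrix(mtcars)
gene <- "Mazda RX4"  # hypothetical gene of interest

# Run only if gplots is installed
if (requireNamespace("gplots", quietly = TRUE)) {
  hm <- gplots::heatmap.2(x, trace = "none")
  # Position of the gene within the reordered rows
  pos <- match(which(rownames(x) == gene), hm$rowInd)
  print(pos)
}
```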

Reply • 7.5 years ago by Michael 55k