How to annotate only selected genes on a heatmap
3
4
Entering edit mode
7.4 years ago

Hello, all. Does anyone happen to know how to annotate only selected genes in a heatmap?

I use heatmap.2 from the gplots package and can annotate all genes in the data. That is fine as long as there are not too many genes, but when I plot, say, 100 genes, the annotations become unreadable. So I want to annotate only selected genes, as in Fig. 3A of this article.

Please let me know how to do that. Thanks in advance.

R • 14k views

My guess is that these annotations are added manually. But there might be a package in R for it, which I am not aware of yet.


Can you try this one (with links)?

    library(ComplexHeatmap)  # provides Heatmap(), rowAnnotation(), row_anno_link()
    data(mtcars)
    x <- as.matrix(mtcars)
    # rows to label; the same indices are passed to `at` below
    labels <- rownames(x)[c(1,4,5)]
    Heatmap(x, show_row_names = FALSE, show_row_dend = FALSE, show_column_dend = FALSE) +
        rowAnnotation(link = row_anno_link(at = c(1,4,5), labels = labels),
                      width = unit(1, "cm") + max_text_width(labels))
7.4 years ago
poisonAlien ★ 3.2k

ComplexHeatmap can do this. I believe the article uses the same package.

library(ComplexHeatmap)
data(mtcars)
x <- as.matrix(mtcars)
labels <- rownames(x)[c(1,4,5)]
Heatmap(x[,c(9:11)], show_row_names = FALSE, show_row_dend = FALSE,
        show_column_dend = FALSE, show_heatmap_legend = FALSE) +
    rowAnnotation(link = row_anno_link(at = c(1,4,5), labels = labels),
                  width = unit(1, "cm") + max_text_width(labels))
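
In more recent ComplexHeatmap releases, row_anno_link() appears to have been superseded by anno_mark(). The following is a minimal sketch of the same idea, assuming a ComplexHeatmap version that provides anno_mark() and the right_annotation argument of Heatmap(). The variable names (genes_to_label, at) are illustrative and not from the posts above. Rows are looked up by name rather than by hard-coded index.

```r
x <- as.matrix(mtcars)

# Names of the rows to label (illustrative choice: rows 1, 4 and 5 of mtcars)
genes_to_label <- c("Mazda RX4", "Hornet 4 Drive", "Hornet Sportabout")

# Row indices of those names in the matrix
at <- match(genes_to_label, rownames(x))
stopifnot(!anyNA(at))  # fail early if a name is missing

# Plot only if ComplexHeatmap is installed (assumption: version with anno_mark())
if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  library(ComplexHeatmap)
  ha <- rowAnnotation(mark = anno_mark(at = at, labels = genes_to_label))
  ht <- Heatmap(x, show_row_names = FALSE, right_annotation = ha)
  draw(ht)
}
```

Looking rows up with match() keeps the labels attached to the right rows even if the matrix is later filtered or reordered.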
7.4 years ago
BioBing ▴ 150

Hi h.fushimi.x689,

When you write annotate genes, do you mean transcripts? (if so, there is a risk that you will have multiple transcripts that are encoded by the same gene)

This is a quick and dirty method that I have used to get a "quick" overview of my transcripts:

1) Extract the IDs for the transcripts represented in your heatmap (this is an example where the first column is the ID) and write them into a txt file (in R):

IDs <- as.data.frame(df[,1])
write.table(IDs, file="IDs.txt", row.names=FALSE, col.names=FALSE, quote=FALSE, sep=",")

2) In the terminal (if you have not already done it), remove the "description" from your fasta file and keep only the IDs (Trinity IDs in this case):

sed -e 's/^\(>[^[:space:]]*\).*/\1/' my.fasta > mymodified.fasta

3) Extract sequences based on the IDs extracted from R:

sudo pip install pyfaidx

xargs faidx input.fasta < IDs.txt > output.fasta

4) Load the output fasta into the free version of Blast2Go: https://www.blast2go.com/

5) Copy and paste the transcript IDs and their annotations into a data frame (I used Excel) and save it as csv (for example annotation.csv)

6) Load the annotation.csv into R and merge it with the data-frame containing your heatmap data (annotations = annotation.csv):

annotation <- merge(annotations, tmp_df, by="target_id")

# Set the annotation descriptions as row names (Description is the annotation obtained from Blast2GO)
rownames(annotation) <- annotation$Description 

anno<-annotation[,-c(1:2)] # delete column 1 & 2 containing target id's and description not used as row_names

# heatmap.2 (heatmap.2() and bluered() come from the gplots package)
library(gplots)
## Row clustering (adjust here distance/linkage methods to what you need!)
hp <- hclust(as.dist(1-cor(t(anno), method="pearson")), method="complete")

## Column clustering (adjust here distance/linkage methods to what you need!)
hs <- hclust(as.dist(1-cor(anno, method="pearson")), method="complete")

# Make a 10x6 inch image at 600 dpi:
ppi <- 600
png("myheatmap.png", width=10*ppi, height=6*ppi, res=ppi)
heatmap.2(as.matrix(anno), 
          Rowv=as.dendrogram(hp), 
          Colv=as.dendrogram(hs), 
          scale="row", 
          density.info="none", 
          trace="none",
          cexRow=0.7, cexCol = 0.8,
          col=bluered(75),
          margins = c(6,27),
          keysize=1,
          key.par = list(cex=0.7),
          dendrogram="column")
dev.off()

Cheers, B


Hi BioBing, this looks like a quite complex workflow for what OP wants to achieve. I am not doubting that it works, even though it is not possible to reproduce, but could you explain what it actually is at the core that displays only a selection of gene ids (e.g. setting them to NULL or empty string) and how this can be achieved inside R without using external programs.


Hi Michael,

Yes, I agree - my approach is kinda messy :-) But it was how I got it to work, and I just wanted to share it in case it could be helpful in some way.

Ahh! I just realized I misunderstood the question, my fault. I thought the question was about getting annotations into a heatmap of selected genes without a full annotation!

I am not sure how to add only a few gene names to a heatmap in R. Maybe add the gene names manually in Photoshop or similar?


Thanks BioBing. I think I should not have used the word "annotate". As I understand it, this workflow is about annotating a FASTA file. What I want is a way to show selected, not all, gene names on a heatmap; all of the genes are already "annotated".

7.4 years ago
Michael 55k

The easy way: use the row labels (labRow) and set all but those you want to show to an empty string:

data(mtcars)
x  <- as.matrix(mtcars)
labRow <- c(row.names(x)[1], rep('', length(row.names(x))-1)) # take just the first name, here you can choose the ones you like
heatmap.2(x, labRow = labRow)

Result: a heatmap with only Mazda RX4 :)
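
To label several rows instead of just the first one, the same trick can be written with a name lookup. A minimal sketch, assuming the rows of interest are known by name; the vector keep and its contents are an illustrative choice, not from the original post:

```r
x <- as.matrix(mtcars)

# Rows whose names should remain visible (illustrative selection)
keep <- c("Mazda RX4", "Fiat 128", "Volvo 142E")

# Keep the name for selected rows, blank out all others.
# labRow is given in the original row order of x.
labRow <- ifelse(rownames(x) %in% keep, rownames(x), "")

# Plot only if gplots is installed
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(x, labRow = labRow, trace = "none")
}
```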


The last line should be:

heatmap.2(x, labRow = labRow)

Fixed, thank you.


Thanks Michael. This is almost what I want, and it works well in many cases. If possible, I would like to draw leader lines with a script, because 1) with many genes, i.e. very narrow rows, it is difficult to identify precisely, or draw a line by hand to, the row that a label refers to, and 2) in some cases the selected genes lie close together and their labels overlap.


Sorry, drawing lines is a bit more difficult. How about something like this:

 heatmap.2(x, labRow = labRow, rowsep = c(9:10), sepwidth = c(0.05,0.05), sepcolor = 'blue')

You just have to find your gene of interest in the dendrogram order.
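
Finding that position can be scripted. The sketch below is only an approximation under stated assumptions: it reproduces what heatmap.2 seems to do by default for rows (dist, hclust, then reordering the dendrogram by row means), and it assumes that the resulting order runs from the bottom of the plot to the top while rowsep counts from the top. The names gene, pos_from_top and row_sep are illustrative. It is worth comparing row_ind with the rowInd element of the list returned by heatmap.2 and checking the result visually.

```r
x <- as.matrix(mtcars)

# Assumption: mirrors heatmap.2's default row clustering and reordering
hc  <- hclust(dist(x))
ddr <- reorder(as.dendrogram(hc), rowMeans(x))
row_ind <- order.dendrogram(ddr)  # assumed bottom-to-top order in the plot

gene <- "Mazda RX4"                     # row of interest (illustrative)
idx  <- match(gene, rownames(x))        # index in the original matrix
pos_from_top <- nrow(x) + 1 - match(idx, row_ind)

# Separators above and below the row of interest
row_sep <- c(pos_from_top - 1, pos_from_top)

# Plot only if gplots is installed
if (requireNamespace("gplots", quietly = TRUE)) {
  labRow <- ifelse(rownames(x) == gene, gene, "")
  gplots::heatmap.2(x, labRow = labRow, rowsep = row_sep,
                    sepwidth = c(0.05, 0.05), sepcolor = "blue", trace = "none")
}
```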
