How to produce a heatmap that keeps the original data and uses a log color scale?
3.6 years ago

Hello, I have a matrix of the number of significant SNPs for different traits from different genomes. I want to show this information in a heatmap. So, I used the data matrix and plotted the attached heatmap using the following code:

pheatmap(data_frame, cluster_rows = FALSE, cluster_cols = FALSE, scale = "row", color = colorRampPalette(c("navy", "white", "firebrick3"))(20), fontsize_number = 8)

However, this does not show the actual information; rather, it scales the number of SNPs and somehow over-represents the data. If I use scale="none", then I get a very bad heatmap with very little differentiation in color. The problem is that I have big differences in my data. The number of SNPs ranges from 1 to 180.

[Figure: heatmap produced with pheatmap in R]

Is there any way to plot a heatmap that keeps the original data and uses a log color scale to better visualize the data?

Please let me know.

R heatmap ggplot2 visualization • 3.7k views

Do you want to use ggplot2 or not? You say you have "big differences in my data", but the plot without scaling doesn't show them. If you use a dendrogram to reorder the samples and genomes, perhaps they will become visible. Otherwise, perhaps a heatmap is not the right visualization method to show your differences, or there might be an error in the method used to detect those differences.


Yes, I have used the pheatmap package in R. So, for some traits, I have SNP numbers from 1-10, and for one trait I have numbers from 170-180. If I scale it, then it does not show the real differentiation among the number of SNPs, does it?
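
To illustrate the concern about scaling, here is a minimal sketch using made-up counts (the matrix `counts` and its row names are hypothetical, not the poster's data). As far as I understand, scale = "row" in pheatmap centers each row on its mean and divides by its standard deviation, which is what the base R code below reproduces. After that transformation, a trait with 1-9 SNPs and a trait with 171-179 SNPs become indistinguishable:

```r
# Hypothetical counts: one low-count trait and one high-count trait
counts <- rbind(
  trait_low  = c(1, 3, 5, 7, 9),
  trait_high = c(171, 173, 175, 177, 179)
)

# Row-wise z-score: subtract each row's mean, divide by its standard deviation
# (scale() works on columns, hence the two transposes)
row_z <- t(scale(t(counts)))

# Both rows now hold the same values, so the absolute magnitudes are lost
print(round(row_z, 2))
```

So row scaling is useful for comparing patterns within each trait, but it cannot show that one trait has far more SNPs than another.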

3.6 years ago
Lemire ▴ 940

Use the color and breaks arguments of pheatmap:

 ?pheatmap

 [...]

 color: vector of colors used in heatmap.

 breaks: a sequence of numbers that covers the range of values in mat
          and is one element longer than color vector. Used for mapping
          values to colors. Useful, if needed to map certain values to
          certain colors, to certain values. If value is NA then the
          breaks are calculated automatically. When breaks do not cover
          the range of values, then any value larger than 'max(breaks)'
          will have the largest color and any value lower than
          'min(breaks)' will get the lowest color.

Example:

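library(pheatmap)

m <- matrix(1:100, ncol = 10, byrow = TRUE)
rownames(m) <- 1:10
colnames(m) <- 1:10
col <- colorRampPalette(c("navy", "white", "firebrick3"))(4)
# default: breaks are computed automatically (evenly spaced)
pheatmap(m, color = col)
# custom breaks: 5 break points for 4 colors, more resolution at the low end
pheatmap(m, color = col, breaks = c(1, 10, 20, 40, 100))
# note the dendrograms are the same: breaks only change the color mapping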
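
The same breaks argument can also give something close to a log color scale while the plotted values stay unscaled. The following is only a sketch under some assumptions: the matrix `snp_counts` is simulated, all counts are at least 1 (so the logarithm is defined), and the names `n_colors`, `palette` and `log_breaks` are made up for illustration. The break points are spaced evenly on the log scale, with one more break than colors:

```r
# Simulated SNP counts (hypothetical): four traits in 1-10, one trait in 170-180
set.seed(1)
snp_counts <- matrix(
  c(sample(1:10, 40, replace = TRUE), sample(170:180, 10, replace = TRUE)),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("trait", 1:5), paste0("genome", 1:10))
)

n_colors <- 20
palette <- colorRampPalette(c("navy", "white", "firebrick3"))(n_colors)

# Break points evenly spaced on the log scale (assumes all counts >= 1)
lo <- min(snp_counts)
hi <- max(snp_counts)
log_breaks <- exp(seq(log(lo), log(hi), length.out = n_colors + 1))
# Pin the end points so rounding cannot push the extremes outside the breaks
log_breaks[c(1, n_colors + 1)] <- c(lo, hi)

# Draw only if pheatmap is installed; cells show the original counts
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(snp_counts, color = palette, breaks = log_breaks,
                     scale = "none", cluster_rows = FALSE, cluster_cols = FALSE,
                     display_numbers = TRUE, number_format = "%d",
                     fontsize_number = 8)
}
```

With 20 colors instead of 4, large values are no longer lumped into one bin, and the low counts still get most of the color range. If the real data contain zeros, the breaks would need a different lower bound or a pseudocount.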

But it does not show the differentiation correctly. For example, all the values from 50-100 are assigned to a single color. What I am trying to show is the variation in the number for a particular trait on a particular genome, which your code above does not show. Or maybe I am missing something?

3.6 years ago

Anyway, the problem is solved by using breaks. Thank you.


Please do not post your comments as answers. In addition, if a member's post resolved your query, accept it as an answer to close the post and help future visitors.


Ok, sorry for this. It will not happen again.


Thanks for understanding the forum etiquette. Good day.
