Hi All,
I'm a beginner user in R, so this is probably an easy question for experienced R users to help me with. I appreciate any help that anyone can give.
So I have made a heatmap from microarray data which contains 1106 patient samples. Within these samples I have information on sex (male or female), age (old or young), and stage of cancer (stage I, stage IA, stage IB, stage II, stage IIA, stage IIB, stage III, stage IIIA, stage IIIB, or stage IV). I have no problem making the heatmap, but I'm having trouble figuring out how to make a column side color bar which will distinguish these different populations in the patient data. Ultimately I would like to have a stacked color bar for the columns which will show the following: 1. a first bar color coded for female or male, 2. a second bar color coded for young or old, and 3. a last bar color coded for the stage of cancer.
In my data set I have changed the column headings to the following format: (sex)_(age)_(stage). So a specific example would be the following: F_Y_IB (indicating Female_Young_stage IB).
Would someone be able to help me out in figuring out the best way to do this in R?
Thank you in advance for any help that anyone can give me.
Afshin
Thank you for your comment! I'm using the heatmap.2 function. Would you be able to help me out with incorporating this into the heatmap.2 function? I really appreciate your help. I'm pretty new with R, so anything helps!
Actually I went ahead and worked with the pheatmap function and this worked really well!!
I really appreciate your help Gian! Thanks again for your response!!
Glad it helped, Afshin. If this solves your question, please consider accepting the answer.
I actually have one more question for you Gian. Now that I have gotten some really nice labels and column colors, I'm trying to figure out why the color in the heatmap from pheatmap is different from when I plotted the heatmap using heatmap.2 function. Here are the two image comparisons:
First image is using the heatmap.2 function and the second image is using pheatmap function.
Here is the R code for the first image:
Here is the R code for the pheatmap method:
Would you be able to let me know how I can get the same color scheme for the pheatmap image as I did for the heatmap.2 image?
Thanks!
Afshin
Please always use a reproducible example so other people can run it and help. Try to use the dummymat I posted instead of x and colnames(expr_mat).
OK, sorry about that.
Here is the version with the dummymat data for the two methods. First, with the pheatmap function:
which produces the following image: https://www.dropbox.com/s/vovo389unfxl3ne/pheatmap%20example%20data.png?dl=0
Here it is with the heatmap.2 function:
which produces the following image: https://www.dropbox.com/s/j3q7gakljr0qcgq/heatmap2%20example%20data.png?dl=0
These two different methods produce different colors for the heatmap.
Let me know what you think might solve this problem. I would like to use the pheatmap function, but with the same color scheme as the heatmap.2 function.
Thanks!
Afshin
The colors are different because pheatmap doesn't create symmetric breaks. To achieve that you can specify the breaks yourself (last line):
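A minimal sketch of that idea, not necessarily the exact code from this answer. It assumes a numeric matrix called dummymat (random data stands in for it here) and a green-black-red palette picked only for illustration. The breaks are spaced evenly from -max|x| to +max|x| so that zero sits in the middle of the palette, which, as far as I understand its defaults, is what heatmap.2 does when the data contain negative values. They are then handed to pheatmap through its breaks argument.

```r
# Stand-in data (an assumption); replace with your own matrix
set.seed(1)
dummymat <- matrix(rnorm(200, mean = 0.5), nrow = 20,
                   dimnames = list(paste0("gene", 1:20),
                                   paste0("sample", 1:10)))

# Illustrative palette (an assumption); reuse the one you gave heatmap.2
n_colors <- 99
my_palette <- colorRampPalette(c("green", "black", "red"))(n_colors)

# Symmetric breaks: the same distance below and above zero.
# pheatmap expects one more break than there are colors.
max_abs <- max(abs(dummymat), na.rm = TRUE)
sym_breaks <- seq(-max_abs, max_abs, length.out = n_colors + 1)

# Draw only if pheatmap is installed
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(dummymat, color = my_palette, breaks = sym_breaks)
}
```

If the two plots still differ, it is worth checking that both calls use the same row scaling and the same palette, since either one changes the colors independently of the breaks.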
Thanks that worked great! I really appreciate your help! Thanks again!
I actually have another question dealing with the issue above.
The example with the dummy data is great for showing how to get a color bar above the heatmap.
But how do I specifically match the colors to the heading of my original data?
For example my data header is formatted as "F_O_I".
With the example above I noticed that it does not specifically find my headers which contain "F" and make sure that it associates this with "Females", or specifically associate "O" with "Old", etc... How do I make sure that I can get each color to actually associate with the information in the header of my data set?
Any help will be much appreciated.
Thanks
Hi Afshin,
The idea here is to have two different data frames, one with your heatmap data (e.g. expression data) and the other with the annotation for the patients. Instead of changing the header of your 'dummymat' you should add a column in 'categories' for each characteristic that you want to plot.
So create the data frame 'categories' starting with your patients' names and add the characteristics. This is an example of how you could add the Sex category:
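A minimal sketch of this idea, not necessarily the exact code from this answer. The names patient_ids and labels are hypothetical: patient_ids holds one unique name per sample (data frame row names must be unique, and labels such as F_O_I repeat across patients), and labels holds the (sex)_(age)_(stage) strings in the same order.

```r
# Hypothetical sample names and labels, made up for illustration
patient_ids <- paste0("patient", 1:4)
labels <- c("F_O_I", "M_Y_II", "F_Y_IB", "M_O_IV")

# Stand-in expression matrix whose columns are the patients
set.seed(1)
dummymat <- matrix(rnorm(40), nrow = 10,
                   dimnames = list(paste0("gene", 1:10), patient_ids))

# Annotation data frame: one row per patient, row names matching
# the column names of the matrix. Sex is read from the start of the label.
categories <- data.frame(
  Sex = ifelse(startsWith(labels, "F_"), "Female", "Male"),
  row.names = patient_ids
)
print(categories)

# pheatmap matches the rows of annotation_col to the matrix columns by name
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(dummymat, annotation_col = categories)
}
```

Because the matching is done by name, each color in the bar is tied to the patient it belongs to rather than to the position of the column.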
Hi Gian,
Thanks!! That works great for distinguishing between two groups. If I now want to distinguish between 3 or more groups, how would I do that? For example, in the example above "I" is one of 4 groups representing stage. So in my header labels I have samples that are labeled "F_O_I", "F_O_II", "F_O_III" or "F_O_IV".
So I can't use the ifelse function. What do you recommend instead for this?? I really appreciate your help and sorry for the simple questions. I'm actually learning a lot from your help.
Thanks again!
Afshin
Hi Gian,
Never mind... I just figured it out with the following code:
You might have a cleaner suggestion than this...
Thanks again for all your help!
Afshin
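On the request for a cleaner suggestion, one option is to split each label once on "_" and turn the three pieces into factors, so any number of groups is handled without nested ifelse calls. This is only a sketch under the assumption that every label follows the (sex)_(age)_(stage) format. The names patient_ids and labels are hypothetical, and the stage levels are the ones listed in the original question.

```r
# Hypothetical labels and unique sample names, made up for illustration
labels <- c("F_O_I", "M_Y_IIA", "F_Y_IB", "M_O_IV", "F_O_I")
patient_ids <- paste0("patient", seq_along(labels))

# Split every label into its three fields: a 3-column character matrix
parts <- do.call(rbind, strsplit(labels, "_", fixed = TRUE))

# One annotation column per field; factor() maps the codes to readable names
categories <- data.frame(
  Sex   = factor(parts[, 1], levels = c("F", "M"),
                 labels = c("Female", "Male")),
  Age   = factor(parts[, 2], levels = c("Y", "O"),
                 labels = c("Young", "Old")),
  Stage = factor(parts[, 3],
                 levels = c("I", "IA", "IB", "II", "IIA", "IIB",
                            "III", "IIIA", "IIIB", "IV")),
  row.names = patient_ids
)
print(categories)
```

The resulting data frame can be passed to pheatmap as annotation_col, provided its row names match the column names of the heatmap matrix. A label with an unexpected code ends up as NA in the corresponding column, which makes typos in the headers easy to spot.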