I want to take two data files formatted like this (they have the same column names and numbers of rows):
Basal Part of Pons,Ventral Thalamus,Cingulate gyrus,Insula,Epithalamus,Subthalamus,Brain,Occipital Lobe,Sulci & Spaces,White Matter,parahippocampal gyrus,hippocampal formation,Amygdala,Temporal Lobe,Dorsal Thalamus,Striatum,Hypothalamus,Pontine Tegmentum,Claustrum,Parietal Lobe,Myelencephalon,Basal Forebrain,Cerebellar Cortex,Frontal Lobe,Mesencephalon,Cerebellar Nuclei,Globus Pallidus
-0.5105,0.6625,0.0766,0.0524,0.9915,0.893,-0.3535,-0.0379,-3.2603,-1.4482,0.2761,-1.0088,0.6528,-0.0975,0.9191,0.3016,0.5294,1.0021,0.3865,0.2589,-0.6714,0.781,-1.7963,-0.0369,0.9029,-0.2741,1.6183
-0.1625,0.5075,0.1778,-0.0632,0.8538,0.2927,0.5709,0.196,-2.6836,-1.257,0.4551,-0.8525,0.6108,0.0671,1.0538,0.4313,0.0279,1.1387,-0.096,0.3523,-0.234,0.274,-1.369,-0.2009,0.9535,-0.5611,2.0123
-0.3898,0.1021,0.0307,-0.0371,0.9827,0.6934,-0.338,0.0955,-3.0394,-1.2211,0.2884,-0.8627,0.6386,0.0049,0.8265,0.2864,0.6961,1.3572,0.0489,0.3126,-0.3914,0.6972,-1.6167,-0.1053,0.7518,-0.3766,1.6506
0.1088,0.0494,0.1181,-0.0748,-0.3126,0.4604,-0.0889,0.1592,-4.3136,-0.9078,0.3811,0.7862,1.2036,0.0158,0.9225,0.2069,-0.2604,0.9254,-0.0636,-0.0582,-0.9273,1.0495,-1.9163,0.1125,0.9812,0.0498,1.5284
Basal Part of Pons,Ventral Thalamus,Cingulate gyrus,Insula,Epithalamus,Subthalamus,Brain,Occipital Lobe,Sulci & Spaces,White Matter,parahippocampal gyrus,hippocampal formation,Amygdala,Temporal Lobe,Dorsal Thalamus,Striatum,Hypothalamus,Pontine Tegmentum,Claustrum,Parietal Lobe,Myelencephalon,Basal Forebrain,Cerebellar Cortex,Frontal Lobe,Mesencephalon,Cerebellar Nuclei,Globus Pallidus
-0.2775,0.797,0.0568,-0.1309,1.6462,1.2986,1.0686,-0.1036,0.4154,-1.2079,0.0856,1.2651,0.7292,-0.3591,0.1812,0.6074,2.3187,0.2195,-0.169,0.0673,-0.1764,0.0646,-1.654,0.2921,0.0124,-0.2394,0.8549
0.1197,1.1071,-0.0443,-0.4927,1.2885,1.2452,0.4405,-0.7088,0.521,-0.8329,-0.0298,0.8744,0.5725,-0.3593,0.3811,0.3541,2.1534,0.3472,0.0872,-0.3771,0.157,0.1501,-1.5639,-0.2189,0.0707,0.2236,0.7358
-0.2681,0.8626,0.0542,-0.2743,1.1994,1.4469,0.2456,-0.4497,0.6358,-1.5276,-0.1589,1.426,0.3706,-0.2407,0.4211,0.3877,1.6355,0.1553,-0.1514,-0.1327,-0.0361,0.2013,-1.6199,-0.3371,0.2642,0.0503,0.5735
0.2719,0.1608,0.045,-0.3166,2.5749,0.7839,0.8872,-0.221,0.0046,-0.5548,-0.0274,-0.0425,0.8472,-0.1472,0.4912,0.3156,2.3168,-0.4321,-0.6155,-0.3571,2.8559,0.6661,-1.9877,-0.347,1.4582,-0.1525,0.6299
And generate a heatmap based on the p-values of the t-tests between their columns:
m1 = read.csv("m1file.csv")
m2 = read.csv("m2file.csv")
test.result = mapply(t.test, m1, m2)
p.values = stack(mapply(function(x, y) t.test(x,y)$p.value, m1, m2))
matrix.m1 = as.matrix(m1)
matrix.m2 = as.matrix(m2)
fun = Vectorize(function(i,j) t.test(matrix.m1[,i],matrix.m2[,j])$p.value)
res = outer(1:27,1:27,FUN = "fun")
image(1:27,1:27,res,axes=FALSE,xlab="m1",ylab="m2")
axis(1, at = 1:27,labels=colnames(m1))
axis(2, at = 1:27,labels=colnames(m2))
This gives me one solution, but it's not quite what I'm looking for. So I tried this:
colnames(res) <- colnames(m1)
res <-as.data.frame(res)
res$group <- colnames(m1)
library(reshape2)
res <- melt(res,id="group")
library(ggplot2)
p <- ggplot(res, aes(x=group, y=variable)) +
geom_tile(aes(fill = value), colour = "yellow") +
scale_fill_gradient(low = "yellow", high = "red", name="p-value") +
geom_text(aes(label=format(value,digits=2))) +
labs(x="m1",y="m2")
print(p)
And it's satisfactory, except that it has no title and the x-axis labels aren't oriented vertically. Also, showing the p-values explicitly turns out to be too messy. So I updated the code to the following:
colnames(res) <- colnames(maoa)
res <-as.data.frame(res)
res$group <- colnames(maoa)
library(reshape2)
res <- melt(res,id="group")
library(ggplot2)
p <- ggplot(res, aes(x=group, y=variable)) +
geom_tile(aes(fill = value), colour = "yellow") +
scale_fill_gradient(low = "yellow", high = "red", name="p-value") +
ggtitle("title") +
theme(axis.text.x=element_text(angle=90) +
labs(x="m1",y="m2")
print(p)
But I get an error saying:
Error: unexpected symbol in: " print"
I don't think I'm changing the spacing here when I add the new commands, so I'm confused as to why this is happening. Also, if I change nothing from the first ggplot2 code provided here except omit the line geom_text(aes(label=format(value,digits=2))) +, I get an error saying Error: Discrete value supplied to continuous scale. Help?
Can you provide another input file, so we can test your code and see how to answer you? Preferably, can you provide both as comma-separated files?
I have added input files. And yeah, ordinarily I wouldn't call a function "fun", it's just a placeholder in this case.
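For what it's worth, the "unexpected symbol" error looks like an unbalanced parenthesis rather than a spacing problem: the theme(axis.text.x=element_text(angle=90) + line closes element_text( but not theme(, so R is still inside theme( when it reaches print. The "Discrete value supplied to continuous scale" error is harder to pin down without the full session. One plausible cause (an assumption on my part) is that res gets overwritten by melt() and is then melted a second time when the block is re-run, which would turn value into a non-numeric column. Below is a minimal sketch that avoids both. The small m1 and m2 data frames are hypothetical stand-ins for the real CSV files, and the names pval and long are only illustrative. It builds the long table in base R instead of reshape2 and keeps it under a separate name so res is never overwritten.

```r
library(ggplot2)

# Hypothetical stand-in data; replace with read.csv("m1file.csv") and
# read.csv("m2file.csv") for the real files.
set.seed(1)
m1 <- data.frame(A = rnorm(4), B = rnorm(4), C = rnorm(4))
m2 <- data.frame(A = rnorm(4), B = rnorm(4), C = rnorm(4))

n <- ncol(m1)

# p-value of a t-test for every pair (column i of m1, column j of m2)
pval <- Vectorize(function(i, j) t.test(m1[[i]], m2[[j]])$p.value)
res <- outer(seq_len(n), seq_len(n), FUN = pval)

# Long-format table under a new name, so `res` stays a numeric matrix
# and the block can be re-run safely.
long <- data.frame(
  group    = rep(colnames(m1), times = n),  # m1 column (row index of res)
  variable = rep(colnames(m2), each = n),   # m2 column (column index of res)
  value    = as.vector(res)                 # column-major order, matching the rep() calls
)

p <- ggplot(long, aes(x = group, y = variable)) +
  geom_tile(aes(fill = value), colour = "yellow") +
  scale_fill_gradient(low = "yellow", high = "red", name = "p-value") +
  ggtitle("title") +
  theme(axis.text.x = element_text(angle = 90)) +  # both parentheses closed here
  labs(x = "m1", y = "m2")
print(p)
```

If the discrete-scale error persists with the real data, checking str(long) (or str(res) in the original code) right before the ggplot call should show whether value is still numeric.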