This is a sample of my data frame, taken from a large data matrix. Each column name encodes several pieces of information separated by underscores.
I want to run a Tukey test and plot a bar chart for each gene (Response vs. Time, filled by genotype) with multiple-comparison letters.
Can someone help me add significance letters using the multcomp and multcompView packages?
structure(list(Gene = c("AGI4120.1_UBQ", "AGI570.1_Acin"), WT_Tissue_0T_1 = c(0.886461437, 1.093164915), WT_Tissue_0T_2 = c(1.075140682, 1.229862834), WT_Tissue_0T_3 = c(0.632903012, 1.094003128), WT_Tissue_1T_1 = c(0.883151274, 1.26322126), WT_Tissue_1T_2 = c(1.005627276, 0.962729188), WT_Tissue_1T_3 = c(0.87123469, 0.968078993), WT_Tissue_3T_1 = c(0.723601456, 0.633890322), WT_Tissue_3T_2 = c(0.392585237, 0.534819363), WT_Tissue_3T_3 = c(0.640185369, 1.021934772), WT_Tissue_5T_1 = c(0.720291294, 0.589244505), WT_Tissue_5T_2 = c(0.362131744, 0.475251717), WT_Tissue_5T_3 = c(0.549486925, 0.618177919), mut1_Tissue_0T_1 = c(1.464415756, 1.130533457), mut1_Tissue_0T_2 = c(1.01489573, 1.114915728), mut1_Tissue_0T_3 = c(1.171797418, 1.399956009), mut1_Tissue_1T_1 = c(0.927507448, 1.231911575), mut1_Tissue_1T_2 = c(1.089705396, 1.256782289 ), mut1_Tissue_1T_3 = c(0.993048659, 0.999044465), mut1_Tissue_3T_1 = c(1.000993049, 1.103486794), mut1_Tissue_3T_2 = c(1.062562066, 0.883617224 ), mut1_Tissue_3T_3 = c(1.037404833, 0.851875438), mut1_Tissue_5T_1 = c(0.730883813, 0.437440083), mut1_Tissue_5T_2 = c(0.480635551, 0.298762126 ), mut1_Tissue_5T_3 = c(0.85468388, 0.614923997)), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"), spec = structure(list( cols = list(Gene = structure(list(), class = c("collector_character", "collector")), WT_Tissue_0T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_0T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_0T_3 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_1T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_1T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_1T_3 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_3T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_3T_2 = structure(list(), class = c("collector_double", 
"collector")), WT_Tissue_3T_3 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_5T_1 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_5T_2 = structure(list(), class = c("collector_double", "collector")), WT_Tissue_5T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_0T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_0T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_0T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_1T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_1T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_1T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_3T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_3T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_3T_3 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_5T_1 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_5T_2 = structure(list(), class = c("collector_double", "collector")), mut1_Tissue_5T_3 = structure(list(), class = c("collector_double", "collector"))), default = structure(list(), class = c("collector_guess", "collector"))), class = "col_spec"))
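My code:
library(tidyverse)
df1 <- df %>%
  gather(var, response, WT_Tissue_0T_1:mut1_Tissue_5T_3) %>%
  separate(var, c("Genotype", "Tissue", "Time", "Rep"), sep = "_") %>%  # the fourth piece of each column name is the replicate number
  arrange(desc(Gene))
# one row per Gene x Genotype x Tissue x Time, with mean and standard error
df2 <- df1 %>%
  group_by(Gene, Genotype, Tissue, Time) %>%
  summarise(Response = mean(response), n = n(), se = sd(response) / sqrt(n), .groups = "drop")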
I want to perform a post hoc Tukey test, and I used:
library(car)
library(lsmeans)
library(multcompView)
fit1 <- aov(response ~ Genotype*Time, df1)  # df1 stores the raw values in the lowercase column `response`
summary(fit1)
lsmeans(fit1, pairwise ~ Genotype | Time)
How can I add significance letters to the bar chart using the multcomp and multcompView packages?
This is my code for the bar charts:
df2$Genotype <- factor(df2$Genotype, levels = c("WT", "mut1"))
colours <- c('#336600', '#ffcc00')
library(ggplot2)
ggplot(df2, aes(x = Time, y = Response, fill = Genotype)) +
  geom_bar(stat = 'identity', position = 'dodge') +
  scale_fill_manual(values = colours) +
  geom_errorbar(aes(ymin = Response - se, ymax = Response + se), position = position_dodge(0.9), width = 0.2) +
  facet_wrap(~Gene) +
  labs(x = 'Time', y = 'Response')
Finally, I want to show significance letters on this graph at each time point, based on what I get from lsmeans(fit1, pairwise ~ Genotype | Time).
Expected Graph:
I would appreciate your kind help, if possible.
What are the error messages? Code produces messages on errors so that you can get information about the problem. Without those messages, it's unlikely that someone will be able to help. It may also be helpful to show an example of the data.
Hi Heriche, I've modified the question now.
Thank you for the support. I've been fixing most of the errors since this morning. Would you be able to help me add significance letters to the graph? I cannot figure it out. Any help would be greatly appreciated.
I am not sure what you mean by significance letters. Statistical significance is sometimes indicated by stars or by writing the p-value of the test above the bars. If this is what you're after, have a look at the ggpubr package.
Not necessarily: library(multcompView) has an option to give letters instead of stars. Unfortunately, I'm not able to write the syntax that combines the Tukey output with the plot code.
The ggpubr package has a stat_compare_means function, which may add significance markers (brackets, p-values or asterisks / letters) to the plot. What I didn't like is that it only works when it performs the test itself; it doesn't let the user provide the significance table. There is also a ggsignif package, but I have never used it.
ggpubr uses ggsignif which I also haven't used. A maybe less fancy way of putting statistical significance above bars could be to create a data frame with the positions of the labels and use it like this:
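A minimal sketch of that idea, in case it helps: the labels live in their own small data frame and are drawn with geom_text(). The bar heights, label positions and label text below are made-up values for illustration, and the names bars and labs_df are hypothetical.

```r
library(ggplot2)

# Hypothetical summary values, for illustration only
bars <- data.frame(Time = rep(c("0T", "5T"), each = 2),
                   Genotype = rep(c("WT", "mut1"), times = 2),
                   Mean = c(0.9, 1.2, 0.5, 0.7))

# One row per label: where it goes (Time, y) and what it says
labs_df <- data.frame(Time = c("0T", "5T"),
                      y = c(1.3, 0.8),
                      label = c("*", "ns"))

p <- ggplot(bars, aes(x = Time, y = Mean)) +
  geom_col(aes(fill = Genotype), position = "dodge") +
  # the label layer reads from labs_df instead of the bar data
  geom_text(data = labs_df, aes(x = Time, y = y, label = label))
print(p)
```

The label data frame can come from any test, so this works with a significance table computed elsewhere.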
You mean something like this?
Yes, these are just done with geom_text() and geom_segment():
Then, you can also add labels, like this:
Thanks Kevin,
I am looking for a bar plot / box plot with letters.
Importantly, generating these letters from the Tukey test and adding them to the plots is the main issue for me. I want to know how to handle this kind of data frame for this task!
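One possible way to get from a Tukey test to letters on the bars is sketched below. It assumes a single gene in long format and uses simulated values (the toy data frame is hypothetical, not the data above). Genotype and Time are combined into one factor, so every Genotype x Time bar gets a letter from an all-pairs Tukey test. Note that this is a different family of comparisons from lsmeans(fit1, pairwise ~ Genotype | Time), which only compares genotypes within each time point.

```r
library(ggplot2)
library(multcompView)

# Simulated long-format data for ONE gene (hypothetical values)
set.seed(1)
toy <- expand.grid(Rep = 1:3, Time = c("0T", "1T", "3T", "5T"),
                   Genotype = c("WT", "mut1"))
toy$response <- rnorm(nrow(toy), mean = ifelse(toy$Time == "5T", 0.5, 1), sd = 0.1)

# One combined factor, so each Genotype x Time cell is its own group.
# Level names must not contain "-", which TukeyHSD uses to join pair names.
toy$group <- interaction(toy$Genotype, toy$Time, sep = "_")

fit <- aov(response ~ group, data = toy)
tuk <- TukeyHSD(fit)

# Compact letter display: groups sharing a letter are not significantly different
cld <- multcompLetters4(fit, tuk)
letters_df <- data.frame(group = names(cld$group$Letters),
                         letter = unname(cld$group$Letters))

# Mean and standard error per group, then attach the letters
summ <- aggregate(response ~ group + Genotype + Time, data = toy, FUN = mean)
names(summ)[names(summ) == "response"] <- "Mean"
se <- aggregate(response ~ group, data = toy,
                FUN = function(x) sd(x) / sqrt(length(x)))
names(se)[2] <- "se"
summ <- merge(merge(summ, se, by = "group"), letters_df, by = "group")

dodge <- position_dodge(width = 0.9)
p <- ggplot(summ, aes(x = Time, y = Mean, fill = Genotype)) +
  geom_col(position = dodge) +
  geom_errorbar(aes(ymin = Mean - se, ymax = Mean + se),
                position = dodge, width = 0.2) +
  # letters sit just above the top of each error bar
  geom_text(aes(y = Mean + se, label = letter), position = dodge, vjust = -0.5)
print(p)
```

For several genes, the same steps could be run on each gene's subset (for example with split() and lapply()) and the resulting summaries stacked before plotting with facet_wrap(~Gene).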