I have a sheet containing numeric pathway scores and the non-numeric mutation status (e.g., MUT and WT) of several genes. I want to compute the Spearman correlation coefficient and p-value of each pathway (numeric data) against each gene's mutation status. Which R package should I use? I tried the cor function in R, but it gives an error because it accepts only numeric data.
My data look like this. I used the cor function to compute the Spearman rank correlation coefficient. It returns values, but I can't tell which group (MUT or WT) they refer to, because it gives a single value for each gene column against each pathway.
AGTR2(Gene) Alanine.and.aspartate.metabolism
MUT 0
MUT 0.041389321
WT 0.016554228
WT 0.172155284
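I used this script for my analysis:
# Read the tab-separated sheet; the first row holds the column names
Col_data <- read.table(file = "Data.txt", header = TRUE, sep = "\t")
# Pearson correlation of the ranks, i.e. Spearman's rho (the MUT/WT labels are ranked alphabetically)
A <- cor(rank(Col_data$AGTR2), rank(Col_data$Alanine.and.aspartate.metabolism))
B <- cor(rank(Col_data$AGTR2), rank(Col_data$Amino.and.nucleotide.sugar.metabolism))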
and so on...
For Spearman correlation, both variables being correlated need to be rankable. WT/MUT is a nominal label with no inherent order, so it is not a rankable variable. I'm not sure you can "correlate" with a nominal binary variable.
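To illustrate why the single value in the question is hard to interpret, here is a small sketch using the four example rows from the question. The data frame d and the column name score are made up for illustration. rank() orders text labels alphabetically, so MUT gets the lower rank and WT the higher one; the sign of the coefficient only reflects that arbitrary coding and flips when the coding is reversed.

```r
# Toy data: the four rows shown in the question (object and column names are illustrative)
d <- data.frame(
  AGTR2 = c("MUT", "MUT", "WT", "WT"),
  score = c(0, 0.041389321, 0.016554228, 0.172155284)
)

# rank() sorts the labels alphabetically, so MUT ranks below WT
rank(d$AGTR2)

# What the script in the question effectively computes
rho_script <- cor(rank(d$AGTR2), rank(d$score))

# Same thing with an explicit 0/1 coding (MUT = 0, WT = 1)
rho_wt_high <- cor(as.numeric(d$AGTR2 == "WT"), d$score, method = "spearman")

# Reversed coding (MUT = 1, WT = 0): same magnitude, opposite sign
rho_mut_high <- cor(as.numeric(d$AGTR2 == "MUT"), d$score, method = "spearman")

c(script = rho_script, wt_high = rho_wt_high, mut_high = rho_mut_high)
```

With the alphabetical coding, a positive value means scores tend to be higher in WT samples and a negative value means they tend to be higher in MUT samples.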
Instead, you can check whether the distribution of pathway scores in the WT samples differs significantly from that in the MUT samples using the Wilcoxon rank-sum test (wilcox.test in R), which gives a p-value.
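A minimal sketch of that comparison, run once per pathway column. It assumes one gene column (AGTR2) coded as MUT/WT and that every other column is a numeric pathway score. The toy data below are randomly generated stand-ins for Data.txt, not real results.

```r
# Toy data standing in for Data.txt (values are random, for illustration only)
set.seed(1)
Col_data <- data.frame(
  AGTR2 = rep(c("MUT", "WT"), each = 10),
  Alanine.and.aspartate.metabolism = rnorm(20),
  Amino.and.nucleotide.sugar.metabolism = rnorm(20)
)

gene <- "AGTR2"
# Assumption: all remaining columns are pathway scores
pathways <- setdiff(names(Col_data), gene)

# One Wilcoxon rank-sum test per pathway: MUT scores vs. WT scores
p_values <- sapply(pathways, function(p) {
  mut <- Col_data[[p]][Col_data[[gene]] == "MUT"]
  wt  <- Col_data[[p]][Col_data[[gene]] == "WT"]
  wilcox.test(mut, wt)$p.value
})

p_values
```

With real data, replace the toy data frame with the read.table() call from the question; if the sheet holds several gene columns, the same loop can be repeated for each gene.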