I am currently trying to write code in R to analyse the relationship between DNA methylation values at probe sites and gene expression levels. I have DNA methylation data generated from the Illumina EPIC array for 50 patients. I also have these patients' normalized RNA-seq gene expression levels in VST form. I have 2 dataframes:
DF 1 contains the beta values for all the methylation probes for all 50 patients. Patients are columns and probes are rows
DF 2 contains the VST data for all 50 patients - patients are columns and genes are rows
I would like to test the correlation (just using Spearman's rank correlation) between each individual probe's methylation value and each individual gene's expression level across all the patients.
I hope to produce a matrix of R values for each probe/gene combination which I can then use to plot either a volcano plot or MA plot.
I am very new to R and therefore I don't know how to code something like this. The reason I am doing every single correlation is that I am hoping this catch-all approach might reveal previously unknown relationships.
Hi, thank you very much for your quick response! I have tried out this code but unfortunately I keep getting the error:
Error in cor.test.default(tmp$meth, tmp$exp, method = "spearman") : 'y' must be a numeric vector
The tmp dataframe looks like this:
The call res <- cor.test(tmp$meth, tmp$exp, method = "spearman") therefore doesn't work, because tmp$exp isn't one column but many columns. The tmp dataframe has pulled the gene expression levels for every sample for this particular gene and put them into separate columns instead of putting each value into a separate row.
What should I do in this instance?
Oh I see. To make the code work, I modified it a bit. There is no need to make the tmp data frame anymore, and I also added as.numeric() where pr and ge are defined. Not having this numeric conversion was the reason that you got the error "Error in cor.test.default(tmp$meth, tmp$exp, method = "spearman") : 'y' must be a numeric vector".
This time I used simulated data frames DF1 and DF2 to replicate your scenario. See the updated reply. If you find my reply helpful, please upvote, and if that solves your problem, please accept it.
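For anyone reading along, here is a minimal sketch of the as.numeric() fix described above. It is not the exact code from the updated reply: the simulated values, the small sizes (5 probes, 3 genes), the probe and gene names, and the rho, pval and rho_fast objects are all assumptions made for illustration. Only DF1, DF2, pr and ge follow the names used in this thread.

```r
set.seed(1)  # simulated data, for illustration only
n_patients <- 50
patients <- paste0("P", seq_len(n_patients))

# DF1: beta values, probes in rows, patients in columns
DF1 <- as.data.frame(matrix(runif(5 * n_patients), nrow = 5,
                            dimnames = list(paste0("cg", 1:5), patients)))
# DF2: VST expression, genes in rows, patients in columns
DF2 <- as.data.frame(matrix(rnorm(3 * n_patients, mean = 8), nrow = 3,
                            dimnames = list(paste0("gene", 1:3), patients)))

# Make sure both data frames list the patients in the same order
DF2 <- DF2[, colnames(DF1)]

# One row of a data frame is still a data frame (1 row x 50 columns),
# which is why cor.test() complained that 'y' must be a numeric vector
stopifnot(is.data.frame(DF2["gene1", ]))

# Empty probe-by-gene matrices for the coefficients and p-values
rho  <- matrix(NA_real_, nrow(DF1), nrow(DF2),
               dimnames = list(rownames(DF1), rownames(DF2)))
pval <- rho

for (p in rownames(DF1)) {
  pr <- as.numeric(DF1[p, ])      # probe beta values as a plain numeric vector
  for (g in rownames(DF2)) {
    ge <- as.numeric(DF2[g, ])    # gene expression as a plain numeric vector
    res <- cor.test(pr, ge, method = "spearman")
    rho[p, g]  <- res$estimate
    pval[p, g] <- res$p.value
  }
}

# If only the coefficients are needed, cor() on the transposed data
# returns the same probe-by-gene matrix without the loop
rho_fast <- cor(t(DF1), t(DF2), method = "spearman")
```

The loop keeps the p-values, which a volcano-style plot would need, while cor() is much faster when only the correlation coefficients matter. With a full EPIC-sized probe set the complete probe-by-gene matrix can get very large, so it may be necessary to work through the probes in blocks.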