6 months ago
Researcher
Hi all, I have a query. I am trying to perform a correlation analysis between the methylation values and expression values of a gene. I want a gene-specific correlation coefficient (R) to understand the relationship. Should I construct my matrix with gene name, methylation value, and expression value and compute the correlation using the cor() function in R, or how should I construct the correlation matrix?
How many methylation and expression values do you have per gene? If per gene you have a table named gene_corr with two columns, methylation_values and expression_values, then you could indeed just do:
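A minimal sketch of that call, assuming gene_corr holds paired measurements of a single gene across several samples. The numbers below are made up for illustration; only the table and column names come from the description above.

```r
# Hypothetical data: paired measurements of ONE gene across 6 samples.
gene_corr <- data.frame(
  methylation_values = c(0.82, 0.75, 0.60, 0.41, 0.33, 0.20),
  expression_values  = c(1.1, 1.9, 2.8, 4.2, 5.0, 6.3)
)

# Pearson correlation between the two columns (base R `cor`).
r <- cor(gene_corr$methylation_values, gene_corr$expression_values)

# `cor.test` additionally reports a p-value and a confidence interval.
ct <- cor.test(gene_corr$methylation_values, gene_corr$expression_values)

print(r)
print(ct$p.value)
```

Passing method = "spearman" to cor() gives a rank-based coefficient instead, which may suit methylation data that is not linearly related to expression.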
As a sanity check, if the dimension of gene_corr differs per gene, you might want to do resampling to obtain a similar number of observations per gene. For more information on resampling - you could have a look here
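One possible way to do such resampling is to downsample every gene's table to the size of the smallest one before correlating. This is only a sketch under assumptions: the list gene_tables, the gene names, and the random data are hypothetical, and other resampling schemes (for example bootstrapping) may fit better.

```r
set.seed(1)  # reproducible subsampling

# Hypothetical input: one data frame per gene, with differing numbers of rows.
gene_tables <- list(
  geneA = data.frame(methylation_values = runif(12), expression_values = rnorm(12)),
  geneB = data.frame(methylation_values = runif(8),  expression_values = rnorm(8)),
  geneC = data.frame(methylation_values = runif(20), expression_values = rnorm(20))
)

# Downsample every table to the smallest observed size (without replacement).
n_min <- min(sapply(gene_tables, nrow))
resampled <- lapply(gene_tables, function(df) df[sample(nrow(df), n_min), ])

# One correlation per gene, each computed on n_min observations.
per_gene_r <- sapply(resampled, function(df) {
  cor(df$methylation_values, df$expression_values)
})
print(per_gene_r)
```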
There is only one value of methylation and expression per gene. I did run cor(gene_corr$methylation_values, gene_corr$expression_values), but this would just give the average correlation and not the gene-specific methylation and expression correlation, right?
If your gene_corr data frame contains one gene instead of all genes, then the correlation would be per gene. If your gene_corr data frame contains all genes then you could do something like this:
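A minimal sketch of a row-wise correlation, assuming methylation and expression are stored as two genes-by-samples matrices with matching row and column order. The names meth and expr, the gene and sample labels, and the random data are all hypothetical.

```r
set.seed(42)
genes   <- c("geneA", "geneB", "geneC")
samples <- paste0("sample", 1:10)

# Hypothetical genes x samples matrices with matching row and column order.
meth <- matrix(runif(30), nrow = 3, dimnames = list(genes, samples))
expr <- matrix(rnorm(30), nrow = 3, dimnames = list(genes, samples))

# Correlate row i of `meth` with row i of `expr`: one coefficient per gene.
per_gene_r <- sapply(genes, function(g) cor(meth[g, ], expr[g, ]))
print(per_gene_r)
```

Each coefficient here is computed across samples for one gene. With only a single methylation value and a single expression value per gene, a per-gene coefficient is undefined, because one pair of numbers carries no variation to correlate.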
Now every row (presumably corresponding to one gene) would have an individual correlation coefficient.
Thank you so much. I ran the code above and found that all genes have the same value, which is the one obtained as the average correlation.