6.0 years ago
vinayjrao
Hello,
I want to look at the methylation pattern of the promoter region of a particular gene across different cancers, using TCGA data obtained from cBioPortal.
I want to know whether the data (beta values) can be plotted directly, or whether I have to normalize them in some way first.
Thanks in advance.
Thanks a lot, Kevin. Can I directly correlate the methylation status with expression levels?
For the same gene, I am looking at the expression level and the methylation status. Suppose the median RSEM value is around 13000 in breast carcinoma and 20000 in glioblastoma, and the corresponding methylation values range from 0 to 1, as you mentioned. Can these data be correlated directly (for example, to ask whether methylation is higher in breast cancer), or do I need to process the files in some way first?
Thanks again.
Yes, you can correlate anything with anything, but a correlation will not reveal the underlying mechanism at play. I would not worry too much about the differences in scale between the expression data and the methylation data. Just ensure that the expression datasets are normalised in the same way. If using RSEM, it may help to log-transform the values so that their distribution is closer to normal.
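As a rough illustration of the point about scale, here is a minimal sketch in plain Python. The sample values are made up for the example (they are not TCGA data), and the helper names `log2_rsem`, `min_max` and `pearson` are hypothetical. It log-transforms RSEM values with a pseudocount of 1 (an assumed, commonly used choice) and shows that rescaling expression to the 0 to 1 range leaves the Pearson correlation with the beta values unchanged.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log2_rsem(values, pseudocount=1.0):
    """log2(RSEM + pseudocount); the pseudocount avoids log(0)."""
    return [math.log2(v + pseudocount) for v in values]


def min_max(values):
    """Linearly rescale values to the range 0 to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical per-sample values for one gene (NOT real TCGA data).
rsem = [13000.0, 20000.0, 8000.0, 15500.0, 30000.0, 5000.0]
beta = [0.42, 0.25, 0.61, 0.38, 0.15, 0.72]

log_expr = log2_rsem(rsem)

# Correlation on the log scale, and again after 0-1 rescaling.
r_log = pearson(log_expr, beta)
r_scaled = pearson(min_max(log_expr), beta)

print(f"r (log2 RSEM vs beta):          {r_log:.4f}")
print(f"r (rescaled log2 RSEM vs beta): {r_scaled:.4f}")
```

The two coefficients agree because Pearson correlation is unaffected by a positive linear rescaling of either variable. The log transform is not linear, so it can change the coefficient relative to raw RSEM values.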
There are some previous threads:
Slightly related but similar idea:
Another possibility, which is preferable, is to build linear regression models between each gene's expression and the methylation probes that map to it (one-to-many). From these models, you can easily derive both a p-value and an R-squared value, along with many other things.
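A minimal sketch of the one-to-many idea, again in plain Python with made-up numbers: one gene's log2 expression is regressed on each probe separately, giving a slope and R-squared per probe. The probe names and values are hypothetical. To stay within the standard library, the p-value here comes from an exact permutation test on R-squared, which is a stand-in for the usual t-test p-value that a statistics package would report, and is only practical for a handful of samples.

```python
import itertools
import math


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns slope, intercept, R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def permutation_p_value(x, y):
    """Exact permutation p-value: fraction of orderings of y whose R^2
    is at least as large as the observed one (small sample sizes only)."""
    observed = fit_line(x, y)[2]
    hits = 0
    total = 0
    for shuffled in itertools.permutations(y):
        total += 1
        # Small tolerance so the observed ordering always counts itself.
        if fit_line(x, shuffled)[2] >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical data for one gene across six samples (NOT real TCGA data).
rsem = [13000.0, 20000.0, 8000.0, 15500.0, 30000.0, 5000.0]
log_expr = [math.log2(v + 1.0) for v in rsem]

# Hypothetical probes mapping to the gene's promoter (one gene, many probes).
probes = {
    "probe_A": [0.42, 0.25, 0.61, 0.38, 0.15, 0.72],
    "probe_B": [0.50, 0.55, 0.48, 0.60, 0.52, 0.47],
}

# One model per probe: expression ~ methylation.
results = {}
for name, beta in probes.items():
    slope, intercept, r_squared = fit_line(beta, log_expr)
    p_value = permutation_p_value(beta, log_expr)
    results[name] = (slope, r_squared, p_value)
    print(f"{name}: slope={slope:.3f}, R^2={r_squared:.3f}, p={p_value:.4f}")
```

With real data you would have many more samples, where enumerating permutations is infeasible; the standard regression p-value from a statistics package is the practical choice there. Testing many probes also calls for multiple-testing correction.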
Thanks a lot, Kevin. You have been extremely helpful.
I have just one more question. Is it acceptable to correlate expression and methylation by computing and plotting a Pearson correlation coefficient (PCC)? If so, should the RSEM counts first be rescaled to the range 0 to 1, or should the raw values be used?