Correlation between gene expression and methylation
6.4 years ago
tujuchuanli ▴ 130

I have a list of genes and want to test whether the expression level of the genes in this list correlates with their DNA methylation level. I want to test this hypothesis in the TCGA breast cancer data. Below is my plan.

Plan A:

  1. Extract the expression and methylation level for each gene in my list. Expression can be defined as RPKM from RNA-seq data, and methylation level as the value of the probe in the promoter region of the gene (from -3 kb to +500 bp around the TSS; if there are multiple probes in this region, I would average their values to get the final methylation level for the gene).
  2. Calculate the correlation between the two (e.g. the Pearson correlation coefficient). If the P-value is significant, I can say that there is a significant correlation between expression and methylation.
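
For a single gene, the two steps above could be sketched roughly as follows. This is only an illustration using the Python standard library: the toy values and the function names (`pearson`, `promoter_methylation`, `permutation_pvalue`) are made up for the example, and the P-value comes from a permutation test rather than the usual t-test, simply to avoid extra dependencies.

```python
import math
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def promoter_methylation(probe_betas):
    """Average several promoter probes into one value per sample.

    probe_betas: list of per-probe lists, each holding one beta value per sample.
    """
    return [mean(sample_values) for sample_values in zip(*probe_betas)]

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation P-value for the Pearson correlation."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data for one gene across six samples (not real TCGA values).
expression = [12.0, 9.5, 7.1, 5.2, 3.9, 2.4]        # RPKM
probes = [
    [0.10, 0.22, 0.35, 0.51, 0.64, 0.80],           # promoter probe 1, beta values
    [0.14, 0.18, 0.41, 0.47, 0.70, 0.76],           # promoter probe 2, beta values
]

methylation = promoter_methylation(probes)          # step 1: one value per sample
r = pearson(expression, methylation)                # step 2: correlation
p = permutation_pvalue(expression, methylation)     # step 2: significance
print(f"r = {r:.3f}, permutation P = {p:.4f}")
```

With real data the same calculation would be repeated for every gene in the list, and the resulting P-values would need a multiple-testing correction.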

Plan B:

  1. Calculate the Z-score of gene expression for each gene (Z-score = (value - mean of normal samples) / SD of normal samples).
  2. Calculate the Z-score of the methylation level for each gene (Z-score = (value - mean of normal samples) / SD of normal samples), using probes from -3 kb to +500 bp around the TSS. If there are multiple probes in this region, I would average their values first and then calculate the Z-score.
  3. Calculate the correlation coefficient as mentioned above.

Which would be better? If you have suggestions, please tell me.

Thanks

tcga gene expression DNA methylation • 4.1k views

Standardizing the data (i.e. z-score transformation) is a linear transformation, and Pearson's correlation is unaffected by linear transformations of the variables (as long as the slope is positive), so you'll get the same result whether you use the raw data or the standardized data.
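
A quick way to convince yourself of this is to compute both versions on toy numbers. The sketch below uses made-up values and hypothetical helper names (`pearson`, `zscore_vs_normal`), with the z-score defined relative to normal samples as in the question. It also shows, for contrast, that a nonlinear transformation such as a log does change the coefficient.

```python
import math
from statistics import mean, stdev

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def zscore_vs_normal(tumor, normal):
    """(value - mean of normal samples) / SD of normal samples."""
    m, s = mean(normal), stdev(normal)
    return [(v - m) / s for v in tumor]

# Hypothetical values for one gene (not real TCGA data).
tumor_expr = [1.0, 2.0, 4.0, 8.0, 16.0]
tumor_meth = [0.9, 0.7, 0.5, 0.3, 0.1]
normal_expr = [3.0, 5.0, 4.0, 6.0]
normal_meth = [0.40, 0.50, 0.45, 0.55]

r_raw = pearson(tumor_expr, tumor_meth)
r_z = pearson(
    zscore_vs_normal(tumor_expr, normal_expr),
    zscore_vs_normal(tumor_meth, normal_meth),
)
# A log transform is nonlinear, so the invariance does not apply to it.
r_log = pearson([math.log2(v + 1) for v in tumor_expr], tumor_meth)

print(f"raw: {r_raw:.6f}  z-scored: {r_z:.6f}  log2(expr + 1): {r_log:.6f}")
```

The raw and z-scored coefficients agree up to floating-point rounding, while the log-transformed one differs.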


Do you really need this on a global level or would per-gene comparisons work? That'd be much more meaningful.


Hi Devon,

I am working on an analysis similar to tujuchuanli's. Would you mind explaining what you meant by per-gene comparisons, and what that would look like?


It's more likely that there's a coherent relationship between methylation and gene expression if one looks at individual genes rather than globally, since the relationship (think slope) won't be the same between genes. Pooling all genes, you'll probably be left with a big blob of dots and no way to coherently fit things.
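
As a deliberately exaggerated illustration of that point, the sketch below uses two made-up genes (`GENE_A` and `GENE_B` are hypothetical names, and the numbers are not real data). Each gene has a perfect negative relationship between methylation and expression, but their baselines differ, so pooling all the points gives a misleading global coefficient.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-sample values: (methylation, expression) for each gene.
genes = {
    "GENE_A": ([0.1, 0.2, 0.3, 0.4], [40.0, 30.0, 20.0, 10.0]),
    "GENE_B": ([0.6, 0.7, 0.8, 0.9], [90.0, 80.0, 70.0, 60.0]),
}

# Per-gene comparison: one coefficient per gene, computed across samples.
per_gene = {name: pearson(meth, expr) for name, (meth, expr) in genes.items()}

# Global comparison: throw every (methylation, expression) point into one pool.
pooled_meth = [v for meth, _ in genes.values() for v in meth]
pooled_expr = [v for _, expr in genes.values() for v in expr]
pooled = pearson(pooled_meth, pooled_expr)

for name, r in per_gene.items():
    print(f"{name}: r = {r:.3f}")
print(f"pooled: r = {pooled:.3f}")
```

Here both per-gene coefficients are strongly negative while the pooled one is positive. Real data will rarely be this clean, but the per-gene numbers are the ones that answer the question of whether a given gene's expression tracks its methylation.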


Yes, I need this. What I am talking about is whether the expression level of the genes in my list could be controlled by DNA methylation level. This is the only way I know of (I know it from reading papers, where it is shown as a scatter plot). If you know a better way, please tell me. Thanks

6.4 years ago
pbpanigrahi ▴ 430

As stated by Jean, correlation is independent of scale. Whether you normalize the data or apply any other linear transformation, the correlation will be the same.

One suggestion: instead of simply averaging the intensities of all methylation probes for a given gene (i.e. one gene, one methylation level), you can cluster probes based on distance and intensity values (the MethylMix package has a probe clustering function, ClusterProbes), so you can have more than one cluster of probes per gene. Then, if you don't find a correlation with respect to one cluster, you may still see a correlation with respect to another cluster.
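
To give a rough idea of what this looks like, here is a minimal stand-in written in plain Python. It is not MethylMix's ClusterProbes and does not reproduce its algorithm: it simply groups probes greedily by the correlation of their values (ignoring genomic distance), then correlates each cluster's mean with expression. The probe names, the 0.8 threshold and all values are assumptions made for the example.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cluster_probes(probes, threshold=0.8):
    """Greedy grouping: a probe joins the first cluster whose seed probe
    (the cluster's first member) it correlates with at or above the threshold."""
    clusters = []
    for name, values in probes.items():
        for cluster in clusters:
            if pearson(values, probes[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])  # no match: start a new cluster
    return clusters

# Hypothetical data for one gene across six samples (not real TCGA values).
expression = [10.0, 8.0, 6.0, 5.0, 3.0, 1.0]
probes = {
    "p1": [0.50, 0.20, 0.60, 0.30, 0.55, 0.25],
    "p2": [0.52, 0.25, 0.58, 0.28, 0.60, 0.20],
    "p3": [0.10, 0.25, 0.40, 0.50, 0.70, 0.90],
}

clusters = cluster_probes(probes)
cluster_r = []
for cluster in clusters:
    # One methylation value per sample: the mean of the cluster's probes.
    cluster_mean = [mean(vals) for vals in zip(*(probes[p] for p in cluster))]
    cluster_r.append(pearson(expression, cluster_mean))

for cluster, r in zip(clusters, cluster_r):
    print(cluster, f"r = {r:.3f}")
```

In this toy case the probes split into two clusters, and only one of them tracks expression closely, which is exactly the signal that averaging all three probes together would dilute.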

Explore this thread
