Methylation-Gene expression correlation
4.3 years ago

Dear all,

I am trying to perform methylation-gene expression correlation analysis in R.

After removing probes that are not mapped to any gene symbol, around 300,000 probes remain.

With this many probes and 280 samples, the correlation analysis produces a huge adjacency matrix that cannot be opened in Excel.

Since several probes correspond to each gene in the methylation data, I have to compute all-against-all pairwise correlations and then filter the results.

I found a way to keep only the significant negative correlations, but after a couple of hours it is still running. I have already increased the memory available to R with the memory.limit() function.

Is there any way to do this on my laptop (16 GB RAM)? I do not have access to a compute server right now.

I would appreciate any help.

Nazanin

DNA methylation Gene expression Correlation • 1.5k views
4.2 years ago

Hey again,

Perhaps you could consider performing the analysis as a 'sliding window' across each chromosome? For example, perform the correlation in 5-megabase 'windows', and then move the window by 4 megabases each time, so that consecutive windows overlap by 1 megabase.
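A minimal sketch of the windowing idea, written in Python for illustration (the same logic carries over to R). The function names, the toy probe and gene coordinates, and the rule of assigning each probe or gene to a window by a single position are all assumptions made for this example, not part of any package. Pairs that fall into two overlapping windows are computed only once.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)

def windows(chrom_length, size=5_000_000, step=4_000_000):
    """Yield half-open (start, end) windows; neighbours overlap by size - step."""
    start = 0
    while start < chrom_length:
        yield start, min(start + size, chrom_length)
        start += step

def correlate_in_windows(probes, genes, chrom_length):
    """probes / genes: dict name -> (position, per-sample values).
    Returns {(probe, gene): r} for pairs that share at least one window."""
    results = {}
    for start, end in windows(chrom_length):
        p_in = [(n, v) for n, (pos, v) in probes.items() if start <= pos < end]
        g_in = [(n, v) for n, (pos, v) in genes.items() if start <= pos < end]
        for p_name, p_vals in p_in:
            for g_name, g_vals in g_in:
                if (p_name, g_name) not in results:  # skip pairs seen in the overlap
                    results[(p_name, g_name)] = pearson(p_vals, g_vals)
    return results

# Hypothetical toy data: one 12 Mb chromosome, 4 samples.
probes = {"cg_a": (1_000_000, [0.1, 0.2, 0.3, 0.4]),
          "cg_b": (8_500_000, [0.9, 0.7, 0.5, 0.3])}
genes = {"GENE1": (1_200_000, [8.0, 6.0, 4.0, 2.0]),
         "GENE2": (8_700_000, [1.0, 2.0, 3.0, 4.0])}
results = correlate_in_windows(probes, genes, chrom_length=12_000_000)
```

Only probe-gene pairs that share a window are correlated, so the number of coefficients depends on how many probes and genes sit near each other rather than on the full probes × genes matrix.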

Kevin


Hi Kevin, I managed to solve the problem in R a few weeks ago; however, I found only one significant negative correlation! It is strange, isn't it?


Yes, it would seem strange, as methylation is supposed to decrease gene expression (?)


I used log-transformed htseq-count data for gene expression and the methylation beta values of the corresponding genes to compute Pearson correlations in R. Then I used r <= -0.5 to select significant negative correlations. When I had previously compared the deregulated genes with the differentially methylated genes, I found a few genes with an inverse relationship (up-regulated and hypomethylated, or down-regulated and hypermethylated), but I could not find any of those genes in the correlation analysis!
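The filtering step described here can be sketched as follows, in Python for illustration. It assumes a probe-to-gene mapping taken from the array annotation and correlates each probe only with its own gene, which gives one coefficient per probe instead of an all-against-all matrix. All names and values below are hypothetical toy data, and the -0.5 cut-off is the one mentioned above.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)

def negative_pairs(beta, log_expr, probe_to_gene, cutoff=-0.5):
    """Return (probe, gene, r) for annotated pairs with r <= cutoff, strongest first."""
    hits = []
    for probe, gene in probe_to_gene.items():
        if probe not in beta or gene not in log_expr:
            continue  # probe or gene missing from one of the data sets
        r = pearson(beta[probe], log_expr[gene])
        if r <= cutoff:  # a nan comparison is False, so constant rows are dropped
            hits.append((probe, gene, r))
    return sorted(hits, key=lambda h: h[2])

# Hypothetical toy data: 3 probes, 2 genes, 4 samples in the same order.
beta = {"cg01": [0.1, 0.3, 0.5, 0.9],
        "cg02": [0.2, 0.4, 0.6, 0.8],
        "cg03": [0.5, 0.5, 0.5, 0.5]}
log_expr = {"GENE1": [9.0, 7.0, 6.0, 2.0],
            "GENE2": [1.0, 2.0, 3.5, 4.0]}
probe_to_gene = {"cg01": "GENE1", "cg02": "GENE2", "cg03": "GENE1"}

hits = negative_pairs(beta, log_expr, probe_to_gene)
```

Both tables must list the samples in the same order; a mismatch in sample order is one common reason for correlations that look unexpectedly weak.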


Maybe encode the methylation as binary (methylated|not methylated) and then do binary logistic regression with each gene?
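As a rough sketch of what that could look like, in Python for illustration: methylation is dichotomised at a beta threshold and the binary status is regressed on the expression of one gene. The 0.5 threshold, the choice of methylation status as the response, the function names, and the toy values are all assumptions for this example. The model is fitted with plain Newton-Raphson and would not converge if the two groups were perfectly separated by expression.

```python
from math import exp

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)

def fit_logistic(x, y, iters=25, tol=1e-8):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), 2 x 2.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            break  # (near-)singular, e.g. under perfect separation
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Hypothetical toy data for one probe-gene pair across 8 samples.
expr = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]       # normalised log expression
beta = [0.9, 0.8, 0.7, 0.2, 0.85, 0.1, 0.3, 0.15]     # methylation beta values
status = [1 if b >= 0.5 else 0 for b in beta]         # assumed threshold of 0.5

b0, b1 = fit_logistic(expr, status)
```

A negative slope `b1` means that higher expression goes with lower odds of the probe being called methylated, which is the direction one would be screening for here.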

You normalised the HTSeq counts, correct?
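To illustrate why this matters, here is a minimal sketch of a library-size normalisation, log2(CPM + 1), in Python. It is only an assumed stand-in for whatever normalisation was actually used (in R one would more commonly rely on a dedicated package such as DESeq2 or edgeR); the function name and the counts are hypothetical. Without such scaling, a log transform of raw counts still carries sequencing-depth differences between samples, which can mask a real correlation.

```python
from math import log2

def log_cpm(counts, pseudocount=1.0):
    """counts: dict gene -> list of raw counts, one per sample.
    Returns dict gene -> list of log2(counts-per-million + pseudocount).
    Assumes every sample has a non-zero total count."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(c[j] for c in counts.values()) for j in range(n_samples)]
    return {
        gene: [log2(c[j] / lib_sizes[j] * 1e6 + pseudocount) for j in range(n_samples)]
        for gene, c in counts.items()
    }

# Hypothetical toy counts: the second sample was sequenced twice as deeply.
raw = {"GENE1": [100, 200], "GENE2": [900, 1800]}
normalised = log_cpm(raw)
```

In the toy data both samples have the same composition, so after scaling each gene gets the same value in both samples even though the raw counts differ by a factor of two.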


I am not familiar with this, but I will try to find a way to do binary logistic regression.
