collapse multiple CpG sites for a gene
9.5 years ago
Ezhil La ▴ 40

Hi,

I am working with 450K BeadChip methylation arrays. Each gene has multiple CpG sites, and I would like to know of better ways to collapse multiple probes into a single value representing a gene. Could you please suggest some methods, and software that implements this step?

In gene expression arrays, I normally select a probe with high variance (or sometimes average the probes) to represent a gene.

Thanks in advance.

Kind regards,

Ezhil

genome • 2.7k views

A gene-level expression estimate makes some sense; a gene-level methylation metric less so. A better question to ask yourself is whether you really do want to summarize over whole genes (hint: you probably don't).

Probably not, but I am not sure what the correct approach is. I thought of averaging all probes within 200 kb of the transcription start site (TSS) to get a gene-level methylation value. Obviously 200 kb is arbitrary, and the assumption that regions close to the TSS matter more than other gene regions made me look for alternative approaches.
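
The TSS-window averaging described above could be sketched roughly as follows. This is only an illustration in plain Python: the function name `gene_level_mean`, the dictionary layouts, and the toy probe IDs and coordinates are all assumptions made for the example, not part of any 450K annotation package.

```python
from collections import defaultdict
from statistics import mean


def gene_level_mean(probes, tss, betas, window=200_000):
    """Average beta values of probes within `window` bp of each gene's TSS.

    probes: {probe_id: (chrom, position)}      (hypothetical layout)
    tss:    {gene: (chrom, tss_position)}      (hypothetical layout)
    betas:  {probe_id: [beta value per sample]}
    Returns {gene: [mean beta per sample]}; genes with no probe in range are omitted.
    """
    # Group probes by chromosome so each gene only scans its own chromosome.
    by_chrom = defaultdict(list)
    for pid, (chrom, pos) in probes.items():
        if pid in betas:
            by_chrom[chrom].append((pos, pid))

    result = {}
    for gene, (chrom, tss_pos) in tss.items():
        nearby = [pid for pos, pid in by_chrom[chrom] if abs(pos - tss_pos) <= window]
        if not nearby:
            continue
        # zip(*...) turns per-probe sample lists into per-sample tuples of probes.
        result[gene] = [mean(vals) for vals in zip(*(betas[p] for p in nearby))]
    return result


# Toy data, invented purely for illustration.
probes = {"cg01": ("chr1", 1000), "cg02": ("chr1", 1500),
          "cg03": ("chr1", 900000), "cg04": ("chr2", 1200)}
tss = {"GENE_A": ("chr1", 1200), "GENE_B": ("chr2", 500000)}
betas = {"cg01": [0.2, 0.4], "cg02": [0.4, 0.8],
         "cg03": [0.9, 0.9], "cg04": [0.5, 0.5]}

summary = gene_level_mean(probes, tss, betas)
```

Here `GENE_A` is summarized from `cg01` and `cg02` only, and `GENE_B` is dropped because its single probe lies outside the window. The `window` argument makes the arbitrariness of the 200 kb cutoff explicit, so it is easy to check how sensitive the summary is to that choice.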

Why not select, for each gene, the probe with the highest variance?
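
A minimal sketch of that selection, assuming a probe-to-gene mapping is already available (the names `top_variance_probe`, `gene_to_probes`, and the toy values are hypothetical). Note that the variance of beta values tends to depend on the methylation level itself, so which probe "wins" may differ if M-values are used instead.

```python
from statistics import pvariance


def top_variance_probe(gene_to_probes, betas):
    """Return {gene: probe_id}, picking the probe with the largest variance across samples.

    gene_to_probes: {gene: [probe_id, ...]}     (hypothetical layout)
    betas:          {probe_id: [value per sample]}
    Genes whose probes have no measurements are omitted.
    """
    chosen = {}
    for gene, probe_ids in gene_to_probes.items():
        candidates = [p for p in probe_ids if betas.get(p)]
        if candidates:
            chosen[gene] = max(candidates, key=lambda p: pvariance(betas[p]))
    return chosen


# Toy data, invented purely for illustration.
gene_to_probes = {"GENE_A": ["cg01", "cg02"], "GENE_B": ["cg03"]}
betas = {"cg01": [0.5, 0.5, 0.5], "cg02": [0.1, 0.5, 0.9], "cg03": [0.2, 0.3, 0.4]}

representative = top_variance_probe(gene_to_probes, betas)
```

With this toy input, `GENE_A` is represented by `cg02` (the only probe that varies) and `GENE_B` by its single probe `cg03`.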

Here's what I do, in a nutshell: for each methylation site, link it to its nearest neighbouring gene and to its 2 nearest methylation sites in a Cytoscape network; import the site-specific methylation p-values; run jActiveModules. However, I disagree that a gene-level summary of the methylation data is of no biological value (and I certainly wouldn't cherry-pick the highest-variance probe; there is no reason to introduce a bias).
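
The network-building step of this workflow might look something like the sketch below. It only produces an edge list; the p-value import and the jActiveModules run happen inside Cytoscape and are not shown. The function name `build_edges`, the edge labels, and the use of distance to the TSS as the meaning of "nearest gene" are assumptions made for illustration, not necessarily what the commenter does.

```python
def build_edges(sites, genes, k=2):
    """Link each site to its nearest gene and its k nearest sites on the same chromosome.

    sites: {site_id: (chrom, position)}   (hypothetical layout)
    genes: {gene: (chrom, tss_position)}  (assumption: distance is measured to the TSS)
    Returns a sorted list of (source, interaction, target) tuples without duplicates.
    """
    edges = set()
    for sid, (chrom, pos) in sites.items():
        # Nearest gene on the same chromosome, if any.
        gene_dists = [(abs(t - pos), g) for g, (c, t) in genes.items() if c == chrom]
        if gene_dists:
            edges.add((sid, "site-gene", min(gene_dists)[1]))

        # k nearest other sites on the same chromosome.
        neighbours = sorted(
            (abs(p - pos), other)
            for other, (c, p) in sites.items()
            if c == chrom and other != sid
        )
        for _, other in neighbours[:k]:
            a, b = sorted((sid, other))  # store undirected edges once
            edges.add((a, "site-site", b))
    return sorted(edges)


# Toy data, invented purely for illustration.
sites = {"cg01": ("chr1", 100), "cg02": ("chr1", 200),
         "cg03": ("chr1", 1000), "cg04": ("chr1", 5000)}
genes = {"G1": ("chr1", 150), "G2": ("chr1", 4000)}

for source, interaction, target in build_edges(sites, genes):
    print(source, interaction, target, sep="\t")
```

The tab-separated source/interaction/target lines can then be loaded into Cytoscape as a network. The neighbour search here is quadratic in the number of sites, which is fine for a toy example but would need sorted positions and a binary search for a full 450K array.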
