Question

A workflow for identification of differentially methylated regions, starting with a data frame of beta values?

7.3 years ago

c.ryder3 ▴ 40

Hello! I have a data frame in R that contains 450K methylation beta values for 6 samples. The probe IDs are the row names and the sample names are the column names. It looks like this:

> head(ICGC_2)
             naive.1   memoryCS.1   naive.2    memoryCS.2  naive.3    memoryCS.3
cg00000029  0.6199970   0.5703951  0.6383819   0.5831206  0.7012571  0.6000816
cg00000108  0.9083578   0.9105157  0.9030611   0.9103147  0.9115842  0.8947593
cg00000109  0.8694214   0.7525098  0.8478160   0.7725212  0.8645145  0.7636347
cg00000165  0.1911901   0.3050081  0.1810569   0.3750369  0.2250429  0.3094155
cg00000236  0.8666489   0.8382011  0.8586420   0.8369283  0.8860430  0.8439371
cg00000289  0.6653662   0.5512665  0.5815338   0.4773868  0.6254710  0.5408634

I would like to compare the naive samples to the memoryCS samples to identify genomic regions that are differentially methylated between the two groups. Can anyone suggest a workflow that will allow me to do this, with this data as the starting point? I'm aware of the DMRcate package, which includes the dmrcate function for identifying differentially methylated regions (DMRs), but this function requires an annotation object generated by cpg.annotate. cpg.annotate requires a matrix of M values, which I believe I can generate from my data frame, but it also requires a study design matrix, which I don't know how to generate. Can anyone offer me some guidance?

Thank you!

R bioconductor 450K methylation DMRcate

updated 7.3 years ago by halo22 ▴ 300 • written 7.3 years ago by c.ryder3 ▴ 40

minfi, ChAMP and RnBeads are some suggestions.

7.3 years ago by halo22 ▴ 300

score 0 · Answer 1 · 2017-08-18

You can try minfi (https://www.bioconductor.org/help/course-materials/2014/BioC2014/minfi_BioC2014.pdf); it has a good workflow. But even with minfi, finding DMRs requires you to define a design matrix. Honestly, I would advise spending some time studying the design matrix. It is essential because it tells the model which samples belong to which group, and so guides the comparison (naive vs memoryCS) when the model is fitted. Try the following tutorial, or spend time learning about limma: http://bioinf.wehi.edu.au/marray/ibc2004/lab3/lab3.html#EstrogenDesignMatrix
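
To make that concrete, here is a minimal base-R sketch of the two inputs mentioned in the question: M values and a design matrix. It is only an illustration under some assumptions: the toy data frame re-types the first three rows of the posted head(ICGC_2), the group labels are derived from the column names (everything before the trailing ".1", ".2", ".3"), and naive is taken as the reference group. The object names m_values, group and design are my own, not anything required by a package.

```r
# Toy beta values: first three rows of head(ICGC_2) from the question.
# With real data you would skip this step and use your own ICGC_2.
ICGC_2 <- data.frame(
  naive.1    = c(0.6199970, 0.9083578, 0.8694214),
  memoryCS.1 = c(0.5703951, 0.9105157, 0.7525098),
  naive.2    = c(0.6383819, 0.9030611, 0.8478160),
  memoryCS.2 = c(0.5831206, 0.9103147, 0.7725212),
  naive.3    = c(0.7012571, 0.9115842, 0.8645145),
  memoryCS.3 = c(0.6000816, 0.8947593, 0.7636347),
  row.names  = c("cg00000029", "cg00000108", "cg00000109")
)

# Beta -> M values: M = log2(beta / (1 - beta)).
# Note: betas of exactly 0 or 1 would give -Inf/Inf and need handling first.
beta     <- as.matrix(ICGC_2)
m_values <- log2(beta / (1 - beta))

# Group label per sample, taken from the column names (assumption:
# the part before the trailing ".<number>" is the group).
group <- factor(sub("\\.[0-9]+$", "", colnames(beta)),
                levels = c("naive", "memoryCS"))  # naive = reference level

# Design matrix: an intercept (naive baseline) plus one column that is
# 1 for memoryCS samples and 0 for naive samples.
design <- model.matrix(~ group)
rownames(design) <- colnames(beta)

print(design)
```

With this coding, the second column ("groupmemoryCS") is the memoryCS minus naive difference on the M-value scale, so that is the coefficient you would test. The same design matrix is what limma's lmFit(m_values, design) expects, and it should be what you hand to cpg.annotate as its design, but check ?cpg.annotate for the exact arguments in your DMRcate version.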