Question

Paired DMR methylation caller

0

7.0 years ago

German.M.Demidov ★ 3.0k

Hi,

I am looking for a tool that can detect DMRs using the pairing information from normal-tumor samples. Looks like my study is kind of underpowered (because nobody asked me about the study design) and the only way to improve the power is to use all prior information. Do you know of any such callers? I would really rather not implement one myself...

The DMRs are >=5 kb in size, and megabase-scale regions are also expected.

I've tried dmrseq and I like the method, but with 3 groups of 3 samples each, the permutation approach (which is great in general) is really underpowered. Even when the regions are very long, their FDRs are still higher than any reportable value.
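
To put a rough number on why permutation struggles with so few samples: for an exact label-permutation test on a single statistic, the smallest achievable p-value is one over the number of distinct relabelings. The sketch below only illustrates that floor. It is not dmrseq's actual procedure, which differs in detail, and the function name `min_permutation_pvalue` is made up for this example.

```python
from math import comb

def min_permutation_pvalue(n_per_group: int, paired: bool = False) -> float:
    """Smallest p-value an exact label-permutation test can return.

    Unpaired: number of ways to split 2*n samples into two groups of n.
    Paired: each pair's labels can be swapped, giving 2**n relabelings.
    The observed labeling always counts as one of the relabelings.
    """
    if paired:
        n_relabelings = 2 ** n_per_group
    else:
        n_relabelings = comb(2 * n_per_group, n_per_group)
    return 1.0 / n_relabelings

# Compare a 3-vs-3 design with larger designs.
for n in (3, 5, 10):
    print(n, min_permutation_pvalue(n), min_permutation_pvalue(n, paired=True))
```

With 3 samples per group there are only 20 unpaired relabelings (8 if labels are swapped within pairs), so very few distinct null configurations are available no matter how strong the effect is.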

bisulfite DMR • 3.9k views

ADD COMMENT • link updated 7.0 years ago by Charles Warden 8.3k • written 7.0 years ago by German.M.Demidov ★ 3.0k

0

IMHO it's courteous to explain acronyms. Also, please keep in mind that we don't know your study.

Looks like my study is kind of underpowered (because nobody asked me about the study design)

MMD :)

ADD REPLY • link 7.0 years ago by Michael 56k

0

I mean, if a person does not know what a DMR is, they will not be able to help me, no? It is a differentially methylated region. I have no idea what MMD is.

ADD REPLY • link updated 7.0 years ago by Michael 56k • written 7.0 years ago by German.M.Demidov ★ 3.0k

0

Made my day :)

ADD REPLY • link 7.0 years ago by Michael 56k

0

Besides, this page is not just here for you but a general resource.

ADD REPLY • link 7.0 years ago by Michael 56k

0

Which is totally true, so a person who is looking for a paired DMR methylation caller will be able to find this question and probably get useful information.

ADD REPLY • link 7.0 years ago by German.M.Demidov ★ 3.0k

0

That is fine, now, after you added your first comment. Whether there is any useful information remains to be seen. It looks like your experiment suffers from poor design because you don't get any significant results, and you are somehow trying to fix that, is that correct?

ADD REPLY • link 7.0 years ago by Michael 56k

0

Absolutely. I can reveal the design, but I am not sure it will be helpful. The problem with significance is that I use dmrseq and, even when I have a 1 Mb stretch where the groups are almost perfectly separated with a visible effect, the q-value is around 0.2. This tool uses an honest permutation approach, which is totally correct, but in this case parametric approaches would give much smaller p-values. So, to summarize, I am not trying to fish for significant p-values where there are none; I am trying to find an approach that gives a better estimate...

ADD REPLY • link 7.0 years ago by German.M.Demidov ★ 3.0k

0

Detection of differentially methylated regions (DMRs) within whole genome and targeted NGS data maybe? Just something I found in similar posts.

ADD REPLY • link 7.0 years ago by Michael 56k

0

Wow, I've used metilene, but never thought that it could work with more than 2 groups (and apparently it can). Thanks! Will try it.

Ah no, looks like "multiple" groups for them means 2 - http://www.bioinf.uni-leipzig.de/Software/metilene/Start/

ADD REPLY • link 7.0 years ago by German.M.Demidov ★ 3.0k

score 2 · Accepted Answer · 2018-11-20

2

7.0 years ago

Charles Warden 8.3k

methylKit allows use of a covariate, and tends to be relatively sensitive. If you have pre-defined annotations, you can sum counts within a region for analysis (so, one step for differential methylation at a region level). I would recommend either 1500 or 2000 bp around the transcription start site (TSS).
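
As a minimal sketch of what "summing counts within a region" means, the snippet below adds up methylated and total read counts for CpGs within a fixed window around a TSS. It is plain Python for illustration, not methylKit code; the `CpG` record, the `region_counts` helper, and the example counts are all hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple

class CpG(NamedTuple):
    pos: int         # genomic coordinate on a single chromosome
    methylated: int  # reads supporting methylation at this site
    total: int       # total reads covering this site

def region_counts(sites: Iterable[CpG], tss: int, flank: int = 1500) -> Tuple[int, int]:
    """Sum methylated and total read counts for CpGs within tss +/- flank."""
    meth = tot = 0
    for s in sites:
        if tss - flank <= s.pos <= tss + flank:
            meth += s.methylated
            tot += s.total
    return meth, tot

# Hypothetical data: three CpGs near a TSS at position 10000 and one far away.
sites = [CpG(9000, 8, 10), CpG(10000, 3, 12), CpG(11400, 5, 10), CpG(20000, 9, 9)]
meth, tot = region_counts(sites, tss=10000)
print(meth, tot, meth / tot)
```

Each region then contributes one pair of counts per sample, so the differential test is run once per region rather than once per CpG.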

COHCAP also allows use of a 2nd variable. However, you'll have to create the percent methylation table for BS-Seq data (and I would strongly recommend having some gene annotations, to either use directly or with de novo clustering within annotations). However, default settings for COHCAP are fairly conservative, and you'll probably have to change some parameters (such as the methylation thresholds at both the site and the region step; additionally, you may choose to use more liberal criteria at the 1st site test, whose purpose is to reduce sites to compare for the region test in the 2nd step).

ADD COMMENT • link 7.0 years ago by Charles Warden 8.3k

0

Thanks a lot for such a comprehensive answer! I have one more question. I have paired samples, normal and tumor. How would you proceed with such data? Is adding the pairing information as a contrast matrix enough, or should I manipulate the data itself (e.g. working with differences in methylation instead of methylation counts from 2 samples and adding some random "zero" samples from permutations)?
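
To make the "work with differences" option concrete, here is a minimal sketch comparing a paired t statistic (computed on per-patient tumor minus normal differences) with an unpaired Welch statistic on the same numbers. The values are invented, and this is only the intuition: real methylation callers model read counts with their own methods rather than running t-tests on proportions.

```python
from math import sqrt
from statistics import mean, stdev, variance

def paired_t(normal, tumor):
    """t statistic for per-patient differences (tumor - normal), df = n - 1."""
    d = [t - n for n, t in zip(normal, tumor)]
    return mean(d) / (stdev(d) / sqrt(len(d)))

def welch_t(normal, tumor):
    """Unpaired (Welch) t statistic, ignoring which samples share a patient."""
    se = sqrt(variance(normal) / len(normal) + variance(tumor) / len(tumor))
    return (mean(tumor) - mean(normal)) / se

# Hypothetical methylation fractions for one region in three patients.
normal = [0.20, 0.50, 0.80]
tumor = [0.31, 0.60, 0.89]

print(paired_t(normal, tumor), welch_t(normal, tumor))
```

When patients differ a lot from each other but the tumor-normal shift is consistent within each patient, the paired statistic is far larger, which is where the extra power from pairing comes from.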

ADD REPLY • link 7.0 years ago by German.M.Demidov ★ 3.0k

1

I'm not sure if I completely follow you about the permutation and "zero" samples.

However, in terms of using methylKit, you can use calculateDiffMeth(regions, covariates=covariate). Region counts can be computed for the ranges in a GRanges object with regions=regionCounts(meth, regions=refGR).

For COHCAP, you'll need to put in a little extra effort upstream (although you can get counts and percent methylation values from Bismark post-alignment processing in both cases). For paired analysis in COHCAP, you'll need 1) a sample description file, with the pairing values in the 3rd column, and 2) set paired=T (or paired="continuous") in COHCAP.site() and COHCAP.avg.by.island(). You might also want to try changing the alt.pvalue parameter, if you have a large number of sites to test.
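
A sample description file with the pairing in the 3rd column could be generated along these lines. This is a sketch under assumptions: the sample IDs are invented, and the tab-delimited layout with no header, sample ID in column 1 and group in column 2 is my assumption, so check it against the COHCAP documentation before use.

```python
import csv
import io

# Hypothetical samples: (sample ID, group, pairing ID = patient).
samples = [
    ("P1_normal", "normal", "P1"),
    ("P1_tumor", "tumor", "P1"),
    ("P2_normal", "normal", "P2"),
    ("P2_tumor", "tumor", "P2"),
    ("P3_normal", "normal", "P3"),
    ("P3_tumor", "tumor", "P3"),
]

def write_sample_file(rows, handle):
    """Write one tab-delimited line per sample; column 3 holds the pairing value."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)

# Write to an in-memory buffer here; pass an open file handle to write to disk.
buf = io.StringIO()
write_sample_file(samples, buf)
print(buf.getvalue())
```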

For COHCAP, there should be a minimum coverage value in the percent methylation table (setting values with less than that level of coverage to NA), where I would recommend using at least 10x coverage. If there are enough missing values, the test will essentially be skipped for that feature (although features with only a couple missing values can usually still be tested). FYI, methylKit also has a minimum coverage parameter (requiring that level of coverage in all samples), so you may want to focus on at least 10x sites for that program as well.

If you need to increase sensitivity, then methylKit may be OK (and, if you have sensitivity issues with methylKit, you will probably also have some difficulty defining regions with COHCAP). However, I knew of at least 2 possible solutions, so I thought I should list them.

ADD REPLY • link 7.0 years ago by Charles Warden 8.3k

0

Thanks a lot! By permutations I meant "calculate differences between normal and tumor and permute them across the genome to calculate p-values for the non-permuted data", but I guess that is not required, taking into account your comprehensive answer. Thanks a lot, I will try these approaches! COHCAP looks like a really promising approach.

ADD REPLY • link 7.0 years ago by German.M.Demidov ★ 3.0k

0

That's true - none of the p-values are calculated with permutation in COHCAP.

I'm not sure if I was entirely clear, because I was actually suggesting you try methylKit first. However, I wish you the best of luck, and I hope at least part of this feedback is helpful to you!

ADD REPLY • link 7.0 years ago by Charles Warden 8.3k