MeDip-seq data interpretation + analysis / MEDIPS, MACS2
1
0
Entering edit mode
10.2 years ago

Hello everybody,

Since three weeks I try to make sense out of my (h)medip-seq data (I am an absolute beginner in deep sequencing data analysis and bioinformatics in general) and I hope you guys can give me some tips.

Sample: in vivo material of three linage related cell types

Method: comperative (h) MeDip-seq ( Li Tan et al. 2013) with 300 ng of each of the three celltypes (in total 900ng DNA) using the Diagnode MeDip and hMeDip kits.

First of all I don't know if my data look alright since I have really narrow peaks (reads are extended to 250 bp) which align quite often only to on strand (hemimethylated DNA or non-CpG methylation), (http://i.imgur.com/1TiUPTQ.png). I tried already to download published raw medip-seq data to compare them with my data but without success. Most times the data were already processed and therefor i couldnt compare them with my data. Is this narrow peak pattern normal or should I see more broader peaks (MeDip QC and Sequencing QC seem all okay)?

The second problem is to analyse the data and find differential methylated peaks/regions). I used already the MEDIPS package (see code below) but these gives me around 10 peaks without duplicates whereas I get around 500 peaks with duplicates (adj=T).

`library("MEDIPS") library("BSgenome.Mmusculus.UCSC.mm10") library("edgeR")

bcv <- 0.01 counts <- matrix( rnbinom(40,size=1/bcv^2,mu=10), 20,2) y <- DGEList(counts=counts, group=1:2) et <- exactTest(y, dispersion=bcv^2) medipoN="/home/crtd_c...MeDip_Output.bam" medipoR="/home/crtd_c ...2MeDip_Output.bam" BSgenome="BSgenome.Mmusculus.UCSC.mm10" uniq=FALSE extend=100 shift=0 ws=100 chr.select=NULL

N_Output= MEDIPS.createSet(file = medipoN, BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select) R_Output= MEDIPS.createSet(file = medipoR, BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq, window_size = ws, chr.select = chr.select)

mr.edgeR.MeDip = MEDIPS.meth(MSet1 = N_Output, MSet2 = R_Output, p.adj = "hochberg",diff.method = "edgeR", prob.method = "poisson", MeDIP = F, CNV = F, type = "rpkm", minRowSum = 1)

mr.edgeR.Medip.s = MEDIPS.selectSig(results = mr.edgeR.MeDip, p.value = 0.2, adj = T, ratio = NULL, bg.counts = 30, CNV = F)


However if I visualize the called regions in seqmonk they don't look really interesting whereas I find easily other regions which look more differential methylated. So I don't really trust the package but may used some wrong settings? Can I use adj=F (but i think then i will get to much false positive peaks) ?

I proceed then with MACS2 to call broad peaks in all three samples, merged them and count the reads of the peaks using bedtools.

`macs2 callpeak -t /home/crtd_c ... MeDip_Output.bam -c /home/crtd_c ... MeDip_Input.bam --broad -f BAM -g mm --bw 250 -n PP_MeDip_MACS2` -> for each of the three celltypes

merge all three with "cat" function

bedtools merge -i /home/crtd_csortedmacs2peaks.bed -n -scores sum& -d 100 > mergedMACS2peaks.bed bedtools multicov -bams /home/crtd_c...PP_MeDip_Output.bam /home/crtd_c...DP_MeDip_Output.bam /home/crtd_c...N_MeDip_Output.bam -bed /home/crtd_c...mergedMACS2peaks.bed > MeDippeaks.bed -D -q 30 ```

I sorted all peaks out which have no more than 20 reads in one of the three samples ( sorted <-subset(mydata, ((V6>20)|(V7>20)|(V8>20))))and calculated the log2 ratio between the three celltypes (so at the end I had a excelsheet with the genomic coordinates and the three log2 ratios). However now is the question can I do it like this and if yes, how can I define the significant changes of the log2 ratios (simply +1/-1 fold change ?).

However I am happy about any other suggestions regarding the data analysis. May there is a tool out there which I didn't try so far.

Thanks a lot!

ChIP-Seq • 8.1k views
ADD COMMENT
1
Entering edit mode
10.2 years ago
Ming Tommy Tang ★ 4.5k

I have post related to MeDIP-seq

You may find it helpful.

ADD COMMENT
0
Entering edit mode

Thanks a lot, I will try diffreps as well as the MACS2 bdgcmp option.

ADD REPLY
0
Entering edit mode

Hi Florian,

How was your experience with the MACS2 bdgcmp option? I was trying to compare a Tumor and Normal sample ( both having CONTROL Sample) and after running MACS2 bdgcmp option, I am only getting 1-2 DMR peaks ? Is that anything obvious that I am missing?

Thanks

ADD REPLY

Login before adding your answer.

Traffic: 1772 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6