Hi,
I'm studying DNA methylation.
I drew a heatmap of beta values, but there are batch effects.
How can I remove these cgIDs in RStudio?
Can you explain what you mean by 'beta-value'?
Generally, batch effects and other technical artifacts can be reduced with several filtering methods; one of them is filtering with a binomial test. Each CpG methylation call is based on read counts, and a binomial test lets you filter out calls where the reads supporting methylation could plausibly be explained by miscalling errors. In R you can use:
binom.test(x, n, p = 0.5, alternative = c("two.sided", "less", "greater"), conf.level = 0.95)
where n is the total number of reads mapped at a particular CpG and x is the number of those reads that support methylation.
Before this binomial step, you can also filter out CpGs that have fewer than a minimum number of mapped reads.
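The two filtering steps above could be sketched roughly like this in base R. The data frame, its column names, the coverage cutoff of 10 reads, the 0.05 threshold and the Benjamini-Hochberg adjustment are all assumptions made up for illustration, and p = 0.5 is simply the default from the call above; you may want a different null probability (for example an estimated error rate) for your data.

```r
# Toy per-CpG read counts (made-up numbers, for illustration only)
cpg <- data.frame(
  id         = c("cpg_a", "cpg_b", "cpg_c", "cpg_d"),
  total      = c(40, 5, 30, 25),  # n: reads covering the CpG
  methylated = c(36, 4, 16, 2),   # x: reads supporting methylation
  stringsAsFactors = FALSE
)

min_reads <- 10    # assumed minimum coverage
alpha     <- 0.05  # assumed significance threshold

# Step 1: drop CpGs covered by too few reads
covered <- cpg[cpg$total >= min_reads, ]

# Step 2: two-sided binomial test per CpG against p = 0.5
covered$p_value <- mapply(
  function(x, n) binom.test(x, n, p = 0.5)$p.value,
  covered$methylated, covered$total
)

# Step 3: adjust for multiple testing and keep the confident calls
covered$p_adj <- p.adjust(covered$p_value, method = "BH")
kept <- covered[covered$p_adj < alpha, ]
print(kept)
```

With these toy counts, cpg_b is dropped for low coverage and cpg_c is dropped because 16 of 30 reads is not distinguishable from p = 0.5.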
Also, before plotting such a large heatmap, I suggest plotting a dendrogram and a PCA of the samples/replicates. This will show you how the samples group together or separate from each other. All of these steps can be done in R or RStudio.
You can also use the R package methylKit, which has a dedicated step for removing experimental artifacts: https://bioconductor.org/packages/release/bioc/vignettes/methylKit/inst/doc/methylKit.html#34_Batch_effects
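As a rough sketch of that sample-level check, here is a base R example on simulated data. The matrix `beta`, the `batch` labels and the size of the simulated batch shift are all invented for illustration; with real data you would substitute your own CpG-by-sample matrix of methylation values.

```r
set.seed(1)

# Simulated methylation matrix: 200 CpGs (rows) x 6 samples (columns)
n_cpg <- 200
batch <- rep(c("batch1", "batch2"), each = 3)  # hypothetical batch labels
base  <- runif(n_cpg, 0.2, 0.7)
beta  <- sapply(seq_along(batch), function(i) base + rnorm(n_cpg, sd = 0.02))
colnames(beta) <- paste0("S", seq_along(batch))

# Add an artificial batch shift to half of the CpGs in batch2
beta[1:100, batch == "batch2"] <- beta[1:100, batch == "batch2"] + 0.15

# Dendrogram: dist() works on rows, so transpose to cluster samples
hc <- hclust(dist(t(beta)), method = "average")
plot(hc, main = "Sample dendrogram")

# PCA on samples, coloured by batch
pca <- prcomp(t(beta), center = TRUE, scale. = FALSE)
plot(pca$x[, 1], pca$x[, 2],
     col = as.integer(factor(batch)), pch = 19,
     xlab = "PC1", ylab = "PC2", main = "PCA of samples")
text(pca$x[, 1], pca$x[, 2], labels = colnames(beta), pos = 3)
```

If samples split by batch rather than by biological group in either plot, that is a hint the batch effect is driving the heatmap too.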
In this case I think the OP probably has array data, which is why he talks about beta-values. Also, looking at the (almost impossible to see) last row name on the heatmap, the rows seem to be the cgXXXXXX identifiers from the Illumina array!