Hello,
I am trying to run a differential methylation analysis with DMRcate and I am having some issues. First and foremost, I know that it has been optimised for array data, and I have sequencing data. Nonetheless, I have read the manual and it should technically work, at least for what I want to do.
However, I am having to do some bypasses and I wonder if I might be having issues because of this. This is my code and what I have done at each stage:
I start with a table with M-values for my 24 samples. I have a column with a numeric ID, Chromosome position in format "chrX", start position, end position and strand.
So, the first thing I do is convert my data frame in a GRanges object:
gr <- GRanges(seqnames=Rle(df$Chromosome), ranges=IRanges(start=df$start, end=df$end), strand=Rle(df$Strand))
Then, I associate my M-values with the GRanges objects like so:
m_val_matrix <- as.matrix(df[,-1(1:5)])
rownames(m_val_matrix) <- df$ID
mcols(gr) <- c(mcols(gr), DataFrame(m_val_matrix))
After that, I create a design matrix using my samplesheet, which has the grouping names I want:
design <- model.matrix(~0+factor(samplesheet$BroadGroup))
colnames(design) <- levels(samplesheet$BroadGroup)
I create the lmFit
object using the design matrix and my Mvalues, and make the contrast matrix for my comparison pairs.
fit <- lmFit(m_val_matrix, design)
contMatrix <- makeContrasts(A_VS_C = A - C,
B_VS_C = B - C,
A_VS_B = A - B, levels=fit$design)
Then, because DMRcate requires stats
, fdr
, chr
, pos
and diff
columns, I need to extract stats
, fdr
and diff
using limma:
fit2 <- contrasts.fit(fit, contMatrix)
fit2<-eBayes(fit2)
comparisons <- c("A_VS_C ", "B_VS_C ","A_VS_B ")
comparison_results <- list()
for(comp in comparisons){
comparison_results[[comp]] <- topTable(fit2, coef=comp, adjust.method="BH", sort.by="none", number=Inf)}
comparison_df <-as.data.frame(comparison_results[["A_VS_C"]])
I decided to start with one of my comparisons, just because I wanted to see if dmrcate would work. Because sequencing data does not have an annotation manifesto, I have manually annotated using AnnotationHub:
hub <- AnnotationHub()
cpg_islands <- hub[["AH5086"]]
cpg_met_over <- findOverlaps(gr, cpg_islands)
overlappingCpG <- cpg_islands[subjectHits(cpg_met_over)]
methsiteIDs <- paste(seqnames(gr)[queryIndices],
start(gr)[queryIndices],
end(gr)[queryIndices],
sep=":")
cpgIDs <- paste(seqnames(overlappingCpG)[queryIndices],
start(overlappingCpG)[queryIndices],
end(overlappingCpG)[subjectIndices],
sep=":")
annotMet <- data.frame(Met_siteID=methsiteIDs , CpGID=cpgIDs )
Once I have the summary of the methylation data with the cpg info annotated, I merge cpg information with the methylation gr
object:
gr$metsiteid <- paste0(seqnames(gr),":",start(gr),":",end(gr))
gr$cpgid <- annotMet$CpGID[match(gr$metsiteid , annotMet$Met_siteID)]
And I finalise adding all the columns to the gr
object that I know that are going to be needed:
gr$stat <- comparison_results$A_vs_C$t
gr$diff<- comparison_results$A_vs_C$logFC
gr$fdr<- comparison_results$A_vs_C$adj.P.Val
gr$chr <- df$Chromsome
gr$pos <- df$Start
Once this is done, I run cpg.annotate()
function prior to running dmrcate()
myAnnotation <- cpg.annotate(object=as.data.frame(gr)),
datatype="sequencing",
what="M",
analysis.type="differential",
design=design,
contrasts=TRUE,
cont.matrix=contMatrix,
coef="A - C",
fdr=0.05)
dmrs_overlap <- dmrcate(myAnnotation, lambda=1000, C=2, min.cpgs=3)
My issue is that I have created a myAnnotation
object, but when I try to run dmrcate()
it gives me this error:
Error in dmrcate(myAnnotation, ...): is (object, "CpGannotated") is not TRUE
I cannot see where have I gone wrong, as I technically have the annotations, the columns, etc., that DMRcate needs to run, and I have created a myAnnotation object successfully - which I hoped that would work with dmrcate()
function.
Note that I am using a modern version of DMRcate but I have forced the cpg.annotated()
function from a previous version.
Can someone please lend me a hand on this matter, suggest what may I be doing wrong in this case? Please, refrain from pointing to other methylation packages - I know there are others, but I want to get this one in particular up and running.
Hello, sorry for the delay. Could you show class(myAnnotation) ? I also suggest you to check the development version of DMRcate where
cpg.annotate
is deprecated and replaced bysequencing.annotate
for sequencing data, which should avoid conflicts of versions.