getMappedEntrezIDs from the missMethyl package in R
7 weeks ago
Rut_MDLV • 0

Hi,

I am using the missMethyl package in R to analyse DNA methylation data for a complex disease. I have corrected for batch effects using the RUVm two-step inverse method, and I am now doing the enrichment analysis. I am trying to annotate the significant CpGs to their corresponding genes via Entrez IDs (the getMappedEntrezIDs function). Is there a way to find out which of the significant CpGs, which I have in a list (sigCpGs), map to each Entrez ID?

output <- getMappedEntrezIDs(sig.cpg = sigCpGs, all.cpg = all_CpGs, array.type = "EPIC", anno = annEPIC)

The output contains output$freq, a table with the number of probes associated with each gene, but I believe it counts all probes in all_CpGs rather than only the significant ones. How would you get the number and names of the sigCpGs for each gene?

Thanks

R missmethyl CpG methylation gene • 386 views
6 weeks ago

Dunno about missMethyl, but if you just want to map your list of CpGs to Entrez IDs, here you go:

sigCpGs ## some random list of cpgs

library(IlluminaHumanMethylationEPICanno.ilm10b4.hg19)

## get the annotation rows for the significant CpGs (getAnnotation() comes from minfi)
sig.ann <- getAnnotation(IlluminaHumanMethylationEPICanno.ilm10b4.hg19)[sigCpGs, ]
## get the associated gene names and collapse duplicates per probe
sig.genes <- sig.ann$UCSC_RefGene_Name
sig.genes <- lapply(strsplit(sig.genes, split = ";"), unique)
sig.genes <- lapply(sig.genes, paste, collapse = ";")

## get the nearest gene for every probe, as a fallback where the gene name is missing
library(FDb.InfiniumMethylation.hg19)
nearest.gene <- getNearestGene(probes = GRanges(seqnames = sig.ann$chr, 
                                                ranges = IRanges(start = sig.ann$pos, 
                                                                 end = sig.ann$pos, 
                                                                 names = sig.ann$Name)))

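## one row per significant CpG; keep strings as character so mapIds() gets character keys
mapped.df <- data.frame("cg" = sigCpGs,
                        "symbol" = unlist(sig.genes),
                        "nearest.gene" = nearest.gene$nearestGeneSymbol,
                        "nearest.gene.dist" = nearest.gene$distance,
                        "island.relation" = sig.ann$Relation_to_Island,
                        stringsAsFactors = FALSE)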

library(org.Hs.eg.db)
library(AnnotationDbi)
## get Entrez IDs for the nearest-gene symbols
mapped.df$entrez <- mapIds(org.Hs.eg.db, keys = as.character(mapped.df$nearest.gene),
                           column = "ENTREZID", keytype = "SYMBOL")
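
To get back to the original question (number and names of significant CpGs per gene), one option is to group the per-CpG table by Entrez ID. This is only a minimal base-R sketch: toy.df is a made-up stand-in for mapped.df, and the probe and Entrez IDs in it are placeholders, not real annotation.

```r
## toy stand-in for mapped.df: one row per significant CpG (placeholder IDs)
toy.df <- data.frame(cg = c("cg01", "cg02", "cg03", "cg04"),
                     entrez = c("100", "100", "200", NA),
                     stringsAsFactors = FALSE)

## drop CpGs without an Entrez ID, then group CpG names by gene
has.id <- !is.na(toy.df$entrez)
cpgs.by.gene <- split(toy.df$cg[has.id], toy.df$entrez[has.id])

## one row per gene: number of significant CpGs and their names
sig.per.gene <- data.frame(entrez = names(cpgs.by.gene),
                           n.sig = lengths(cpgs.by.gene),
                           cpgs = vapply(cpgs.by.gene, paste, character(1), collapse = ";"),
                           stringsAsFactors = FALSE)
print(sig.per.gene)
```

Note that this assigns each CpG to a single gene (here, the nearest one). A probe annotated to several genes in UCSC_RefGene_Name would first need to be expanded to one row per CpG-gene pair.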

How can it be that the output from mapping sigCpGs with

library(IlluminaHumanMethylationEPICanno.ilm10b4.hg19)

sig.ann <- getAnnotation(IlluminaHumanMethylationEPICanno.ilm10b4.hg19)[sigCpGs, ]
sig.genes <- sig.ann$UCSC_RefGene_Name

is different from

library(missMethyl)
library(biomaRt)

annEPIC <- getAnnotation(IlluminaHumanMethylationEPICanno.ilm10b4.hg19)
sig.entrez <- getMappedEntrezIDs(sig.cpg = sigCpGs, all.cpg = all_CpGs, array.type = "EPIC", anno = annEPIC)
sig.entrez <- sig.entrez$sig.eg

# Connect to Ensembl
ensembl <- useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl")

gene_info <- getBM(attributes = c("entrezgene_id", "external_gene_name", "description"),
                   filters = "entrezgene_id",
                   values = sig.entrez,
                   mart = ensembl)

They seem to be using the same reference library to annotate genes, so what could be the reason?

Maybe biomaRt uses another library?
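
One way to narrow this down would be to compare the two symbol sets directly and look at what is unique to each. A minimal base-R sketch with made-up symbols: manifest.names and biomart.names are placeholders standing in for sig.ann$UCSC_RefGene_Name and gene_info$external_gene_name.

```r
## placeholder for sig.ann$UCSC_RefGene_Name: ";"-separated, may repeat or be empty
manifest.names <- c("GENE1;GENE1", "GENE2;GENE3", "", "OLDNAME4")
## placeholder for gene_info$external_gene_name
biomart.names <- c("GENE1", "GENE2", "GENE3", "NEWNAME4")

## flatten the manifest column to a unique set of non-empty symbols
manifest.set <- unique(unlist(strsplit(manifest.names, split = ";")))
manifest.set <- manifest.set[nzchar(manifest.set)]

## symbols found by only one of the two routes
only.manifest <- setdiff(manifest.set, biomart.names)
only.biomart <- setdiff(biomart.names, manifest.set)
print(only.manifest)
print(only.biomart)
```

If the leftovers look like old and new names for the same gene, the difference may come from the array annotation carrying older symbols than the current Ensembl names. That is something to check, not a confirmed cause.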
