illumina EPIC array support by R packages?
2
2
Entering edit mode
8.5 years ago
weixiaokuan ▴ 140

Hi, Does anyone know which is suitable R tools to perform analysis on illumina EPIC array? I tried to use minfi, but it seems I kept getting bugs for not being able to use function such as "read.metharray", or "read.metharray.exp".

> ?read.metharray
No documentation for ‘read.metharray’ in specified packages and libraries:
you could try ‘??read.metharray’

> sessionInfo()
R version 3.2.5 (2016-04-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] minfi_1.14.0         bumphunter_1.8.0     locfit_1.5-9.1       iterators_1.0.8      foreach_1.4.3        Biostrings_2.36.4    XVector_0.8.0       
 [8] GenomicRanges_1.20.8 GenomeInfoDb_1.4.3   IRanges_2.2.9        S4Vectors_0.6.6      lattice_0.20-33      Biobase_2.28.0       BiocGenerics_0.14.0 
[15] BiocInstaller_1.18.5

loaded via a namespace (and not attached):
 [1] genefilter_1.50.0       splines_3.2.5           beanplot_1.2            rtracklayer_1.28.10     GenomicFeatures_1.20.6  XML_3.98-1.4           
 [7] survival_2.39-4         DBI_0.4-1               BiocParallel_1.2.22     RColorBrewer_1.1-2      registry_0.3            rngtools_1.2.4         
[13] lambda.r_1.1.7          doRNG_1.6               matrixStats_0.50.2      plyr_1.8.4              pkgmaker_0.22           stringr_1.0.0          
[19] zlibbioc_1.14.0         futile.logger_1.4.1     codetools_0.2-14        biomaRt_2.24.1          AnnotationDbi_1.30.1    illuminaio_0.10.0      
[25] preprocessCore_1.30.0   Rcpp_0.12.5             xtable_1.8-2            openssl_0.9.4           limma_3.24.15           base64_2.0             
[31] annotate_1.46.1         Rsamtools_1.20.5        digest_0.6.9            stringi_1.1.1           nor1mix_1.2-1           grid_3.2.5             
[37] GEOquery_2.34.0         quadprog_1.5-5          tools_3.2.5             bitops_1.0-6            magrittr_1.5            siggenes_1.42.0        
[43] RCurl_1.95-4.8          RSQLite_1.0.0           futile.options_1.0.0    MASS_7.3-45             Matrix_1.2-6            reshape_0.8.5          
[49] mclust_5.2              GenomicAlignments_1.4.2 multtest_2.24.0         nlme_3.1-128

Is its implementation completed? why did I get such errors? These commands are all listed in reference manual. By the way, is there other tools to analyze EPIC data? Thank you. -Xiaokuan

methylation minfi champ rnbeads • 9.9k views
ADD COMMENT
1
Entering edit mode

Minfi is currently 1.18, might be worth updating that for a start, along with your version of R to 3.3.

ADD REPLY
0
Entering edit mode

Thank you Andrew. Do you happen to know where I can find the most recent vignette for minfi, especially for analysis of EPIC array? So far almost all the documents are about 450K not EPIC 850K. and the minfi's user guide on bioconductor site is empty too. Thank you. -Xiaokuan

ADD REPLY
0
Entering edit mode

On the bioconductor page, however it seems as though the vignette hasn't built properly. I've cross posted this to Bioconductor support, where hopefully Kasper Hansen, or one of the Minfi team will notice.

ADD REPLY
0
Entering edit mode

Based on Kasper's response, here's 3.2's user guide.

ADD REPLY
0
Entering edit mode

Andrew, thank you for helping me find the solution.Too bad, it has no specific description for EPIC analysis. But I guess I can figure it out. Thank you for your help. -Xiaokuan

ADD REPLY
1
Entering edit mode

Forget about the fact that they're "EPIC" arrays. Everything on there about 450K analysis should work fine for 850K, except for the annotation.

ADD REPLY
0
Entering edit mode

I have roughly finished analysis of EPIC array using "minfi" package. I have obtain the DMP results, however, I am wondering how I should annotate these methylation locus? Simply speaking how I should map a gene to a DMP locus? Thanks.

ADD REPLY
0
Entering edit mode

Hello Wei,

Even I'm working on EPIC array data. Could you please tell me how to go forward for the analysis using minfi package? Thank you in Advance.

ADD REPLY
0
Entering edit mode

dsp

It's pretty straightforward. You just need to follow the tutorial Andrew listed above. If there is any now packages (such as annotation) needed for the process, just install them. Then you will be able to obtain beta, M values and DMPs, DMRs. Also, I have to say minfi so far has the best support for EPIC as I evaluate other packages as well. -Xiaokuan

ADD REPLY
0
Entering edit mode

Thank you very much. I'l give a try then. If I have any doubts will ask you. :)

ADD REPLY
0
Entering edit mode

Hi Wei,

How can I get annotation for EPIC data?

ADD REPLY
1
Entering edit mode

or, after you created genomic ratioSet, gset, using minfi, you can directly get annotation by command: anno<-getAnnotation(gset)

ADD REPLY
1
Entering edit mode

Be careful that it's pulling in the correct annotation for 850, and not just the 450 annotation set.

ADD REPLY
0
Entering edit mode

Thanks for the reply Andrew and Wei. I am working on Infinium methylation EPIC data to find DMP and DMR. I am following minfi user guide for the analysis. I have some doubts in the analysis. Please check the following:

table(targets$Sample_Group)

       A_BaP            A_BPA         A_BPABaP             A_NT
           3                3                3                3
     AT1_BaP          AT1_BPA       AT1_BPABaP           AT1_NT
           3                3                3                3
       CA_NT        fourAfive         MCFseven oneGthree_DEfour
           3                2                2                2
  twoCtwo_DEfour       twoFone_wt
           2                2

For DMP:

dmp <- dmpFinder(bVals, pheno=targets$Sample_Group, type="continuous")
head(dmp)
            intercept       beta         t         pval         qval
cg22291627 0.03929582  0.8203689  247.6537 6.546336e-41 5.635203e-35
cg21359254 0.04262945  0.2735875  173.6932 2.278083e-37 9.805074e-32
cg11905617 0.31248646 -0.2829384 -150.5233 6.116164e-36 1.754968e-30
cg01269456 0.11017638  0.2731397  140.8292 2.823188e-35 6.075628e-30
cg20379012 0.88108641 -0.8007840 -126.7986 3.145894e-34 5.416085e-29
cg24417337 0.49293479 -0.4340655 -111.4600 6.076018e-33 8.717243e-28

What if I want DMP for specific comparisons? And dmpFinder returns a data.frame with columns for the statistic, the p-value, the q-value and the intercept. And, if I would like to know which CpG's are hyper- or hypo-methylated from that table how to go for that? How can I extract the tables of differentially expressed CpGs for each comparison I give?

For eg: I want to find DMP for following comparisons.

MCFseven vs fourAfive
twoFone_wt vs fourAfive
oneGthree_DEfour vs fourAfive
twoCtwo_DEfour vs fourAfive

Looking forward to your response. Thank you in Advance

ADD REPLY
1
Entering edit mode

actually, you can extract the beta values and use phenotype information to create your design file; then using limma package to identify deferentially methylated position. This is pretty much what ChAMP package's champ.MVP function does. As for Andrew's comment, you should be able to automatically load EPIC manifest but double check to see if your annotation is right may be a good point too. -Xiaokuan

ADD REPLY
0
Entering edit mode

Thanks for the reply wei. As you said I used limma package, lmFit. Could you please check the following whether it is the rightway or not. And Could you please tell me about contMatrix once which is below.

table(targets$Sample_Group)

       A_BaP            A_BPA         A_BPABaP             A_NT
           3                3                3                3
     AT1_BaP          AT1_BPA       AT1_BPABaP           AT1_NT
           3                3                3                3
       CA_NT        fourAfive         MCFseven oneGthree_DEfour
           3                2                2                2
  twoCtwo_DEfour       twoFone_wt
           2                2

cellType <- factor(targets$Sample_Group)

design <- model.matrix(~0+cellType, data=targets)
colnames(design) <- c(levels(cellType))
fit <- lmFit(mVals, design)

contMatrix <- makeContrasts(MCFseven-fourAfive, twoFone_wt-fourAfive, oneGthree_DEfour-fourAfive, twoCtwo_DEfour-fourAfive, levels=design)

fit2 <- contrasts.fit(fit, contMatrix)
fit2 <- eBayes(fit2)
summary(decideTests(fit2))
   MCFseven - fourAfive twoFone_wt - fourAfive oneGthree_DEfour - fourAfive
-1               187857                  75140                        82980
0                332649                 518059                       521797
1                340312                 267619                       256041
   twoCtwo_DEfour - fourAfive
-1                      87335
0                      515763
1                      257720

annEPICSub <- annEPIC[match(rownames(mVals),annEPIC$Name),c(1:4,12:19,24:ncol(annEPIC))]

DMPs <- topTable(fit2, num=Inf, coef=1, genelist=annEPICSub)
ADD REPLY
1
Entering edit mode

yes, it seems fine.

ADD REPLY
0
Entering edit mode

See here

ADD REPLY
0
Entering edit mode

Hi Andrew,

The 3.2's user guide doesn't have how to find differential methylated regions. Could you please tell me which package should be used for that and how to? Thank you in Advance.

ADD REPLY
1
Entering edit mode

If you're using Limma to do differential methylation tests, I'd recommend DMRCate

ADD REPLY
0
Entering edit mode

Will give a try. Thank you.

ADD REPLY
0
Entering edit mode

Hi Andrew,

I am having some issues while using DMRcate. I am getting an error for cpg.annotate

myAnnotation <- cpg.annotate(mVals, datatype = "array",analysis.type="differential", design=design,contrasts = TRUE, cont.matrix = contMatrix,coef="MCFseven - fourAfive")

Your contrast returned 496947 individually significant probes. We recommend the default setting of pcutoff in dmrcate(). Error in data.frame(ID = rownames(object), stat = stat, CHR = RSanno$chr, : arguments imply differing number of rows: 812896, 425236

PS. I also made a post at https://groups.google.com/d/msg/epigenomicsforum/G3yO92pEqzQ/z-KIDzRMBwAJ

Please have a look to my pipeline. Any advice is greatly appreciated. Thank you in Advance

ADD REPLY
0
Entering edit mode

This thread is getting ridiculously long now, so please make a new post in biostars

ADD REPLY
0
Entering edit mode

Ok I made a new thread now.

ADD REPLY
2
Entering edit mode
5.4 years ago
Shicheng Guo ★ 9.5k

Try this EPIC annotation:

BiocManager::install("IlluminaHumanMethylationEPICanno.ilm10b4.hg19")
ADD COMMENT
0
Entering edit mode
5.4 years ago

I actually sometimes find GenomeStudio to be preferable to minfi, in terms of having detection p-values to potentially filter out additional CpG sites.

That said, I've also gotten minfi to work with EPIC arrays. Perhaps other recommendations to update your version of R can help (and it looks like you have already gotten to the downstream analysis step?).

While I have to be more cautious about support of GitHub templates, I see some code where it looks like read.metharray() should have worked with the EPIC array.

ADD COMMENT

Login before adding your answer.

Traffic: 2193 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6