I've recently started working with the Illumina 450K methylation platform. There are several software packages available to handle this data including methylumi, lumi, minfi (those 3 from Bioconductor) and IMA. I'm disregarding IMA since it requires text files exported from BeadStudio in a particular format (which I don't have) and I prefer to start from IDAT files.
The packages are similar in that they create an R object based on the eSet class, but they all come with different methods for adjusting colour bias and normalizing. I'm finding the number of choices rather confusing. For example:
- methylumi has a rather basic method, normalizeMethyLumiSet(), which does not seem entirely appropriate for the 450K platform
- lumi has methods for colour bias correction, background adjustment and normalization; it's not clear to me whether these methods should be applied separately to the type I and type II probes on the 450K platform (and if so, whether I'd then somehow recombine the data)
- minfi makes no mention of colour bias but has a method in the development version, preprocessSWAN(), which does normalization accounting for differences in type I/type II probes
So my questions are:
- Which package do you use? Or do you use more than one, in combination?
- Should I even worry about colour bias adjustment? And if so, should I treat type I and II probes differently? And if so, how?
- The "best" method, in your opinion, to normalize? Using lumi - ssn or quantile? Or use minfi? Treat colours separately or not? Treat type I/II probes separately or not?
My current feeling is that preprocessSWAN() in the minfi development version is the way to go, but I'd appreciate your thoughts (and especially, your R code).
Hah, at the moment I'm not paid to worry - there are always improvements to be made but a 0.85/0.95 correlation is good enough for me until someone does some serious benchmarking.
Wow! That's a great correlation. Is that the norm for 450K?
The worst I've gotten between 450K and bisulfite seq is 0.85, and that was comparing primary cells (grown for a short time in culture) isolated from two different patients, i.e. patient 1 on 450K, patient 2 on bis-seq.
This code is straight from the minfi user guide. I tend to agree though, that it is as good as anything. You don't worry about color bias, treating type I/II probes separately or the SWAN method?
Hello! I came across this old post while searching for methylation data analysis. I have data from control samples: one unmethylated and one methylated, each from both bisulfite sequencing and 450K. (Ideally the unmethylated control sample should have 0% methylation and the methylated sample should have 100%, but this is certainly not the case.) I tried to correlate the results between 450K and sequencing, only including the sites that are present in both 450K and sequencing. I use the percentage of methylation (beta value in 450K). I did not use any of the above packages but got the data straight from GenomeStudio.
I got a ~0.88 correlation for the unmethylated control sample, but only 0.07 for the methylated control. Any idea how this could be? Thanks in advance!
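For anyone wanting to reproduce this kind of comparison outside GenomeStudio, here is a minimal sketch in plain Python (standard library only) of restricting to the shared sites and computing a Pearson correlation. It assumes both data sets have already been keyed by a common site identifier and expressed as fractions between 0 and 1; the function names, site IDs and values are hypothetical and only for illustration, and this is not necessarily how the numbers above were produced.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant values")
    return sxy / sqrt(sxx * syy)


def compare_shared_sites(array_beta, seq_meth):
    """Compare two {site_id: methylation fraction} dicts on shared sites.

    Returns (number of shared sites, Pearson r, mean absolute difference).
    """
    shared = sorted(set(array_beta) & set(seq_meth))
    x = [array_beta[s] for s in shared]
    y = [seq_meth[s] for s in shared]
    mean_abs_diff = sum(abs(a - b) for a, b in zip(x, y)) / len(shared)
    return len(shared), pearson(x, y), mean_abs_diff


# Hypothetical toy values, not real data: 450K beta values and
# sequencing methylation fractions keyed by a made-up site ID.
array_beta = {"cg01": 0.05, "cg02": 0.50, "cg03": 0.90, "cg04": 0.20}
seq_meth = {"cg01": 0.10, "cg02": 0.40, "cg03": 0.95, "cg05": 0.70}

n_shared, r, mad = compare_shared_sites(array_beta, seq_meth)
print(n_shared, round(r, 3), round(mad, 3))
```

The mean absolute difference is reported alongside r because a correlation depends on how much the values vary across sites: when nearly every site sits close to the same level, r can be low even if the two platforms agree closely in absolute terms. That is a general statistical point and not a diagnosis of the result above.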