Hi, guys,
I have a set of methylation data from the 450K platform, and I want to see how it compares to another set, which unfortunately is from RRBS. I have the RRBS data as a BED file, which gives the CpG sites and their methylation levels. How can I combine the 450K and RRBS data, especially since they don't cover the same CpG sites?
Thanks.
Thanks for the info.
It is true that the 450K array gives less information than BS-Seq methods such as RRBS. Actually, I am only interested in the sites detected by 450K, since my sample was run on this platform. The problem is that my reference samples were run on RRBS. I checked the CpG sites provided for the two sample sets: one is based on probe coordinates and the other comes from the BED file. They don't match each other. What should I do?
Thanks.
You'll have to compare the files with a custom script, but the necessary information (chromosome, position, and beta / percentage methylation) is available in both cases. I believe the 450k annotation file (the .bpm file) provides both hg18 and hg19 coordinates (the hg19 information is in the CHR and MAPINFO columns). The 450k beta values are between 0 and 1, and the percentage methylation values in the .bed files (I assume from Bismark) will probably be between 0 and 100, so you need to rescale one of them by a factor of 100 (unless you are looking at differential methylation, in which case the chromosome and position are all you need to make something like the Venn diagram in the figure that I was mentioning).
You can download that 450k annotation file from the Illumina website, and there is also a copy in the standalone version of COHCAP.
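To make the "custom script" idea a bit more concrete, here is a rough Python sketch of the matching step. It is only an illustration under several assumptions that are not confirmed in this thread: the 450k data has already been exported to a CSV with CHR, MAPINFO and a hypothetical `beta` column, the RRBS file is a four-column tab-separated BED-like file (chrom, start, end, percent methylation), and the BED start is 0-based while MAPINFO is 1-based. The function names are made up for this example.

```python
import csv
import io


def load_450k(handle):
    """Read a CSV with CHR, MAPINFO and beta columns into {(chrom, pos): beta}.

    The 'beta' column name is an assumption; adjust it to your own export.
    """
    sites = {}
    for row in csv.DictReader(handle):
        if not row["CHR"] or not row["MAPINFO"]:
            continue  # skip probes without a mapped position
        chrom = "chr" + row["CHR"]  # assumes CHR is stored without the 'chr' prefix
        sites[(chrom, int(row["MAPINFO"]))] = float(row["beta"])
    return sites


def load_rrbs_bed(handle, start_offset=1):
    """Read chrom, start, end, percent lines into {(chrom, pos): fraction}.

    start_offset=1 converts an assumed 0-based BED start to a 1-based position.
    Percent methylation (0-100) is rescaled to the 0-1 range of beta values.
    """
    sites = {}
    for line in handle:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and header lines
        chrom, start, _end, pct = line.rstrip("\n").split("\t")[:4]
        sites[(chrom, int(start) + start_offset)] = float(pct) / 100.0
    return sites


def shared_sites(array_sites, rrbs_sites):
    """Return (chrom, pos, beta_450k, fraction_rrbs) for positions in both sets."""
    common = sorted(array_sites.keys() & rrbs_sites.keys())
    return [(c, p, array_sites[(c, p)], rrbs_sites[(c, p)]) for c, p in common]


# Tiny made-up example, not real data
array_csv = io.StringIO("IlmnID,CHR,MAPINFO,beta\ncg0001,1,1000,0.80\ncg0002,1,2000,0.10\n")
rrbs_bed = io.StringIO("chr1\t999\t1000\t75.0\nchr1\t4999\t5000\t20.0\n")
for row in shared_sites(load_450k(array_csv), load_rrbs_bed(rrbs_bed)):
    print(row)
```

If the two files use the same coordinate convention, set `start_offset=0`. Both files also need to be on the same genome build (for example hg19) before any positions can match.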
Thank you very much for your help. Actually, that is exactly what I am asking. The coordinates from 450K (MAPINFO) are different from the coordinates provided in the RRBS file. How can I merge them? Or are they simply different CpG sites, meaning that only the overlapping CpG sites can be extracted and compared (as in the Venn diagram)?
Thanks again.
Yes - you should only look for the overlap. My guess is that the RRBS data simply does not show enrichment in some areas covered by the 450k array. You can visualize your alignment for a few cases to confirm that this is in fact what is happening.
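As a quick sanity check before drawing the Venn diagram, something like the following Python sketch could count the overlap and also test whether the two files are simply shifted by one base relative to each other. This is an assumption worth ruling out rather than a known cause here: BED starts are conventionally 0-based, and some pipelines report the position on the opposite strand of the CpG. The function names are hypothetical.

```python
def venn_counts(array_sites, rrbs_sites):
    """Count positions unique to each platform and shared by both."""
    array_sites, rrbs_sites = set(array_sites), set(rrbs_sites)
    return {
        "array_only": len(array_sites - rrbs_sites),
        "both": len(array_sites & rrbs_sites),
        "rrbs_only": len(rrbs_sites - array_sites),
    }


def overlap_by_shift(array_sites, rrbs_sites, shifts=(-1, 0, 1)):
    """Count shared positions after shifting every RRBS position by each offset.

    A much larger count at -1 or +1 than at 0 suggests a coordinate-convention
    mismatch rather than genuinely different CpG sites.
    """
    array_sites = set(array_sites)
    return {
        shift: len(array_sites & {(c, p + shift) for c, p in rrbs_sites})
        for shift in shifts
    }


# Made-up (chromosome, position) pairs for illustration only
array_positions = {("chr1", 1000), ("chr1", 2000), ("chr2", 500)}
rrbs_positions = {("chr1", 999), ("chr1", 1999), ("chr2", 800)}
print(venn_counts(array_positions, rrbs_positions))
print(overlap_by_shift(array_positions, rrbs_positions))
```

If the overlap stays small at every shift, the two platforms most likely do cover different sites, and only the shared positions can be compared directly.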
Great. Thank you very much for the help.