I am working with Illumina EPIC BeadChip v1 and v2 data. The v2 data uses a different IlmnID format than v1: it still contains the cg name (cg followed by a number), but v2 appends a suffix after the cg number, such as _BC13. I also see that the same probe ID can have several entries, like this:
cg22051776_TC12 cg22051776_TC13 cg22051776_TC14
As far as I understand, these correspond to the same position in the genome, but the probes differ slightly. How should I deal with this? Take the mean of all of them? Use only one? Analyze them as separate probes (I don't think this is the best idea if they do interrogate the same CpG site)?
After filtering, I have 4767 of these, and cannot go through them all to see which ones to keep and which ones to discard.
Hi Christine! I am dealing with the same issue. I noticed it because I needed the probe IDs to merge the EPIC v2 data with EPIC v1 data: once the suffix (the _BC13 part) is removed, some IDs appear duplicated, a few of them more than twice. At first I thought about randomly removing one of the duplicated entries, but then I realized that the duplicates don't have exactly the same beta/M values, so I discarded that option. Were you able to solve this somehow? Thank you in advance!
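For anyone who wants to try the averaging option raised in the question, here is a minimal sketch in plain Python. It assumes the suffix is everything after the first underscore in the IlmnID and that the beta values are held as one list per probe (one value per sample). The function names, the second probe ID, and all beta values are made up for illustration. Whether averaging is the right choice for a given analysis is still an open question in this thread, so treat this as one possible approach rather than a recommendation.

```python
from collections import defaultdict
from statistics import mean


def base_probe_id(ilmn_id):
    """Strip the EPIC v2 suffix, e.g. 'cg22051776_TC12' -> 'cg22051776'.

    Assumes the suffix is everything after the first underscore.
    """
    return ilmn_id.split("_", 1)[0]


def collapse_replicates(betas):
    """Average replicate probes that share the same base cg ID.

    `betas` maps each v2 IlmnID to a list of beta values, one per sample.
    Returns a dict keyed by the base ID, with per-sample means.
    """
    groups = defaultdict(list)
    for ilmn_id, values in betas.items():
        groups[base_probe_id(ilmn_id)].append(values)
    # zip(*rows) walks the replicates sample by sample
    return {pid: [mean(col) for col in zip(*rows)] for pid, rows in groups.items()}


# Made-up beta values for two samples; the second probe ID is hypothetical.
example = {
    "cg22051776_TC12": [0.80, 0.10],
    "cg22051776_TC13": [0.84, 0.14],
    "cg22051776_TC14": [0.82, 0.12],
    "cg00000001_BC21": [0.50, 0.40],
}

collapsed = collapse_replicates(example)
for probe_id, values in collapsed.items():
    print(probe_id, [round(v, 3) for v in values])
```

After collapsing, the keys are plain cg IDs, so they can be matched against EPIC v1 probe names. Probes with a single entry pass through unchanged apart from the renamed key.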
Hi, by any chance did you manage to solve this problem?