Hello,
I am converting the locations of CpGs from the 450k methylation array to hg38 coordinates. The Illumina Infinium HumanMethylation450 manifest (HumanMethylation450_15017482_v1-2.csv) has coordinates for builds 36 and 37 of the human genome (hg18 and hg19).
My question is about the strand of the CpGs. The manifest has a strand column which can be F or R. I believe this is the CpG strand relative to the genome. However, since the annotation seems to have been done for build 36 of the genome, is it correct to assume that the strand will be the same in hg38? Some regions may have been inverted in later assemblies, so the answer is probably no. In that case, how would you convert the CpG strand to make sure it is correct for hg38?
Thank you for reading!
Best wishes,
Sophie.
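Do you need the strand information, though? Once you liftOver the positions, you can check the reference base at each lifted position. If you find a C, the probe targets the cytosine on the forward (+) strand; if you find a G, the probe was targeting the C on the other strand (-). Only cytosines can be methylated, so the reference base tells you the strand, assuming the coordinate points at the targeted cytosine.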
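A minimal Python sketch of that check, in case it helps. The function name `infer_cpg_strand` and the toy sequence are made up for illustration. In practice you would pull the base from an hg38 FASTA (for example with an indexed-FASTA reader) rather than from a string. It assumes 1-based positions and also verifies that the neighbouring base completes a CpG, so sites that do not look like a CpG in the reference come back as `None` for manual review.

```python
def infer_cpg_strand(ref_seq, pos):
    """Infer the strand of a CpG cytosine from the reference sequence.

    ref_seq: reference sequence of one chromosome (forward strand).
    pos:     1-based position of the targeted base after liftOver.
    Returns '+' if the base is the C of a CpG on the forward strand,
    '-' if it is the G of a CpG (i.e. the C sits on the reverse strand),
    and None if the site is not a CpG in the reference.
    """
    i = pos - 1  # convert to 0-based index
    if i < 0 or i >= len(ref_seq):
        raise IndexError("position outside the reference sequence")

    base = ref_seq[i].upper()
    # Forward strand: C followed by G.
    if base == "C" and ref_seq[i + 1:i + 2].upper() == "G":
        return "+"
    # Reverse strand: G preceded by C on the forward strand.
    if base == "G" and i > 0 and ref_seq[i - 1].upper() == "C":
        return "-"
    return None


# Toy sequence for illustration only, not real hg38 sequence.
toy_chr = "ATCGGACGTTA"

for position in (3, 4, 5):
    print(position, infer_cpg_strand(toy_chr, position))
```

Anything that returns `None` is worth checking by hand, since the lifted coordinate may not point at the cytosine you expect.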