Dear all,
I have methylation data from the Illumina EPIC BeadChip and gene expression data from RNA sequencing.
The methylation data is annotated with the manifest file and annotation file from Illumina. I would like to use the gene symbols from the UCSC Genome Browser, as UCSC also provides information on 'relationship to CpG island' and 'gene region'.
The gene expression data is annotated with Ensembl 75 for hg19.
My goal is to find out whether changes in the methylation of a given gene change its expression. Since the two datasets are annotated from different sources (UCSC and Ensembl), I was wondering whether I can match them?
Thank you in advance,
Sara
Indeed, but be aware that there are minor differences in chromosomal coordinates between Ensembl and UCSC. For example, if someone uses the Ensembl genome plus Ensembl annotation, the coordinates will be 1 bp off from those of the UCSC genome plus UCSC annotation. Importantly, when I mention UCSC annotation, this goes for all the annotation that UCSC hosts, including Ensembl, GENCODE, etc.
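To make the 1 bp point concrete, here is a minimal sketch. It assumes the offset in question is the usual one between 0-based, half-open intervals (as in UCSC tables and BED files) and 1-based, inclusive intervals (as in Ensembl GTF files). The function names are made up for illustration.

```python
from typing import Tuple


def ucsc_to_ensembl(start0: int, end0: int) -> Tuple[int, int]:
    """Convert a 0-based, half-open interval (UCSC table / BED style)
    to a 1-based, inclusive interval (Ensembl / GTF style).

    Only the start shifts by one; the end value stays the same.
    """
    return start0 + 1, end0


def ensembl_to_ucsc(start1: int, end1: int) -> Tuple[int, int]:
    """Inverse conversion: 1-based inclusive -> 0-based half-open."""
    return start1 - 1, end1


# Toy interval, not a real gene: the same 1001 bases in both conventions.
converted = ucsc_to_ensembl(999, 2000)
round_trip = ensembl_to_ucsc(*converted)
```

If the start coordinates of the same feature differ by exactly one between your two tables while the ends agree, this convention difference is a likely explanation.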
Good point, I had forgotten this.
Thank you both for the quick reply.
I noticed that gene symbols are mostly the same between Ensembl and UCSC, but there are a few differences. Is this because of the difference in chromosomal coordinates?
This could be due to UCSC and Ensembl using a different version of the HGNC data. In your case, I would just work with Ensembl 75 gene IDs and get the gene symbols from Ensembl 75 when needed.
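A rough sketch of what that matching could look like, assuming the probe annotation stores gene symbols as a semicolon-separated string per probe and that you have exported a symbol-to-gene-ID table from Ensembl 75. The probe IDs and the function name are hypothetical, and the two small dictionaries stand in for your real tables.

```python
from collections import defaultdict

# Stand-in for an Ensembl 75 export: gene symbol -> Ensembl gene ID.
symbol_to_ensembl = {
    "TP53": "ENSG00000141510",
    "BRCA1": "ENSG00000012048",
}

# Stand-in for the probe annotation: probe ID -> semicolon-separated symbols.
# Probe IDs are invented; NBR2 is left out of the lookup on purpose.
probe_symbols = {
    "cg_example_1": "TP53;TP53",
    "cg_example_2": "BRCA1;NBR2",
    "cg_example_3": "",
}


def map_probes_to_genes(probe_symbols, symbol_to_ensembl):
    """Group probes by Ensembl gene ID and collect symbols with no match."""
    mapped = defaultdict(set)
    unmatched = set()
    for probe, field in probe_symbols.items():
        # Skip empty entries and collapse repeated symbols for one probe.
        for symbol in set(filter(None, field.split(";"))):
            gene_id = symbol_to_ensembl.get(symbol)
            if gene_id is None:
                unmatched.add(symbol)
            else:
                mapped[gene_id].add(probe)
    return dict(mapped), unmatched


mapped, unmatched = map_probes_to_genes(probe_symbols, symbol_to_ensembl)
```

The symbols that end up in `unmatched` are the ones worth checking by hand, since they may be aliases or older names of genes that Ensembl 75 lists under a different symbol.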