TCGA Methylation Data and Gene Mapping
20 months ago
James ▴ 30

I am looking into the TCGA Methylation data and I wanted to understand how to parse the data, and, ideally, map measured beta values to single Hugo symbols.

My issues are as follows:

1) Some Stable Entity IDs have multiple gene names listed. For example, the breast cancer (BRCA) data contains a row with these values:

Stable Entity ID | Name | Description | Transcript ID
cg00008493 | KIAA1409;COX8C | Body;5'UTR | NM_020818;NM_182971

2) Many Stable Entity IDs map to the same gene. For example, in the attached image, multiple Stable Entity IDs map to DLX5.

For a research project I'd love to associate each gene with a specific methylation value. Put differently, for each patient I want to create a vector where each entry corresponds to the methylation value for a given gene. Is there a principled way to do this?

Methylation Cancer TCGA • 703 views
20 months ago
Basti ★ 2.0k

CpGs may be annotated to more than one gene simply because gene regions overlap on the genome.
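To make such a multi-gene row easier to work with, you could split it into one record per gene. Below is a minimal sketch in plain Python using the row from the question. It assumes the semicolon-separated entries in Name, Description and Transcript ID are positionally aligned (first gene with first region with first transcript), which you should check against your annotation file. The function name `expand_annotation` and the record keys are hypothetical, chosen for illustration only.

```python
def expand_annotation(probe_id, names, regions, transcripts):
    """Split semicolon-delimited annotation fields into one record per entry.

    Assumption: the three fields list their entries in the same order,
    so entry i of each field describes the same gene/transcript.
    """
    genes = names.split(";")
    parts = regions.split(";")
    txs = transcripts.split(";")
    if not (len(genes) == len(parts) == len(txs)):
        raise ValueError(f"Annotation fields differ in length for {probe_id}")
    return [
        {"probe": probe_id, "gene": g, "region": r, "transcript": t}
        for g, r, t in zip(genes, parts, txs)
    ]


# The example row from the question.
records = expand_annotation(
    "cg00008493", "KIAA1409;COX8C", "Body;5'UTR", "NM_020818;NM_182971"
)
for rec in records:
    print(rec)
```

Each record then tells you which gene the CpG is annotated to and in which region (gene body, 5'UTR, ...), which is useful if you later want to restrict to particular regions.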

If you want to associate each gene with a methylation value, you could take the average methylation of all CpGs annotated to each gene. Personally, I am not convinced this would be useful information: not all CpGs across a gene have a functional implication, and most of them are stable between individuals, so you will likely obtain much the same mean methylation for all individuals.
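If you still want to try the averaging, here is a rough sketch for a single patient in plain Python. The beta values and the probe IDs `cg_example_1` and `cg_example_2` are made up for illustration, and `gene_level_means` is a hypothetical helper name. A CpG annotated to several genes contributes to each of them, and probes with a missing beta value are skipped; both are choices you may want to change.

```python
from collections import defaultdict
from statistics import mean


def gene_level_means(probe_to_genes, betas):
    """Average beta values over all probes annotated to each gene.

    probe_to_genes: probe ID -> semicolon-separated gene names
    betas:          probe ID -> beta value for one patient (None if missing)
    """
    per_gene = defaultdict(list)
    for probe, beta in betas.items():
        if beta is None:
            continue  # skip probes without a measurement
        names = probe_to_genes.get(probe, "")
        # Deduplicate in case the same gene is listed more than once for a probe.
        for gene in set(filter(None, names.split(";"))):
            per_gene[gene].append(beta)
    return {gene: mean(values) for gene, values in per_gene.items()}


# Illustrative annotation and beta values (not real TCGA measurements).
probe_to_genes = {
    "cg00008493": "KIAA1409;COX8C",
    "cg_example_1": "DLX5",
    "cg_example_2": "DLX5",
}
betas = {"cg00008493": 0.5, "cg_example_1": 0.25, "cg_example_2": 0.75}

gene_vector = gene_level_means(probe_to_genes, betas)
print(gene_vector)
```

Running this per patient gives one value per gene, i.e. the vector asked for in the question. Given the caveat above, it may be worth checking the variance of each gene's value across patients before using it downstream.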
