Row names of the demo data in the MethylMix package differ from those of the downloaded matrix
6.4 years ago
592348363 • 0

Hi, I downloaded methylation and gene expression data from TCGA with MethylMix. After processing, my matrix toSave$METcancer looks different from METcancer, the small demo dataset from TCGA glioblastoma. The row names of the demo dataset are gene symbols, whereas the row names of my toSave$METcancer have the form "gene symbol---Cluster1".

MethylMixResults <- MethylMix(toSave$METcancer, toSave$GEcancer, toSave$METnormal)

When I then run MethylMix on these matrices, I do not get the correct results. Here is the full code:

library(doParallel) 
library(MethylMix)   
cancerSite <- "CHOL"
setwd("/home_extend/u1419/ cholangiocarcinoma")
targetDirectory <- paste0(getwd(), "/")
cl <- makeCluster(5)
registerDoParallel(cl)
# Downloading methylation data
METdirectories <- Download_DNAmethylation(cancerSite, targetDirectory)
# Processing methylation data
METProcessedData <- Preprocess_DNAmethylation(cancerSite, METdirectories)
# Saving methylation processed data
saveRDS(METProcessedData, file = paste0(targetDirectory, "MET_", cancerSite, "_Processed.rds"))
# Downloading gene expression data
GEdirectories <- Download_GeneExpression(cancerSite, targetDirectory)
# Processing gene expression data
GEProcessedData <- Preprocess_GeneExpression(cancerSite, GEdirectories)
# Saving gene expression processed data
saveRDS(GEProcessedData, file = paste0(targetDirectory, "GE_", cancerSite, "_Processed.rds"))
METProcessedData <- readRDS(paste0(targetDirectory, "MET_", cancerSite, "_Processed.rds"))
res <- ClusterProbes(METProcessedData[[1]], METProcessedData[[2]])
toSave <- list(METcancer = res[[1]], METnormal = res[[2]], GEcancer = GEProcessedData[[1]], 
                      GEnormal = GEProcessedData[[2]], ProbeMapping = res$ProbeMapping)
head(toSave$METcancer)[,1:5]
                 TCGA-3X-AAV9 TCGA-3X-AAVA TCGA-3X-AAVB TCGA-3X-AAVC TCGA-3X-AAVE
A1BG---Cluster1     0.5256615    0.4655586    0.5651652    0.4859289    0.6592752
A1BG---Cluster2     0.1692925    0.5620878    0.4360619    0.0345946    0.1778778
A1CF---Cluster1     0.7549405    0.5252172    0.5765410    0.3852516    0.5877026
A1CF---Cluster2     0.5291152    0.8943637    0.3479129    0.6873616    0.4107920
A1CF---Cluster3     0.6235495    0.9174264    0.7459019    0.7928292    0.6816474
A2BP1---Cluster1    0.7586256    0.6098513    0.7930285    0.7406754    0.7633507
MethylMixResults <- MethylMix(toSave$METcancer, toSave$GEcancer, toSave$METnormal)

I would appreciate any comments.

Best Regards,

Ganxun Li

6.4 years ago
pbpanigrahi ▴ 430

Hi

The cluster suffix comes from the step you use to map the methylation probes to genes:

res <- ClusterProbes(METProcessedData[[1]], METProcessedData[[2]])

This function uses the annotation for Illumina methylation arrays to map each probe to a gene. Then, for each gene, it clusters all of its CpG sites by hierarchical clustering with complete linkage, using Pearson correlation as the distance measure.
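The clustering idea described above can be illustrated with base R on made-up data. This is only a sketch of the general technique (1 minus Pearson correlation as the distance, complete linkage, tree cut at a fixed height), not the actual ClusterProbes implementation; the probe names, the toy beta values, the "GENE" label, and the 0.6 cut height are all assumptions chosen for illustration.

```r
# Toy beta values: 4 hypothetical probes (rows) of one gene across 6 samples.
# cg_a/cg_b follow one pattern, cg_c/cg_d follow a different one.
pattern1 <- c(0.10, 0.20, 0.80, 0.90, 0.15, 0.85)
pattern2 <- c(0.90, 0.10, 0.50, 0.20, 0.70, 0.30)
probes <- rbind(
  cg_a = pattern1,
  cg_b = pattern1 + c(0.02, -0.01, 0.01, -0.02, 0.01, 0.00),
  cg_c = pattern2,
  cg_d = pattern2 + c(-0.01, 0.02, 0.00, 0.01, -0.02, 0.01)
)

# Distance between probes: 1 - Pearson correlation of their sample profiles
d <- as.dist(1 - cor(t(probes)))

# Hierarchical clustering with complete linkage, cut at an assumed height
hc <- hclust(d, method = "complete")
clusters <- cutree(hc, h = 0.6)

# One row per cluster: mean beta value of the probes in that cluster
cluster_means <- rowsum(probes, clusters) / as.vector(table(clusters))
rownames(cluster_means) <- paste0("GENE---Cluster", rownames(cluster_means))

print(clusters)
print(round(cluster_means, 2))
```

With these toy values the two probe pairs end up in separate clusters, which is why a single gene can contribute more than one row to the clustered matrix.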

So for the A1BG gene, the probes fall into two clusters, each grouping probes whose intensity/methylation level pattern is similar. Since Cluster1 and Cluster2 differ in distance and in intensity/methylation level pattern, you can't merge them into a single A1BG value, i.e. you can't always get one methylation level per gene.

In the demo dataset, each gene probably has only one cluster, so the gene names are unchanged.
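To check how many clusters each gene has in a clustered matrix, the "---ClusterN" suffix can be split off the row names. A minimal sketch in base R, using a toy matrix that mimics the row names shown in the question (the values, sample names, and variable names are made up for illustration):

```r
# Toy stand-in for toSave$METcancer: same row-name scheme, made-up values
met <- matrix(
  c(0.53, 0.17, 0.75, 0.53, 0.62, 0.76,
    0.47, 0.56, 0.53, 0.89, 0.92, 0.61),
  ncol = 2,
  dimnames = list(
    c("A1BG---Cluster1", "A1BG---Cluster2", "A1CF---Cluster1",
      "A1CF---Cluster2", "A1CF---Cluster3", "A2BP1---Cluster1"),
    c("sample1", "sample2")
  )
)

# Strip the cluster suffix to recover the gene symbol of each row
gene <- sub("---Cluster[0-9]+$", "", rownames(met))

# Number of clusters per gene
clusters_per_gene <- table(gene)
print(clusters_per_gene)

# Genes represented by exactly one cluster
single_cluster_genes <- names(clusters_per_gene)[clusters_per_gene == 1]
print(single_cluster_genes)
```

On a real matrix, replacing `met` with `toSave$METcancer` should show which genes are represented by several clusters.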

Hope it helps.


Hi pbpanigrahi

Thanks for the response. How can I get a single methylation level for one gene, such as A1BG? Can I calculate the average of the methylation levels of its different clusters?

Best,

Ganxun Li


Taking the average is dangerous, since different clusters can have different methylation levels. Have you looked at the results you are getting in the MethylMixResults object? Can you post its structure:

str(MethylMixResults);
# if it is a data frame then
head(MethylMixResults)
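The caution about averaging can be made concrete with a toy example. The sketch below uses made-up beta values for two hypothetical clusters of one gene that shift in opposite directions between tumor and normal; none of the numbers come from the CHOL data.

```r
# Made-up beta values: rows are clusters of one gene, columns are samples
tumor <- rbind(
  "GENE---Cluster1" = c(0.80, 0.85, 0.78, 0.82),
  "GENE---Cluster2" = c(0.20, 0.15, 0.22, 0.18)
)
normal <- rbind(
  "GENE---Cluster1" = c(0.50, 0.52, 0.48, 0.50),
  "GENE---Cluster2" = c(0.50, 0.48, 0.52, 0.50)
)

# Tumor minus normal, kept separate per cluster
per_cluster_diff <- rowMeans(tumor) - rowMeans(normal)

# Tumor minus normal after averaging the clusters into one gene-level value
gene_level_diff <- mean(colMeans(tumor)) - mean(colMeans(normal))

print(round(per_cluster_diff, 2))
print(round(gene_level_diff, 2))
```

Here each cluster differs clearly between tumor and normal, but the gene-level average shows essentially no difference because the two shifts cancel out.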

Ultimately, MethylMix will give you the features that differentiate tumor from normal. From the feature names you can see which cluster is important to you. E.g. it may turn out that Cluster2 of the A1CF gene is an important feature, in which case you can simply ignore Cluster1 and Cluster3.
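Splitting the feature names back into gene and cluster makes it easy to see which clusters were selected and which can be set aside. The sketch below uses a made-up character vector in place of the real result. As far as I know the driver names are stored in the MethylationDrivers element of the list returned by MethylMix, but treat that as an assumption and check str(MethylMixResults) on your own object first.

```r
# Hypothetical driver features; with a real result this might instead be
# something like MethylMixResults$MethylationDrivers (assumption, verify first)
drivers <- c("A1CF---Cluster2", "A1BG---Cluster1")

# All cluster-level features that went into the analysis (toy list)
all_features <- c("A1BG---Cluster1", "A1BG---Cluster2", "A1CF---Cluster1",
                  "A1CF---Cluster2", "A1CF---Cluster3", "A2BP1---Cluster1")

# Split "GENE---ClusterN" into its gene and cluster parts
parts <- strsplit(drivers, "---", fixed = TRUE)
driver_table <- data.frame(
  feature = drivers,
  gene    = vapply(parts, `[`, character(1), 1),
  cluster = vapply(parts, `[`, character(1), 2),
  stringsAsFactors = FALSE
)
print(driver_table)

# Other clusters of the same genes that were not selected as drivers
same_gene <- sub("---.*$", "", all_features) %in% driver_table$gene
ignored <- setdiff(all_features[same_gene], drivers)
print(ignored)
```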

What is the objective of your study?
