7.1 years ago
mms140130
Hi,
I downloaded methylation data using the TCGA2STAT package in R as follows:
methyl <- getTCGA(disease = "BRCA", data.type = "Methylation", type = "27K")
The result has two components. methyl$dat holds the data:
head(methyl$dat[,1:3])
           TCGA-01-0628-11A-01D-0383-05 TCGA-01-0630-11A-01D-0383-05
cg00000292                   0.79940858                   0.62039417
cg00002426                   0.33900444                   0.18030460
cg00003994                   0.02811930                   0.03607298
cg00005847                   0.60116497                   0.64955777
cg00006414                           NA                           NA
cg00007981                   0.01881682                   0.01803597
and methyl$cpgs holds the gene annotation:
           Gene_Symbol   Chromosome Genomic_Coordinate
cg00000292 ATP2A1        16         28890100
cg00002426 SLMAP         3          57743543
cg00003994 MEOX2         7          15725862
cg00005847 HOXD3         2          177029073
cg00006414 ZNF425;ZNF398 7          148822837
cg00007981 PANX1         11         93862594
The problem is that the same gene appears more than once, with different methylation values, as follows:
A2ML1  0.85332099  0.422268191 0.28015569  0.61441715  0.231997855
A2ML1  0.691462014 0.420426417 0.195839615 0.575344397 0.151897964
A4GALT 0.066524012 0.041965822 0.100817531 0.17217131  0.117686942
A4GALT 0.432681922 0.182219229 0.618835095 0.26247578  0.671077877
A4GNT  0.86171353  0.821129689 0.814334155 0.67838202  0.874795198
I want to build a multiple linear regression,
geneexp_i = alpha + beta1 * CNV_i + beta2 * METH_i + error_i,
where i indexes the gene, so each gene should be counted once with no duplicates. In the methylation file, however, the same gene is duplicated with different methylation data.
What can I do?
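For reference, the model in the question could be fitted with base R's lm() once there is one row per gene. This is only a minimal sketch on simulated data: the data frame gene_df and the simulated coefficients are assumptions for illustration, not values from TCGA.

```r
# Simulated gene-level table: one row per gene (illustrative data only).
set.seed(1)
n_genes <- 50
gene_df <- data.frame(
  CNV  = rnorm(n_genes),   # hypothetical copy-number value per gene
  METH = runif(n_genes)    # hypothetical gene-level methylation (beta value)
)
# Simulated expression with arbitrary, made-up coefficients plus noise.
gene_df$geneexp <- 1 + 0.5 * gene_df$CNV - 2 * gene_df$METH +
  rnorm(n_genes, sd = 0.3)

# geneexp_i = alpha + beta1 * CNV_i + beta2 * METH_i + error_i
fit <- lm(geneexp ~ CNV + METH, data = gene_df)
coef(fit)
```

The fit needs exactly one METH value per gene, which is why the duplicated rows have to be resolved first.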
Those rows are different probes targeting different CpG sites, so they will show different methylation levels. What are the actual probe IDs behind the two A2ML1 rows and the two A4GALT rows in your bottom table?
For A2ML1 I have cg27653134 and cg03490200; for A4GALT they are cg07393322 and cg09744051.
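Since several probes map to the same gene, one possible way to get a single value per gene is to average the probes of each gene within each sample. The sketch below is an assumption rather than a recommendation from this thread: the beta values are toy numbers, the sample names are hypothetical, and averaging may not be appropriate if the probes sit in functionally different regions. Annotations with several symbols (such as ZNF425;ZNF398) would need separate handling.

```r
# Toy beta-value matrix: rows are probes, columns are samples.
# Values and sample names are illustrative, not real TCGA data.
beta <- matrix(
  c(0.85, 0.42, 0.28,
    0.69, 0.42, 0.20,
    0.07, 0.04, 0.10,
    0.43, 0.18, 0.62),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("cg27653134", "cg03490200", "cg07393322", "cg09744051"),
    c("sample1", "sample2", "sample3")
  )
)

# Probe-to-gene annotation, shaped like methyl$cpgs.
cpgs <- data.frame(
  Gene_Symbol = c("A2ML1", "A2ML1", "A4GALT", "A4GALT"),
  row.names = rownames(beta),
  stringsAsFactors = FALSE
)

# How many probes map to each gene?
probes_per_gene <- table(cpgs$Gene_Symbol)

# Collapse to one row per gene: mean over that gene's probes, per sample.
# Sums and non-missing counts are computed separately so NAs are skipped.
gene_symbol <- cpgs[rownames(beta), "Gene_Symbol"]
sums   <- rowsum(beta, group = gene_symbol, na.rm = TRUE)
counts <- rowsum((!is.na(beta)) * 1, group = gene_symbol)
gene_meth <- sums / counts
```

With the real data, beta would be methyl$dat and cpgs would be methyl$cpgs; the resulting gene_meth has one row per gene symbol and can supply the METH term of the regression.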