6.8 years ago
shinobee
I've run into two problems when mapping probe IDs to Entrez IDs: some probe IDs have no corresponding Entrez ID, and several probe IDs can map to the same Entrez ID. How do you solve this?
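BiocManager::install("KEGGdzPathwaysGEO") # biocLite("KEGGdzPathwaysGEO") on Bioconductor releases before 3.8
library(KEGGdzPathwaysGEO)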
#Alzheimer
# Title: Incipient Alzheimer's Disease: Microarray Correlation Analyses
# URL: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1297
# PMIDs: 14769913
# 9 control 7 disease
data(GSE1297)
data <- GSE1297@assayData$exprs
grp <- as.integer(as.factor(GSE1297@phenoData@data$Group)) - 1
library('hgu133a.db') # here use your chip hgu133a.db
g_ids <- mapIds(hgu133a.db, keys=rownames(data), c("ENTREZID"), keytype="PROBEID")
# Many probes share the same Entrez ID
> length(which(table(g_ids)>1))
[1] 4805
> sort(table(g_ids),decreasing = T)[1:3]
g_ids
10730 3514 3077
19 17 13
> g_ids[which(g_ids == 10730)]
201351_s_at 201352_at 209671_x_at 210972_x_at 211902_x_at 213830_at 215524_x_at 215540_at
"10730" "10730" "10730" "10730" "10730" "10730" "10730" "10730"
215769_at 215796_at 216133_at 216191_s_at 216304_x_at 216540_at 217056_at 217063_x_at
"10730" "10730" "10730" "10730" "10730" "10730" "10730" "10730"
217065_at 217143_s_at 217397_at
"10730" "10730" "10730"
# Probes with no mapping to an Entrez ID
> length(which(is.na(g_ids)))
[1] 1165
See if this helps: "How do I map Affymetrix probe IDs to gene symbols in R?" (check the blog page linked from that post).
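For what it's worth, one common convention (not necessarily what the linked post recommends) is to drop the probes without an Entrez ID and then keep a single representative probe per gene, for example the one with the highest mean expression. Below is a minimal base R sketch of that idea. The matrix `expr` and the vector `ids` are made-up toy stand-ins for `data` and `g_ids` from the question, and the probe names and values are hypothetical.

```r
# Toy stand-ins (hypothetical values) for the objects in the question:
# `expr` plays the role of `data`, `ids` the role of `g_ids`.
expr <- matrix(
  c(5.0, 5.2, 4.9,
    8.1, 8.3, 7.9,
    2.0, 2.1, 1.9,
    6.5, 6.4, 6.6),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("p1_at", "p2_at", "p3_at", "p4_at"),
                  c("s1", "s2", "s3"))
)
ids <- c(p1_at = "10730", p2_at = "10730", p3_at = NA, p4_at = "3514")

# 1. Drop probes that have no Entrez ID.
keep <- !is.na(ids)
expr <- expr[keep, , drop = FALSE]
ids  <- ids[keep]

# 2. For each Entrez ID, keep the row index of the probe with the
#    highest mean expression across samples.
avg  <- rowMeans(expr)
best <- vapply(split(seq_along(ids), ids),
               function(i) i[which.max(avg[i])],
               integer(1))

# 3. Build the gene-level matrix, one row per Entrez ID.
collapsed <- expr[best, , drop = FALSE]
rownames(collapsed) <- names(best)
collapsed
```

With the real data you would substitute `data` for `expr` and `g_ids` for `ids`. Picking the most highly expressed probe is only one choice; averaging the probes of each gene (for instance with `avereps()` from limma) or selecting by variance are alternatives, and which one is appropriate depends on the downstream analysis.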