I have cross-posted this question on Stack Bioinformatics. I'll post an update here if it is answered there.
I have downloaded a miRNA expression dataset from NCBI GEO (GSE25631) to study differential gene expression and perform other analyses. As mentioned in GEO, this profiling was performed on the GPL8179 Illumina Human v2 MicroRNA expression beadchip. Accordingly, I installed the illuminaHumanv2.db package to annotate the Illumina probe IDs. However, when I tried to convert the Illumina probe IDs to their corresponding gene symbols, it returned NA for all the probes. Can anybody identify the issue with this approach? Here is the R script I used:
library(GEOquery)
library(limma)
library(illuminaHumanv2.db)

# Enlarge the connection buffer used when reading the series matrix file
Sys.setenv(VROOM_CONNECTION_SIZE = 256000)

# Download the series matrix only, without the GPL platform annotation
data <- getGEO(GEO = "GSE25631",
               destdir = "E:\\GSE25631",
               GSEMatrix = TRUE,
               AnnotGPL = FALSE,
               getGPL = FALSE,
               parseCharacteristics = TRUE)
data <- data$GSE25631_series_matrix.txt.gz

# Expression matrix (probes x samples) and sample titles
raw_intensity <- exprs(data)
samples <- as.character(pData(data)[,"title"])
raw_intensity_log <- log2(raw_intensity)
colnames(raw_intensity_log) <- c(rep("GBM", 82), rep("Normal", 5))

# Look up a gene symbol for each probe ID; probes without a mapping return NA
probe_id <- rownames(raw_intensity_log)
mapping <- data.frame(Gene = unlist(mget(x = probe_id, envir = illuminaHumanv2SYMBOL, ifnotfound = NA)))
head(mapping, 10)
             Gene
ILMN_3166935   NA
ILMN_3166938   NA
ILMN_3166940   NA
ILMN_3166941   NA
ILMN_3166943   NA
ILMN_3166944   NA
ILMN_3166945   NA
ILMN_3166948   NA
ILMN_3166952   NA
ILMN_3166953   NA
The GSE25631_RAW.tar file in the supplementary section has all the probe ID to gene mappings, but I'm curious to know why this approach didn't work.
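One check that might help narrow this down is to count how many of the probe IDs are known to the annotation source at all, since a lookup with ifnotfound = NA returns NA for every key it does not contain. Below is a minimal sketch in base R. The helper name key_overlap and the IDs in toy_keys are hypothetical, made up purely for illustration and not taken from any real annotation package.

```r
# Hypothetical helper: summarise how many probe IDs appear in a set of known keys.
key_overlap <- function(probe_id, known_keys) {
  found <- probe_id %in% known_keys
  list(
    n_probes = length(probe_id),   # total number of probe IDs checked
    n_found  = sum(found),         # how many are present in the key set
    missing  = probe_id[!found]    # the IDs that are absent
  )
}

# Toy stand-in for an annotation package's key set (made-up IDs, illustration only).
toy_keys <- c("PROBE_A", "PROBE_B", "PROBE_C")

# First few probe IDs from the head(mapping, 10) output above.
probe_id <- c("ILMN_3166935", "ILMN_3166938", "ILMN_3166940")

overlap <- key_overlap(probe_id, toy_keys)
print(overlap$n_found)
print(overlap$missing)
```

With the real objects, the second argument could be something like keys(illuminaHumanv2.db), assuming the package exposes the usual AnnotationDbi keys() interface. If n_found comes back as 0, that would suggest the probe IDs on this platform are simply not present in the package, rather than there being a problem with the mget() call itself.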