4.9 years ago
pablojosegiraudi
Hello!
I am working with a GEO series matrix file (GSE48452, platform GPL11532), which corresponds to the HuGene-1_1st Affymetrix Human Gene 1.1 ST array. I would like to annotate the probes, for example with gene symbols, in order to create a data table like the following:
Sample1 Sample2 Sample3 Sample4 Sample5
#CLASS:CANCER case case case case
#CLASS:SEX F F M M F M F M
Gene Symbol
Gene1 -3.06 -2.25 -1.15 -6.64 0.4
Gene2 -1.36 -0.67 -0.17 -0.97 -2.0
Gene3 1.61 -0.27 0.71 -0.62 0.14
Gene4 0.93 1.29 -0.23 -0.74 -2
How can I map the probes to gene symbols while maintaining the order?
I have been using the following R script without success, and I don't know how to proceed.
getGEOdataObjects <- function(x, getGSEobject=FALSE){
  # Load the GEOquery package (library() stops with an error if it is not installed)
  library(GEOquery)
  # Use the getGEO() function to download the GEO data for the id stored in x
  GSEDATA <- getGEO(x, GSEMatrix=TRUE, AnnotGPL=FALSE)
  # Inspect the object by printing a summary of the expression values for the first 2 columns
  print(summary(exprs(GSEDATA[[1]])[, 1:2]))
  # Get the eset object
  eset <- GSEDATA[[1]]
  # Save the objects generated for future use in the current working directory
  save(GSEDATA, eset, file=paste(x, ".RData", sep=""))
  # Check whether we want to return the list object we downloaded from GEO or
  # just the eset object, depending on the getGSEobject argument
  if(getGSEobject) return(GSEDATA) else return(eset)
}
# Store the dataset ids in a vector GEO_DATASETS just in case you want to loop through several GEO ids
GEO_DATASETS <- c("GSE48452")
# Use the function we created to return the eset object
eset <- getGEOdataObjects(GEO_DATASETS[1])
# Inspect the eset object to get the annotation GPL id
eset
# Get the annotation GPL id (see Annotation: GPL11532)
gpl <- getGEO('GPL11532', destdir=".")
Meta(gpl)$title
# Inspect the table of the gpl annotation object
colnames(Table(gpl))
# Get the gene symbol and entrez ids to be used for annotations
Table(gpl)[1:10, c(1, 2, 6, 12)]
dim(Table(gpl))
# Get the gene expression data for all the probes with a gene symbol
geneProbes <- which(!is.na(Table(gpl)$Symbol))
probeids <- as.character(Table(gpl)$ID[geneProbes])
probes <- intersect(probeids, rownames(exprs(eset)))
length(probes)
geneMatrix <- exprs(eset)[probes, ]
inds <- which(Table(gpl)$ID %in% probes)
# Check you get the same probes
head(probes)
head(as.character(Table(gpl)$ID[inds]))
# Create the expression matrix with gene ids
geneMatTable <- cbind(geneMatrix, Table(gpl)[inds, c(1, 2, 6, 12)])
head(geneMatTable)
# Save a copy of the expression matrix as a csv file
write.csv(geneMatTable, paste(GEO_DATASETS[1], "_DataMatrix.csv", sep=""), row.names=TRUE)
Thank you in advance for your help!
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time. Thank you!
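One likely reason the script returns nothing: if the platform table has no column named Symbol, then Table(gpl)$Symbol is NULL, which(!is.na(NULL)) is empty, and no probes are selected. As far as I know, Affymetrix Gene ST platform tables on GEO usually carry a gene_assignment column instead, with fields separated by " // " and alternative assignments separated by " /// ", but please confirm this with colnames(Table(gpl)) for GPL11532. Below is a minimal sketch under that assumption. It uses an invented toy table rather than the real download, and first_symbol, gpl_tab, and expr are hypothetical names standing in for your own objects (Table(gpl) and exprs(eset)).

```r
# Toy stand-in for Table(gpl); the IDs and gene_assignment strings are invented.
gpl_tab <- data.frame(
  ID = c("101", "102", "103"),
  gene_assignment = c(
    "---",
    "ACC_1 // GENE_A // toy description // 1p00 // 1",
    "ACC_2 // GENE_B // toy description // 2p00 // 2 /// ACC_3 // GENE_C // toy description // 3p00 // 3"
  ),
  stringsAsFactors = FALSE
)

# Toy stand-in for exprs(eset); rows are deliberately in a different order
# from the platform table.
expr <- matrix(
  c(1.61, -0.27, 0.93, 0.71, -0.62, 1.29),
  nrow = 3,
  dimnames = list(c("103", "102", "101"), c("Sample1", "Sample2"))
)

# Hypothetical helper: return the symbol (second " // " field) of the first
# assignment in each gene_assignment string, or NA when there is none.
first_symbol <- function(x) {
  x <- as.character(x)
  first <- vapply(
    strsplit(x, " /// ", fixed = TRUE),
    function(p) if (length(p) > 0) p[1] else NA_character_,
    character(1)
  )
  sym <- vapply(
    strsplit(first, " // ", fixed = TRUE),
    function(p) if (length(p) >= 2) trimws(p[2]) else NA_character_,
    character(1)
  )
  sym[sym %in% "---"] <- NA_character_
  sym
}

# match() looks up each expression row in the platform table, so the result
# follows the row order of the expression matrix.
idx <- match(rownames(expr), as.character(gpl_tab$ID))
symbols <- first_symbol(gpl_tab$gene_assignment[idx])

# Keep only the probes that received a symbol.
keep <- !is.na(symbols)
annotated <- data.frame(
  GeneSymbol = symbols[keep],
  expr[keep, , drop = FALSE],
  row.names = rownames(expr)[keep],
  check.names = FALSE,
  stringsAsFactors = FALSE
)
print(annotated)
```

Using match() on the row names of the expression matrix, rather than %in% on the platform table, is what keeps the annotation aligned with the expression rows. Taking only the first assignment is a simplification; probes that map to several genes may need different handling.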