GEO Data Matrix Using R
13.0 years ago
Yogesh Pandit ▴ 520

I want something like,

ID   GB_ACC   Gene Title   Gene Symbol   ENTREZ_GENE_ID   GO_BP   GO_MF   GO_CC   Sample1   Sample2......

along with the corresponding values/IDs in the rows for a specific GSE. The following code gives me only the values for the samples. How can I add the rest of the information to the data matrix in R?

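library(GEOquery)
# 'file' must already hold the path to a local GSE file
gse <- getGEO(filename = file)
# Probe IDs from the first platform (GPL) in the series
probesets <- Table(GPLList(gse)[[1]])$ID
# One column per sample (GSM), with rows aligned to the probe order
data.matrix <- do.call("cbind", lapply(GSMList(gse), function(x) {
    tab <- Table(x)
    mymatch <- match(probesets, tab$ID_REF)
    tab$VALUE[mymatch]
}))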

And how do I restrict it to a few selected samples? Thanks in advance.

geo r • 8.4k views
13.0 years ago
Stephen 2.8k

Use the GSEMatrix argument:

mygeomat <- getGEO("GSE12345", GSEMatrix=TRUE)

See page 13 of the docs.

My words exactly. Just a note that GSEMatrix=TRUE has been the default for almost two years.

"mygeomat" in the answer above will be a list of ExpressionSet objects. Take a look at the Biobase vignette about ExpressionSets for details of subsetting and getting annotation information out.

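To make the subsetting and annotation step concrete, here is a minimal sketch using Biobase accessors (exprs, fData, and column subsetting with [ , ]). It builds a tiny toy ExpressionSet by hand so it runs without downloading anything; the probe IDs, sample names, gene symbols, and values are all made up for illustration. With real data you would start from one element of the list returned by getGEO, for example mygeomat[[1]], and the annotation columns available in fData depend on the platform.

```r
library(Biobase)

# Toy stand-in for one ExpressionSet from getGEO(); every value here is invented.
expr <- matrix(c(7.1, 8.2, 5.5, 6.3, 9.0, 4.4),
               nrow = 2,
               dimnames = list(c("probe_1", "probe_2"),
                               c("GSM1", "GSM2", "GSM3")))

# Hypothetical feature annotation; real column names come from the GPL.
feat <- data.frame(ID = c("probe_1", "probe_2"),
                   `Gene Symbol` = c("GENE_A", "GENE_B"),
                   row.names = c("probe_1", "probe_2"),
                   check.names = FALSE)

eset <- ExpressionSet(assayData = expr,
                      featureData = AnnotatedDataFrame(feat))

# Restrict to selected samples by column (sample) name.
wanted <- c("GSM1", "GSM3")
sub <- eset[, wanted]

# Put the annotation columns next to the expression values, one row per probe.
annotated <- cbind(fData(sub), exprs(sub))
print(annotated)
```

The same three calls (subset with [ , wanted], then fData and exprs) should carry over to a real series; pData(sub) gives the per-sample metadata if you need to choose samples by phenotype rather than by name.
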
13.0 years ago

Via Simon Cockell's Twitter feed, I learned about GEO2R, a web app to analyze gene expression in GEO datasets using R. While this won't address your specific question, it may offer an alternate way to get the result you seek.
