How to export a subset of metadata and expression data from Bioconductor GEOquery?
10.2 years ago
William ★ 5.3k

I am planning to use Bioconductor GEOquery to download a couple of microarray datasets from NCBI GEO.

Then I would like to export a subset of the metadata and the expression data to flat files that I can import elsewhere.

What I have so far is:

library(GEOquery)
library("R.utils")

geo_id <- "GSE45016"
gse <- getGEO(geo_id,GSEMatrix=FALSE)

#show metadata
Meta(gse)

#show metadata for first sample
GSMList(gse)[[1]]

#select specific field from metadata of first sample
GSMList(gse)[[1]]@header$characteristics_ch1

# Result for sample 1
[1] "tissue: normal prostate (NP) epithelial cells"

GSMList(gse)[[2]]@header$characteristics_ch1

# Result for sample 2
[1] "tissue: prostate cancer cells"   "clinical stage: clinical T4N0M1"
[3] "gleason score: GS 9"             "psa level: PSA 5477ng/ml"

As you can see, the number of key-value pairs differs between samples 1 and 2. What I would like to have is an array for every key under

@header$characteristics_ch1

and then the value, or null if the key is missing, for every sample in the GEO dataset:

key_tissue: normal prostate (NP) epithelial cells\tprostate cancer cells
key_psa_level: null\tPSA 5477ng/ml

Other metadata fields like "title" luckily only have a single value beneath them.

GSMList(gse)[[1]]@header$title = "Normal prostate"
GSMList(gse)[[2]]@header$title = "High-grade PC1"

I would also like to have these in an array for the key title.
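One way the per-key arrays described above could be built is sketched here in base R. The function name `characteristics_to_table` and the sample names `sample1` and `sample2` are made up for illustration; the strings are the ones shown in the question. It assumes every entry has the form "key: value" and that a key appears at most once per sample.

```r
# Turn per-sample "key: value" strings into a sample-by-key character matrix.
# `chars` is a named list holding one character vector per sample.
characteristics_to_table <- function(chars) {
  # Split each string at the first colon into a named vector (name = key).
  parsed <- lapply(chars, function(x) {
    keys   <- trimws(sub(":.*$", "", x))
    values <- trimws(sub("^[^:]*:", "", x))
    setNames(values, keys)
  })
  all_keys <- unique(unlist(lapply(parsed, names), use.names = FALSE))
  # Look up every key in every sample; keys a sample lacks become NA.
  mat <- do.call(rbind, lapply(parsed, function(p) unname(p[all_keys])))
  colnames(mat) <- all_keys
  mat
}

# The two samples from the question (sample names are placeholders).
chars <- list(
  sample1 = "tissue: normal prostate (NP) epithelial cells",
  sample2 = c("tissue: prostate cancer cells",
              "clinical stage: clinical T4N0M1",
              "gleason score: GS 9",
              "psa level: PSA 5477ng/ml")
)
meta <- characteristics_to_table(chars)

# Write a tab-separated file, printing missing values as "null".
write.table(meta, file = tempfile(fileext = ".tsv"), sep = "\t",
            quote = FALSE, na = "null", col.names = NA)
```

With a real GSE object, the input list could probably be obtained with something like `lapply(GSMList(gse), function(gsm) Meta(gsm)$characteristics_ch1)`. Single-valued fields such as title can be collected the same way with `sapply`.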

My second question is how to export the expression data that is stored under every sample. I would like to stream through all the probes, get the expression values for each probe across all samples, and write them to another CSV file.
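A minimal sketch of the probe-by-sample export described above, in base R. It assumes each sample's data table has an `ID_REF` column (probe ID) and a `VALUE` column (expression value), which is the usual layout of GEO sample tables but is not guaranteed. The function name `tables_to_matrix` and the toy probe and sample names are invented for illustration.

```r
# Combine per-sample probe tables into one probe-by-sample numeric matrix.
# `tables` is a named list of data frames with columns ID_REF and VALUE.
tables_to_matrix <- function(tables) {
  probes <- unique(unlist(lapply(tables, function(tb) as.character(tb$ID_REF)),
                          use.names = FALSE))
  # Align every sample to the common probe list; absent probes become NA.
  cols <- lapply(tables, function(tb) {
    idx <- match(probes, as.character(tb$ID_REF))
    as.numeric(as.character(tb$VALUE[idx]))
  })
  mat <- do.call(cbind, cols)
  rownames(mat) <- probes
  mat
}

# Toy stand-ins for the per-sample tables.
tables <- list(
  sample1 = data.frame(ID_REF = c("probe_a", "probe_b"), VALUE = c(1.5, 2.0)),
  sample2 = data.frame(ID_REF = c("probe_b", "probe_c"), VALUE = c(3.0, 4.5))
)
expr <- tables_to_matrix(tables)

# One row per probe, one column per sample.
write.csv(expr, file = tempfile(fileext = ".csv"))
```

With the `gse` object from the question, the input list could likely be built with `lapply(GSMList(gse), Table)`.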

R GEO bioconductor • 9.9k views
10.2 years ago
Neilfws 49k

I think that the way you have chosen to read the GSE data into R has created some confusion for you.

Try this instead (note: formatting was lost here so posted as a Gist):

As for exporting the expression data:

exp <- exprs(gse)

returns a matrix with one row per probe and one column per sample, where the column names are the sample names.
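Writing such a matrix to a flat file needs only base R. The sketch below uses a small made-up matrix in place of the real `exprs()` result, so the probe names, sample names, values, and output file name are all placeholders.

```r
# Toy stand-in for an expression matrix: probes in rows, samples in columns.
expr_matrix <- matrix(c(7.1, 8.3, 6.9, 8.0), nrow = 2,
                      dimnames = list(c("probe_a", "probe_b"),
                                      c("sample1", "sample2")))

out_file <- file.path(tempdir(), "expression.csv")

# Row names (probe IDs) are written as the first column.
write.csv(expr_matrix, file = out_file)

# Read the file back to confirm the round trip.
check <- read.csv(out_file, row.names = 1, check.names = FALSE)
```

If the data are held in an ExpressionSet (for example, an element of the list returned by `getGEO(geo_id, GSEMatrix = TRUE)`), the sample metadata can probably be exported the same way from `pData()`.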


Hi Neilfws, how did you write this reply? First, it is hosted on GitHub; second, how do you prepare a post like this on GitHub? Thanks.


Nicely done.
