Gene expression data from GEO
6.2 years ago
Natasha ▴ 40

This is a follow-up to my post over here

I'm collecting gene expression data from microarray studies. How can I restrict a search to studies performed on a specific platform, e.g. the Affymetrix Human Genome U133 Plus 2.0 Array?

Update: How do we filter the data from GSE files? I found a tutorial that shows how to parse GSE files, but all I could get from it was the probe IDs and the expression values for each sample. With a GDS, for instance, one can use Table(gds) and Columns(gds) to get the gene symbols and the sample descriptions. For GSE, it is mentioned that this object class is not available. Could someone help me get the gene symbols and sample descriptions from the expression set that is created from GSE data?

gene GEO microarray • 2.6k views
6.2 years ago

Hey Natasha. I'm terribly sorry that I missed your previous comments in the other question to which you link. I may have been traveling (in fact, I believe that I was abroad on another continent).

In GEO, if you wish to search for diabetes and pancreatic beta cells in Homo sapiens, then the following search terms would work:

diabetes[All Fields] AND pancreatic beta cells[All Fields] AND "Homo sapiens"[porgn:__txid9606]

For those performed with the Affymetrix Human Genome U133 Plus 2.0 Array, you could try this:

GPL570 AND diabetes[All Fields] AND pancreatic beta cells[All Fields] AND "Homo sapiens"[porgn:__txid9606]

GPL570 is the code for this array. I see 11 studies matching these 4 search terms on GEO.
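
As a rough sketch, the same search string can be assembled in R so that the platform code or the keywords are easy to swap. The helper name build_geo_query is hypothetical, and the commented-out rentrez call assumes that package is installed and that network access is available; it is shown only as one possible way to run the query outside the browser.

```r
# Hypothetical helper: assemble a GEO DataSets search string from a platform
# code, free-text terms and an organism filter.
build_geo_query <- function(platform = NULL, terms,
                            organism = "Homo sapiens", taxid = 9606L) {
  parts <- c(
    platform,                                   # e.g. "GPL570"; dropped if NULL
    sprintf("%s[All Fields]", terms),           # one clause per keyword
    sprintf('"%s"[porgn:__txid%d]', organism, as.integer(taxid))
  )
  paste(parts, collapse = " AND ")
}

query <- build_geo_query("GPL570", c("diabetes", "pancreatic beta cells"))
print(query)

# Assumption: with the rentrez package installed, the query could be run as
# res <- rentrez::entrez_search(db = "gds", term = query, retmax = 50)
# res$count   # number of matching records
# res$ids     # their Entrez UIDs
```

Leaving out the platform argument gives the first, platform-independent query.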

---------------------------------------

For other ways to search, see here: Querying GEO DataSets and GEO Profiles

To see all of the different platform codes, take a look here: https://www.ncbi.nlm.nih.gov/geo/browse/?view=platforms&tool=findplatform

Kevin


Hello Kevin, Thanks a lot for the response. I'm parsing the expression values from GDS files in Bioconductor following the tutorial given here. Most of the studies from the above search have reported GSE files. Could you please suggest whether there are tutorials on how to parse data from GSE files in Bioconductor?

Edit: I found a tutorial that shows how to parse GSE files, but all I could get from it was the probe IDs and the expression values for each sample. With a GDS, for instance, one can use Table(gds) and Columns(gds) to get the gene symbols and the sample descriptions. For GSE, it is mentioned that this object class is not available. Could someone help me get the gene symbols and sample descriptions from the expression set that is created from GSE data?


Hey Natasha, so, you can obtain some data but the issue is that the IDs are Affy probe IDs? If that's the case, then there are ways to convert these to gene symbols, here:


Hi Kevin, Yes, these IDs are the Affy probe IDs. Thanks for the informative links.

Could you please suggest how to get the sample descriptions from the expression set created from GSE files?

For instance, using the eset from a GDS file, pData(eset) gives:

             sample                         cell.type
GSM762810 GSM762810                 uncultured Islets
GSM762811 GSM762811                 uncultured Islets
GSM762812 GSM762812                 uncultured Islets
GSM762813 GSM762813                 uncultured Islets
GSM762814 GSM762814 expanded Islet - dedifferentiated
GSM762816 GSM762816 expanded Islet - dedifferentiated
GSM762817 GSM762817 expanded Islet - dedifferentiated
GSM762819 GSM762819 expanded Islet - dedifferentiated
GSM762815 GSM762815 expanded Islet - redifferentiated
GSM762818 GSM762818 expanded Islet - redifferentiated
GSM762820 GSM762820 expanded Islet - redifferentiated

Likewise, is there any syntax that can be used to find the cell.type / sample description?

I tried pData(eset) for the GSE after creating the eset following the instructions given here, but it didn't work.


Oh, I just tried it for that dataset, and it works like this:

library(GEOquery)
gse <- getGEO("GSE30732", GSEMatrix=TRUE)

The ExpressionSet is stored as gse[[1]]:

gse[[1]]
ExpressionSet (storageMode: lockedEnvironment)
assayData: 32968 features, 11 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: GSM762810 GSM762811 ... GSM762820 (11 total)
  varLabels: title geo_accession ... sample origin:ch1 (37 total)
  varMetadata: labelDescription
featureData
  featureNames: 7892501 7892502 ... 8180179 (32968 total)
  fvarLabels: ID GB_LIST ... category (12 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
Annotation: GPL6244 


head(exprs(gse[[1]]))
        GSM762810 GSM762811 GSM762812 GSM762813 GSM762814 GSM762815 GSM762816
7892501  144.1970  154.9120  204.9800  170.7820  160.4320  177.0040  164.8360
7892502   87.3393   66.1895  236.0300   67.5643  118.0320  110.5290  124.1790
7892503   17.6620   25.2553   25.0095   35.4209   34.8682   16.8056   19.0721
7892504  968.6140  552.2480  629.9800  877.5390 1119.0500  395.1390  657.9350
7892505   48.9950   56.7481   42.8267   41.8391   42.8017   52.4028   40.8999
7892506   17.0715   11.3087   87.3541   18.2260   33.5276   33.4526   36.1377
        GSM762817 GSM762818 GSM762819 GSM762820
7892501  160.8070  192.1560   97.9872  256.8820
7892502   93.8661  125.8370  109.0660  131.7860
7892503   19.9689   24.9181   32.5494   31.5395
7892504  934.4180  851.7840  803.8410  736.0790
7892505   58.3426   46.8862   49.7869   50.8334
7892506   29.2909   34.7242   35.1665   33.4917
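
The row names above are probe IDs rather than gene symbols. One possible way to get symbols is to parse the platform annotation held in the featureData, i.e. fData(gse[[1]]). The sketch below assumes the annotation has a gene_assignment column whose entries use " // " between fields and " /// " between alternative assignments, with the symbol as the second field; check names(fData(gse[[1]])) before relying on that. The helper first_symbol is hypothetical, and the toy strings are made-up stand-ins so the example runs without downloading anything.

```r
# Hypothetical helper: take the first gene symbol from an Affymetrix-style
# "gene_assignment" string (assumed layout: accession // symbol // name // ...).
first_symbol <- function(gene_assignment) {
  vapply(as.character(gene_assignment), function(x) {
    # Unannotated probes are assumed to be NA, empty or "---"
    if (is.na(x) || x == "" || x == "---") return(NA_character_)
    first  <- strsplit(x, " /// ", fixed = TRUE)[[1]][1]   # first assignment
    fields <- strsplit(first, " // ", fixed = TRUE)[[1]]   # its fields
    if (length(fields) >= 2) fields[2] else NA_character_
  }, character(1), USE.NAMES = FALSE)
}

# Toy input standing in for fData(gse[[1]])$gene_assignment
toy <- c(
  "ACC0001 // GENEA // example gene A // 1p1 // 111 /// ACC0002 // GENEB // example gene B // 2q2 // 222",
  "---",
  NA
)
symbols <- first_symbol(toy)
print(symbols)

# With the real ExpressionSet (needs GEOquery/Biobase and network access):
# feat <- fData(gse[[1]])
# feat$symbol <- first_symbol(feat$gene_assignment)
```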

pData(gse[[1]])[,c(1,6,8)]
                                         title type              source_name_ch1
    GSM762810     Uncultured Islets- donor-S64  RNA            Uncultured Islets
    GSM762811     Uncultured Islets- donor-G31  RNA            Uncultured Islets
    GSM762812     Uncultured Islets- donor-I18  RNA            Uncultured Islets
    GSM762813     Uncultured Islets- donor-F14  RNA            Uncultured Islets
    GSM762814    S68-expanded islet, untreated  RNA          expanded islet, UTR
    GSM762815 S68_expanded islet, RC treatment  RNA expanded islet, RC treatment
    GSM762816    S63-expanded islet, untreated  RNA          expanded islet, UTR
    GSM762817    I34-expanded islet, untreated  RNA          expanded islet, UTR
    GSM762818 I34-expanded islet, RC treatment  RNA expanded islet, RC treatment
    GSM762819    S66-expanded islet, untreated  RNA          expanded islet, UTR
    GSM762820 S66-expanded islet, RC treatment  RNA expanded islet, RC treatment
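
To get something like the cell.type column of the GDS, one option is to turn source_name_ch1 into a grouping factor. This is only a sketch: the small data frame below is a hand-typed stand-in for three rows of the pData() output above, so that the example runs without downloading anything, and the choice of source_name_ch1 as the grouping column is an assumption that suits this dataset but may not suit others.

```r
# Toy stand-in for pData(gse[[1]])[, c("title", "source_name_ch1")]
pheno <- data.frame(
  title = c("Uncultured Islets- donor-S64",
            "S68-expanded islet, untreated",
            "S68_expanded islet, RC treatment"),
  source_name_ch1 = c("Uncultured Islets",
                      "expanded islet, UTR",
                      "expanded islet, RC treatment"),
  row.names = c("GSM762810", "GSM762814", "GSM762815"),
  stringsAsFactors = FALSE
)

# Use the source name as the sample group
pheno$group <- factor(pheno$source_name_ch1)

# Number of samples per group
group_counts <- table(pheno$group)
print(group_counts)

# Sample accessions belonging to one group
islets <- rownames(pheno)[pheno$group == "Uncultured Islets"]
print(islets)
```

With the real object, pheno would be pData(gse[[1]]), and the accessions in islets could be used to subset the columns of exprs(gse[[1]]).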