I am using GEOquery to download the SOFT files for a number of experiments from NCBI GEO. For example, here is one individual experiment:
> soft <- getGEO('GSE104278', GSEMatrix=TRUE)
> soft
$GSE104278_series_matrix.txt.gz
ExpressionSet (storageMode: lockedEnvironment)
assayData: 0 features, 12 samples
element names: exprs
protocolData: none
phenoData
sampleNames: GSM2793963 GSM2793964 ... GSM2793974 (12 total)
varLabels: title geo_accession ... strain:ch1 (48 total)
varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
pubMedIds: 30456366
Annotation: GPL24048
I want to extract the NCBI addresses for all SRA entries present in this expression set, so I am using the following command:
> sra <- data.frame(soft$GSE104278_series_matrix.txt.gz$relation.1)
> sra
soft.GSE104278_series_matrix.txt.gz.relation.1
1 SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217300
2 SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217301
3 SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217302
4 SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217303
5 SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217304
6 SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217305
7 SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217306
8 SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217307
9 SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217308
10 SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217309
11 SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217310
12 SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217311
However, because I will be doing this on several dozen to several hundred files, I need a way to automate building the sra data frame. I'd like to figure out how to refer to the first "entry" (or column, or whatever the ExpressionSet calls it) automatically, or pull out that entry so I can use it in the command. I'll need a different entry for each experiment I use, so typing them in one at a time is not practical.
I've tried something like this:
> sra <- data.frame(soft[1]$relation.1)
But that yields a data frame with 0 columns and 0 rows.
I've tried this:
> data <- soft[1,]$relation.1
or
> data <- soft[,1]$relation.1
But that yields an error of "incorrect number of dimensions".
I'm sure there's an easy solution, but I'm just not seeing it. Any help would be greatly appreciated. Thanks!
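One likely explanation for the attempts above, sketched in base R: the object returned by getGEO here is a plain named list, so `soft[1]` is still a list (and `$relation.1` on it is NULL), while `soft[[1]]` returns the first element itself regardless of its name. The example below uses a small data frame as a stand-in for the ExpressionSet so that it runs without Bioconductor or a network connection; that stand-in and the helper name `sra_links` are assumptions for illustration, not part of GEOquery.

```r
# Stand-in for the list returned by getGEO(..., GSEMatrix = TRUE): one named
# element per series matrix. ASSUMPTION: a data frame replaces the real
# ExpressionSet so the example runs on its own; `$` on a real ExpressionSet
# reaches the phenoData columns in the same way as shown in the question.
soft <- list(
  GSE104278_series_matrix.txt.gz = data.frame(
    geo_accession = c("GSM2793963", "GSM2793964"),
    relation.1 = c(
      "SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217300",
      "SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX3217301"
    ),
    stringsAsFactors = FALSE
  )
)

# Single brackets return a sub-list of length 1, so there is no `relation.1`
# to find and the result is NULL (hence the empty data frame).
attempt <- soft[1]$relation.1

# Double brackets return the element itself, selected by position, so the
# experiment-specific name never has to be typed.
links <- soft[[1]]$relation.1

# Hypothetical helper: strip the "SRA: " prefix and return a one-column
# data frame of URLs for a single experiment.
sra_links <- function(eset) {
  data.frame(sra = sub("^SRA: ", "", eset$relation.1),
             stringsAsFactors = FALSE)
}

sra <- sra_links(soft[[1]])

# With GEOquery installed, the same pattern could be looped over accessions
# (not run here; needs network access):
# accessions <- c("GSE104278")
# all_sra <- lapply(accessions, function(acc) {
#   sra_links(getGEO(acc, GSEMatrix = TRUE)[[1]])
# })
```

Two caveats: `[[1]]` only takes the first element, so a series spanning several platforms would need a loop over the whole list, and the SRA link may not sit in `relation.1` for every experiment, so it is worth checking the column contents before relying on the name.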
Wow! Thanks so much. That's exactly what I wanted. It makes sense and I understand better how the ExpressionSet is organized.
Glad I could help. Good luck.