Problem with downloading full microarray datasets with getGEO function
3.9 years ago
Microuser • 0

Hello,

I want to download a full dataset whose raw files are in GPR format, but getGEO only retrieves 3 of the 12 samples:

gset <- getGEO('GSEnumber', GSEMatrix = TRUE, getGPL = FALSE)
eset <- exprs(gset[[1]])

The same code works for other datasets, which happen to be in CEL format.

Any help is much appreciated. Thanks.

GEOquery microarray • 1.0k views

Please provide an example GSE number so that I can investigate properly.


Thanks Kevin. It's GSE34810


Posted an answer below

3.9 years ago

This series has two series matrix files, most likely one for each of the 2 different array versions (GPL15078 and GPL15079), so gset[[1]] only holds the samples from the first of them. This is also GenePix data, which is not as well supported as the typical Affymetrix, Agilent, and Illumina chips.
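
One way to check this from R is to look at what getGEO actually returns: with GSEMatrix = TRUE it gives back a list with one ExpressionSet per series matrix file, so indexing with [[1]] only ever sees the first one. A minimal sketch, assuming network access to GEO and that GEOquery and Biobase are installed:

```r
library(GEOquery)
library(Biobase)

gset <- getGEO("GSE34810", GSEMatrix = TRUE, getGPL = FALSE)

length(gset)   # number of series matrix files that were parsed
names(gset)    # one entry per series matrix file

# Number of samples (columns) in each ExpressionSet
sapply(gset, function(es) ncol(exprs(es)))
```

If the per-element counts add up to 12, nothing is missing; the samples are simply split across the list elements.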

You can download the two series matrices and import them separately. After that, you could probably bind them together in some way. To download:

wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE34nnn/GSE34810/matrix/GSE34810-GPL15078_series_matrix.txt.gz
wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE34nnn/GSE34810/matrix/GSE34810-GPL15079_series_matrix.txt.gz

In R:

library(GEOquery)

gset1 <- getGEO(filename = 'GSE34810-GPL15078_series_matrix.txt.gz', GSEMatrix = TRUE)
Parsed with column specification:
cols(
  ID_REF = col_double(),
  GSM855696 = col_double(),
  GSM855697 = col_double(),
  GSM855698 = col_double()
)


gset2 <- getGEO(filename = 'GSE34810-GPL15079_series_matrix.txt.gz', GSEMatrix = TRUE)
Parsed with column specification:
cols(
  ID_REF = col_double(),
  GSM855699 = col_double(),
  GSM855700 = col_double(),
  GSM855701 = col_double(),
  GSM855702 = col_double(),
  GSM855703 = col_double(),
  GSM855704 = col_double(),
  GSM855705 = col_double(),
  GSM855706 = col_double(),
  GSM855707 = col_double()
)
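
For the "bind them together" step, one possible approach is to keep only the probe IDs present in both matrices and column-bind on those. The sketch below uses made-up toy matrices (e1, e2) as stand-ins for exprs(gset1) and exprs(gset2). It assumes the same ID_REF means the same probe on both array versions, which should be checked against the GPL annotation before trusting the result; if the IDs are not comparable, matching on a shared gene annotation would be needed instead.

```r
# Toy stand-ins for exprs(gset1) and exprs(gset2); values are invented.
e1 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3,
             dimnames = list(c("p1", "p2", "p3"), c("A1", "A2")))
e2 <- matrix(c(7, 8, 9, 10, 11, 12), nrow = 3,
             dimnames = list(c("p2", "p3", "p4"), c("B1", "B2")))

# Keep only probe IDs found on both platforms (assumes IDs are comparable)
shared <- intersect(rownames(e1), rownames(e2))

# Column-bind the two matrices in the same probe order
combined <- cbind(e1[shared, , drop = FALSE],
                  e2[shared, , drop = FALSE])

dim(combined)   # shared probes x (samples from both platforms)
```

With the real data, e1 and e2 would be replaced by exprs(gset1) and exprs(gset2). Any platform-related batch effect still has to be dealt with after combining.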

Judging from the boxplots, the data also appear to be already normalised and log2-transformed:

par(mfrow = c(1,2), mar = c(6,2,2,2))
boxplot(exprs(gset1))
boxplot(exprs(gset2), las = 2)
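
As a numeric complement to the boxplots, a rough quantile check can flag whether a matrix looks like it is on a log scale. This is only a heuristic sketch: the function name looks_log2 is hypothetical and the thresholds (99th percentile above 100, or a range above 50) are rule-of-thumb assumptions, not something taken from this dataset. The toy matrices below are simulated.

```r
# Heuristic: intensities on a log2 scale rarely exceed ~16-20,
# whereas raw intensities run into the hundreds or thousands.
looks_log2 <- function(ex) {
  qx <- as.numeric(quantile(ex, c(0, 0.25, 0.5, 0.75, 0.99, 1), na.rm = TRUE))
  needs_log <- (qx[5] > 100) || (qx[6] - qx[1] > 50 && qx[2] > 0)
  !needs_log
}

set.seed(1)
raw_like <- matrix(2^rnorm(200, mean = 8, sd = 2), ncol = 4)  # simulated raw scale
log_like <- log2(raw_like)                                    # same data, log2 scale

looks_log2(raw_like)
looks_log2(log_like)
```

With the real data, the call would be looks_log2(exprs(gset1)) and looks_log2(exprs(gset2)).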

-----------------------

The other option is to retrieve the raw GPR files and normalise them together using limma.
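
A rough sketch of that route follows. It rests on several assumptions that I have not verified for this series: that the supplementary archive is named GSE34810_RAW.tar, that the GPR files inside are gzipped and named with their GSM accession, that the arrays are two-colour, and that R.utils is installed for decompression. Arrays with different layouts generally cannot be read into one object, so the example reads only the nine GPL15079 samples (GSM855699 to GSM855707); the other platform would be handled the same way.

```r
library(GEOquery)
library(limma)

# Download and unpack the supplementary archive (file name is an assumption)
getGEOSuppFiles("GSE34810")
untar("GSE34810/GSE34810_RAW.tar", exdir = "GSE34810/raw")

# Decompress any gzipped GPR files (naming pattern is an assumption)
gz <- list.files("GSE34810/raw", pattern = "\\.gpr\\.gz$",
                 full.names = TRUE, ignore.case = TRUE)
for (f in gz) R.utils::gunzip(f, overwrite = TRUE)

gpr <- list.files("GSE34810/raw", pattern = "\\.gpr$",
                  full.names = TRUE, ignore.case = TRUE)

# Keep one platform at a time; assumes file names start with the GSM ID
gpr_15079 <- gpr[grepl("GSM855699|GSM85570[0-7]", basename(gpr))]

# Read two-colour GenePix output, then background-correct and normalise
RG <- read.maimages(gpr_15079, source = "genepix")
RG <- backgroundCorrect(RG, method = "normexp", offset = 50)
MA <- normalizeWithinArrays(RG, method = "loess")
MA <- normalizeBetweenArrays(MA, method = "Aquantile")
```

If the arrays turn out to be single-channel, read.maimages would need green.only = TRUE and the within-array step would be dropped in favour of normalizeBetweenArrays with method = "quantile".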

Kevin
