5.6 years ago
dmbergau
I am trying to download a number of GSE datasets with GEOquery using code that looks like the following:
library(GEOquery)
library(Biobase)
library(limma)
gset <- getGEO("GSE54884", GSEMatrix = TRUE, getGPL = FALSE)
and I am receiving this error:
Error in open.connection(x, "rb") : HTTP error 403.
Here is my session information:
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows >= 8 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] GEOquery_2.50.5 Biobase_2.42.0 BiocGenerics_0.28.0
This seems to work for other GSE datasets such as:
gse <- getGEO("GSE781",GSEMatrix=FALSE)
My understanding is that HTTP error 403 is an access issue, but I thought all GEO Datasets are open access. Any ideas about what I might be doing wrong?
Same problem here.
Related post is on Bioconductor: https://support.bioconductor.org/p/120475/
Just to be clear, the reported problem here is due to a problem at NCBI, not with GEOquery. Until NCBI resolves its network/hosting issues, access via GEOquery will not work.
Yep, cheers Sean.
I would suggest reporting issues to NCBI GEO staff (geo@ncbi.nlm.nih.gov). I agree that GEO has been spottier recently than it used to be.
This seems to have been happening frequently of late; before that, I seldom ran into the problem. It may help to send an email to the GEO database staff. One workaround is to retry several times, as the request sometimes succeeds.
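The retry suggestion above can be sketched in R roughly as follows. The helper name `retry` and its `attempts` and `wait_seconds` parameters are my own for illustration and are not part of GEOquery. It simply calls a function again after an error, which may or may not help depending on what is failing at NCBI.

```r
# Hypothetical helper (not part of GEOquery): call f() up to `attempts`
# times, pausing `wait_seconds` between failed attempts.
retry <- function(f, attempts = 3, wait_seconds = 5) {
  last_error <- NULL
  for (i in seq_len(attempts)) {
    # Capture an error as a value instead of aborting the loop.
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    message("Attempt ", i, " failed: ", conditionMessage(result))
    if (i < attempts) {
      Sys.sleep(wait_seconds)
    }
  }
  # Every attempt failed: re-raise the last error seen.
  stop(last_error)
}

# Example use with the call from the question (requires GEOquery and network):
# gset <- retry(function() {
#   GEOquery::getGEO("GSE54884", GSEMatrix = TRUE, getGPL = FALSE)
# })
```

If the 403 persists across all attempts, the last error is re-raised unchanged, so the original message is still visible.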
As a workaround for microarray series, manually download the "GSExxxx_series_matrix.txt.gz" file from the GEO Accession page (the "Series Matrix File(s)" link in the "Download Family" section) and pass its local path to the filename argument of getGEO.
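A minimal sketch of that workaround, assuming the series matrix file has already been saved locally. The wrapper name `load_local_series_matrix` and the example path are hypothetical; the only GEOquery call used is `getGEO()` with its `filename` argument, and `getGPL = FALSE` mirrors the call in the question.

```r
# Hypothetical wrapper: parse a manually downloaded series matrix file
# instead of asking GEOquery to fetch it from NCBI.
load_local_series_matrix <- function(path) {
  # Fail early with a clear message if the download is missing.
  if (!file.exists(path)) {
    stop("Series matrix file not found: ", path)
  }
  # getGEO() reads from `filename` rather than contacting GEO.
  GEOquery::getGEO(filename = path, getGPL = FALSE)
}

# Example use (path is an assumption; point it at your own download):
# gset <- load_local_series_matrix("GSE54884_series_matrix.txt.gz")
```

Checking for the file first keeps a missing or misnamed download from being confused with the NCBI access error.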
It seems that the GEO FTP pages for downloading the series matrix files are unavailable, and the GEOquery error has changed to a 404.
Edit: Resolved.