Patch for GEOquery package error: getGEO Error in download.file, cannot open destfile
6.8 years ago

Recently, the getGEO function of the GEOquery package suddenly started throwing the following error:

gse=getGEO("GSE106977",GSEMatrix=T)
#https://ftp.ncbi.nlm.nih.gov/geo/series/GSE106nnn/GSE106977/matrix/
#OK
#Found 2 file(s)
#/geo/series/GSE106nnn/GSE106977/
#Error in download.file(sprintf("https://ftp.ncbi.nlm.nih.gov/geo/series/%s/%s/matrix/%s",  :
 #cannot open destfile 'C:\Users\Marti\AppData\Local\Temp\Rtmp8cEqMH//geo/series/GSE106nnn/GSE106977', reason 'No such file or directory'

After some debugging, I found that the error comes from the hidden function getAndParseGSEMatrices. To work around it, I wrote a patch that fixes the issue.

First, copy this code into RStudio or WordPad:

getAndParseGSEMatrices=function (GEO, destdir, AnnotGPL, getGPL = TRUE)
{
  GEO <- toupper(GEO)
  stub = gsub("\\d{1,3}$", "nnn", GEO, perl = TRUE)
  gdsurl <- "https://ftp.ncbi.nlm.nih.gov/geo/series/%s/%s/matrix/"
  b = getDirListing(sprintf(gdsurl, stub, GEO))
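  # Drop the first listing entry: in the failing run above it is the
  # parent-directory link, not a downloadable file
  b = b[-1]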
  message(sprintf("Found %d file(s)", length(b)))
  ret <- list()
  for (i in seq_along(b)) {
    message(b[i])
    destfile = file.path(destdir, b[i])
    if (file.exists(destfile)) {
      message(sprintf("Using locally cached version: %s",
                      destfile))
    }
    else {
      download.file(sprintf("https://ftp.ncbi.nlm.nih.gov/geo/series/%s/%s/matrix/%s",
                            stub, GEO, b[i]), destfile = destfile, mode = "wb",
                    method = getOption("download.file.method.GEOquery"))
    }
    ret[[b[i]]] <- parseGSEMatrix(destfile, destdir = destdir,
                                  AnnotGPL = AnnotGPL, getGPL = getGPL)$eset
  }
  return(ret)
}
environment(getAndParseGSEMatrices)<-asNamespace("GEOquery")
assignInNamespace("getAndParseGSEMatrices", getAndParseGSEMatrices, ns="GEOquery")

Save the file as GEOpatch.R in your working directory and then, when loading the GEOquery library, source the saved file:

library(GEOquery)
source("GEOpatch.R")

Now it should get going...

gse=getGEO("GSE106977",GSEMatrix=T)
#https://ftp.ncbi.nlm.nih.gov/geo/series/GSE106nnn/GSE106977/matrix/
#OK
#Found 1 file(s)
#GSE106977_series_matrix.txt.gz
#trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE106nnn/GSE106977/matrix/GSE106977_series_matrix.txt.gz'
#Content type 'application/x-gzip' length 32707196 bytes (31.2 MB)
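
For anyone wondering why dropping the first listing entry helps, here is a minimal base-R sketch built only from the console output shown in this post. The variable names are made up for illustration, and the pattern filter at the end is an alternative I am assuming would work, not what the patch or GEOquery itself does.

```r
# How the patch builds the matrix directory URL for a series accession
geo  <- "GSE106977"
stub <- gsub("\\d{1,3}$", "nnn", geo, perl = TRUE)  # "GSE106nnn"
matrix_url <- sprintf("https://ftp.ncbi.nlm.nih.gov/geo/series/%s/%s/matrix/",
                      stub, geo)

# The two entries reported in the failing run ("Found 2 file(s)"):
# the first is the parent-directory link, the second is the real file.
listing <- c("/geo/series/GSE106nnn/GSE106977/",
             "GSE106977_series_matrix.txt.gz")

# Joining destdir with the first entry gives a path inside directories
# that do not exist, which is what download.file complained about.
destdir <- tempdir()
bad_destfile <- file.path(destdir, listing[1])

# Hypothetical alternative to b[-1]: keep only series matrix files by name
matrix_files <- listing[grepl("_series_matrix\\.txt\\.gz$", listing)]
```

Filtering by file name does not depend on the position of the parent-directory link in the listing, but it assumes every file of interest ends in _series_matrix.txt.gz.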

sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] GEOquery_2.40.0      Biobase_2.34.0       BiocGenerics_0.20.0
[4] BiocInstaller_1.24.0

loaded via a namespace (and not attached):
[1] httr_1.3.1      R6_2.2.2        tools_3.3.2     RCurl_1.95-4.10
[5] bitops_1.0-6    XML_3.98-1.9

You have not included the output of sessionInfo(), but I suspect that you are using an outdated version of R/Bioconductor. Updating to the most recent Bioconductor release and R version should fix the problem you are seeing.

As an aside, rather than submitting a "tutorial" on how to patch an unknown version of GEOquery, the recommended approach is to report the problem to the author through the official bug reporting mechanism. For GEOquery, that means opening a new issue on GitHub (which I noticed you did). This way, everyone can benefit from any required code changes. In addition, having users copy and paste code without version information, testing, and build checks sidesteps the significant testing that occurs during the GEOquery and Bioconductor build processes.

Tagging Sean Davis.

Thanks a lot, Martin. I was struggling a lot with this problem, and now I can finally get on with my work.

Upgrading R to the current release version (3.4.) and then installing the matching GEOquery version will fix the problem noted in the original post.

6.7 years ago

Upgrading R and GEOquery will fix the issue noted here.

packageVersion("GEOquery") must be >= 2.48 and maybe you have 2.40.
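
A quick way to check this is sketched below. The 2.48 threshold is simply the figure quoted in this reply and 2.40.0 is the version from the original post's sessionInfo(); the variable names are illustrative only.

```r
# Minimum version quoted in this reply (treat it as the poster's claim)
min_version <- package_version("2.48")

# Version reported in the original post's sessionInfo(), for comparison
reported <- package_version("2.40.0")
needs_update <- reported < min_version

# Check the locally installed copy, if there is one
if (requireNamespace("GEOquery", quietly = TRUE)) {
  installed <- utils::packageVersion("GEOquery")
  message("Installed GEOquery: ", as.character(installed),
          if (installed < min_version) " (older than 2.48)" else " (2.48 or newer)")
} else {
  message("GEOquery is not installed in this library")
}
```

If the installed version is older than the threshold, updating the package alone may not be enough, because the available GEOquery version is tied to the Bioconductor release.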

The same thing was happening to me. You are probably "updating" GEOquery only by calling biocLite.

However, make sure that you have the latest Bioconductor. Follow these instructions:

  1. Quit your R session and start a new session with R --vanilla
  2. Run the command remove.packages("BiocInstaller", lib=.libPaths())
  3. Repeat that command until R says there is no such package.
  4. Run the command source("https://bioconductor.org/biocLite.R")
  5. Run biocValid() to ensure your installed packages are valid for the current version of Bioconductor, and follow the instructions it gives you.*

*Or skip the last step and directly update all the Bioconductor packages by calling biocLite()

Hi, I have the same problem, but I'm afraid I couldn't solve it. What should I do?

> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] GEOquery_2.52.0     Biobase_2.44.0      BiocGenerics_0.30.0 limma_3.40.6       

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.2         xml2_1.2.2         magrittr_1.5       hms_0.5.1         
 [5] tidyselect_0.2.5   R6_2.4.0           rlang_0.4.0        dplyr_0.8.3       
 [9] tools_3.6.0        assertthat_0.2.1   tibble_2.1.3       lifecycle_0.1.0   
[13] crayon_1.3.4       BiocManager_1.30.4 purrr_0.3.2        readr_1.3.1       
[17] tidyr_1.0.0        vctrs_0.2.0        curl_4.2           zeallot_0.1.0     
[21] glue_1.3.1         compiler_3.6.0     pillar_1.4.2       backports_1.1.5   
[25] pkgconfig_2.0.3