Modifying a GET request for article retrieval in R
2.5 years ago

Hi everyone

I am using the R package europepmc (https://cran.r-project.org/web/packages/europepmc/europepmc.pdf) and its function epmc_ftxt to obtain the full texts of some articles given their PMC IDs. However, for many articles I keep getting the following error:

"Request failed [404]. Retrying in 1 seconds... Error in epmc_ftxt("PMC2701033") : Not Found (HTTP 404). Failed to retrieve full text.."

I guess that is because the article does not belong to the Open Access subset. However, I checked, and my university has a license to access that article. So my question is: how can I edit the GET request in the function to tell epmc_ftxt that I can actually access that article? Code below:

    #' This function loads full texts into R. Full texts are in XML format and are
    #' only provided for the Open Access subset of Europe PMC.
    #'
    #' @param ext_id character, PMCID. 
    #'   All full text publications have external IDs starting 'PMC_'
    #'
    #' @export
    #' @return xml_document
    #'
    #' @examples
    #'   \dontrun{
    #'   epmc_ftxt("PMC3257301")
    #'   epmc_ftxt("PMC3639880")
    #'   }
    epmc_ftxt <- function(ext_id = NULL) {
      if (!grepl("^PMC", ext_id))
        stop("Please provide a PMCID, i.e. ids starting with 'PMC'")
      # call api
      req <-
        httr::RETRY("GET",
                    base_uri(),
                    path = paste(rest_path(), ext_id,
                                 "fullTextXML", sep = "/"))
      # check for http status
      httr::stop_for_status(req, "retrieve full text.")
      # load xml into r
      httr::content(req, as = "text", encoding = "utf-8") %>%
        xml2::read_xml()
    }

Well, the API specification does not list any means of authentication (e.g. via tokens), and if authentication were the problem, I would expect a status other than 404 (e.g. 401 or 403). It seems that those full texts are simply not available in this format, because you can't get them with curl either:

Doesn't work (your example):

    curl -X GET --header 'Accept: application/xml' 'https://www.ebi.ac.uk/europepmc/webservices/rest/PMC2701033/fullTextXML'

Works:

    curl -X GET --header 'Accept: application/xml' 'https://www.ebi.ac.uk/europepmc/webservices/rest/PMC2601033/fullTextXML'
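
To check several IDs at once, the two curl calls above can be wrapped in a small script that prints only the HTTP status. This is a minimal sketch, not part of the europepmc package: the helper names `fulltext_status` and `explain_status` are made up for illustration, and the explanations attached to each status code are my reading of the situation rather than anything the API documents.

```shell
#!/bin/sh
# Sketch: report the HTTP status of the fullTextXML endpoint for a PMCID.

# Base URL taken from the curl commands above.
EPMC_REST="https://www.ebi.ac.uk/europepmc/webservices/rest"

# Hypothetical helper: map a status code to a short explanation.
# The wording of each explanation is an assumption, not API documentation.
explain_status() {
  case "$1" in
    200)     echo "full text XML available" ;;
    401|403) echo "authentication or permission problem" ;;
    404)     echo "no full text XML available for this ID" ;;
    *)       echo "unexpected status" ;;
  esac
}

# Hypothetical helper: request the full text but keep only the status code.
# -s silences progress output, -o /dev/null discards the body,
# -w '%{http_code}' prints the numeric status.
fulltext_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    --header 'Accept: application/xml' \
    "${EPMC_REST}/$1/fullTextXML"
}

# Usage (needs network access):
#   for id in PMC2701033 PMC2601033; do
#     code=$(fulltext_status "$id")
#     echo "$id: $code ($(explain_status "$code"))"
#   done
```

If an ID comes back as 404 rather than 401 or 403, sending extra headers from R is unlikely to help, since the server is saying the resource does not exist in this format rather than that access was refused.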