I'm using the biomaRt package in R to pull down Ensembl annotations for a set of Affy probe IDs. The features seem nice, but the service is very slow: a single query of no more than a dozen probes takes 10 minutes or so, and it frequently times out completely:
library('biomaRt')
ensembl = useMart("ensembl")
ensembl = useDataset("hsapiens_gene_ensembl", mart=ensembl)
probePos = getBM(attributes=c("affy_huex_1_0_st_v2",
                              "hgnc_symbol",
                              "chromosome_name",
                              "start_position",
                              "end_position",
                              "strand"),
                 filters="affy_huex_1_0_st_v2",
                 values=probeList,
                 mart=ensembl)
The slow returns have persisted over a week or two, and don't seem dependent on time of day or anything like that. If I add a few more attributes, I get timeouts:
Request to BioMart web service failed. Verify if you are still connected to the
internet. Alternatively the BioMart web service is temporarily down.
So, my questions are:
- Is this a common problem for others?
- Is it likely that the server is overloaded, or are my queries just too big? If the former, are there any mirrors available?
- If the latter, the next step is setting up a local server, I guess. Anyone have experience (good or bad) with that? Is it worth the hassle?
Argh, just in case anyone is reading today: I'm getting serious timeouts as well! Yesterday it seemed really flaky, too. It's a shame, because it is such a good resource!
As I posted below, I sort of solved the problem by breaking my queries up into very small chunks, especially on the fields that return a lot of data. It's a pain to set up the first time, but once you automate it, doing it piecemeal isn't so bad.
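For anyone who wants to try the piecemeal approach, here is a rough sketch of how the chunking could be automated. This is not the exact code I used. The helper name chunked_query and its arguments (fetch, chunk_size, max_tries, wait) are made up for illustration, and the chunk size and retry counts are guesses you would need to tune. It only splits the probe IDs; splitting the attribute list across several queries would need a merge step on top.

```r
# Hypothetical helper (not part of biomaRt): split `ids` into chunks of at most
# `chunk_size`, call `fetch` on each chunk, and row-bind the results.
# A chunk whose call errors is retried up to `max_tries` times.
chunked_query <- function(ids, fetch, chunk_size = 5, max_tries = 3, wait = 2) {
  ids <- unique(ids)
  if (length(ids) == 0) return(NULL)
  chunks <- split(ids, ceiling(seq_along(ids) / chunk_size))
  results <- lapply(chunks, function(chunk) {
    for (attempt in seq_len(max_tries)) {
      res <- tryCatch(fetch(chunk), error = function(e) NULL)
      if (!is.null(res)) return(res)
      if (attempt < max_tries) Sys.sleep(wait)  # brief pause before retrying
    }
    stop("chunk failed after ", max_tries, " attempts: ",
         paste(chunk, collapse = ", "))
  })
  out <- do.call(rbind, unname(results))
  rownames(out) <- NULL
  out
}

# Example use with biomaRt (needs network access, so left commented out).
# probeList is assumed to be a character vector of probe IDs, as in the question.
# library(biomaRt)
# ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# probePos <- chunked_query(probeList, function(chunk) {
#   getBM(attributes = c("affy_huex_1_0_st_v2", "hgnc_symbol",
#                        "chromosome_name", "start_position",
#                        "end_position", "strand"),
#         filters = "affy_huex_1_0_st_v2",
#         values = chunk,
#         mart = ensembl)
# })
```

Passing the query in as a function keeps the chunking and retry logic separate from biomaRt itself, so the same helper can wrap a different getBM call for each group of attributes.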