As we can't query attributes from multiple pages in one call, we need to loop through the pages. Otherwise biomaRt returns this error:
Querying attributes from multiple attribute pages is not allowed. To see the attribute pages attributes belong to, use the function attributePages.
We have in total 2579 attributes:
# get attributes
x <- listAttributes(ensembl)
table(x$page)
#  feature_page  homologs  sequences  snp  snp_somatic  structure
#           195      2219         55   38           38         34
dim(x)
# [1] 2579    3
Loop through the pages and get the attributes. Getting all attributes from all pages without a filter will not work; we might as well download the whole BioMart. But we can make it work by using filters. For example, below I am querying hgnc_symbol == foxp2, and I am only querying one attribute per page, i[1]:
res <- lapply(split(x$name, x$page), function(i){
  # what you want... but will not work
  # getBM(attributes = i, mart = ensembl)

  # but could work with filters. Here we are getting one attribute per page for one gene
  getBM(attributes = i[ 1 ], filters = "hgnc_symbol", values = "foxp2", mart = ensembl)
})
The result for one gene and one attribute per page is already 7 Mb. Even if the query for all pages and all attributes did work, I doubt the result would fit in the memory of an average PC.
print(object.size(res), units = "Mb")
# 7 Mb
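The loop above takes only the first attribute of each page. A minimal sketch of how more attributes could be requested, a few at a time and never mixing pages, is shown below. The helper name batch_by_page and the batch size are my own assumptions, not part of biomaRt, and the toy data frame only stands in for the output of listAttributes(ensembl). Some pages (sequences, for example) have their own constraints, so individual batches may still fail.

```r
# Hypothetical helper (not part of biomaRt): split attribute names into
# per-page batches of at most `batch_size`, so that every getBM() call
# stays within a single attribute page.
batch_by_page <- function(attrs, batch_size = 3) {
  by_page <- split(attrs$name, attrs$page)
  lapply(by_page, function(nms) {
    nms <- unique(nms)
    # group consecutive names: 1,1,1,2,2,2,... for batch_size = 3
    split(nms, ceiling(seq_along(nms) / batch_size))
  })
}

# Toy stand-in for listAttributes(ensembl), which has columns
# name, description and page.
attrs <- data.frame(
  name = c("ensembl_gene_id", "hgnc_symbol", "chromosome_name",
           "start_position", "5utr", "cdna"),
  page = c(rep("feature_page", 4), rep("sequences", 2)),
  stringsAsFactors = FALSE
)

batches <- batch_by_page(attrs, batch_size = 3)
str(batches)

# With a live connection (not run here; assumes `ensembl` is a Mart object
# and x <- listAttributes(ensembl) as above):
# res <- lapply(batch_by_page(x), function(page_batches) {
#   lapply(page_batches, function(b) {
#     tryCatch(
#       getBM(attributes = b, filters = "hgnc_symbol",
#             values = "foxp2", mart = ensembl),
#       error = function(e) NULL  # keep going if one batch is rejected
#     )
#   })
# })
```

Keeping the batches small and filtered on one gene keeps each request light; results from a page could then be merged on a shared ID column if every batch includes one.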
Here's my best guess. It runs into problems, as Ensembl seems to have set limits on the number of attributes you can pick:
Based on the vignette:
Does not work; I got this error:
Right, I was too fast, but we'll have an answer ready soon :)