I have tried several ways, without success, to retrieve the current list of Ensembl gene IDs, together with the gene symbols, for protein-coding genes only, using the R package biomaRt. Here is the code:
library(biomaRt)
ensembl = useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
results <- getBM(attributes=c("ensembl_gene_id","gene_biotype"),filters = c("ensembl_gene_id","biotype"), values=list("protein_coding"), mart=ensembl)
results
The error message is:
Error in names(values) <- filters :
'names' attribute [2] must be the same length as the vector [1]
I also need the gene symbol for each Ensembl gene ID (for example, TSPAN6). I tried adding "hgnc_id" to both the attributes vector and the filters vector, but got an error message similar to the one shown above. What should I do to accomplish this? Many thanks for any comments.
In attributes you have gene_biotype, while in filters you have biotype. Maybe that has something to do with the error?
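For what it's worth, the error text itself seems to point at a length mismatch rather than a naming one: the call passes two filters ("ensembl_gene_id" and "biotype") but only one entry in values, and the failing statement, names(values) <- filters, needs both to have the same length. Below is a minimal sketch of that reading. The first part reproduces the message in base R without touching Ensembl. The second part is a guess at what the query could look like with a single filter; fetch_protein_coding is a hypothetical helper name, and "hgnc_symbol" as the attribute for gene symbols (rather than "hgnc_id") is an assumption worth checking against listAttributes(ensembl).

```r
# Part 1: base-R reproduction of the error (no network needed).
filters <- c("ensembl_gene_id", "biotype")  # two filters ...
values  <- list("protein_coding")           # ... but only one value
err <- tryCatch(
  {
    names(values) <- filters  # the statement named in the error message
    NULL
  },
  error = function(e) conditionMessage(e)
)
print(err)

# Part 2: sketch of a query that uses one filter with one value.
# fetch_protein_coding is a hypothetical helper name. Calling it requires
# the biomaRt package and network access to Ensembl.
fetch_protein_coding <- function() {
  ensembl <- biomaRt::useMart(
    biomart = "ensembl",
    dataset = "hsapiens_gene_ensembl"
  )
  biomaRt::getBM(
    # hgnc_symbol is assumed to be the gene symbol attribute (e.g. TSPAN6)
    attributes = c("ensembl_gene_id", "hgnc_symbol", "gene_biotype"),
    filters    = "biotype",
    values     = "protein_coding",
    mart       = ensembl
  )
}
```

Calling results <- fetch_protein_coding() and then head(results) should show one row per gene if the assumptions hold. If you really do want to filter on both gene ID and biotype, values would need one list entry per filter, in the same order as filters.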