Hi,
I am trying to extract Gene Ontology terms from the Ensembl Plants BioMart using the following code:
from pybiomart import Server
server = Server(host='http://plants.ensembl.org')
# print(server.list_marts())  # available marts
mart = server['plants_mart']  # connect to plants_mart
# print(mart.list_datasets())  # available datasets
dataset = mart['zmays_eg_gene']
print(dataset.query(attributes=['go'], filters={"transcript_id": ['Zm00001d000475_T001']}))
but I get this error:
pybiomart.base.BiomartException: Query ERROR: caught BioMart::Exception::Usage: WITHIN Virtual Schema : default, Dataset zmays_eg_gene NOT FOUND
even though 'zmays_eg_gene' was listed in mart.list_datasets().
You can change the attributes passed to getBM (I usually request go and goslim), and then use the write.table function to save the result to a file if you want to leave R and go back to working in Python.
Thanks a lot, worked perfectly with no complications:
library(biomaRt)
host="plants.ensembl.org"
mysets<-listDatasets(useMart("plants_mart", host = host))
myusemart <- useDataset("zmays_eg_gene", mart = useMart("plants_mart", host = host))
resultTable <- getBM(attributes = c("go_id", "ensembl_gene_id"), mart = myusemart, filters = "ensembl_gene_id", values = c("Zm00001d002426"), uniqueRows = FALSE)
# to get more info about available attributes and filters
listFilters(mart = myusemart, what = c("name", "description"))
listAttributes(myusemart)
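If you do save the table from R and continue in Python, reading it back could look roughly like the sketch below. It assumes the file was written tab-separated with a header, unquoted and without row names (for example write.table(resultTable, "go_table.tsv", sep = "\t", quote = FALSE, row.names = FALSE), where the file name is made up for illustration). The rows in the example are placeholders, not real annotations, and go_terms_by_gene is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict


def go_terms_by_gene(handle):
    """Collect GO IDs per gene from a tab-separated table.

    Expects the two columns requested in the getBM call:
    go_id and ensembl_gene_id. Rows with an empty go_id are skipped.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    terms = defaultdict(list)
    for row in reader:
        if row["go_id"]:
            terms[row["ensembl_gene_id"]].append(row["go_id"])
    return dict(terms)


# Placeholder content standing in for the exported file; the GO IDs
# below are illustrative only and are not real annotations of this gene.
exported = io.StringIO(
    "go_id\tensembl_gene_id\n"
    "GO:0000001\tZm00001d002426\n"
    "GO:0000002\tZm00001d002426\n"
    "\tZm00001d002426\n"
)

terms = go_terms_by_gene(exported)
print(terms)
```

With a real file you would pass open("go_table.tsv", newline="") instead of the StringIO object.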
In pybiomart, even though you give the host as 'plants.ensembl.org', the virtual schema is still set to "default". That is why it can't find 'zmays_eg_gene'.
I'm not sure whether you can change the virtual schema in pybiomart, but you can do the same thing with the bioservices package in Python:
import pandas as pd
from bioservices import BioMart as biomart
s = biomart('plants.ensembl.org')
# list the marts registered on this host
services = [x for x in s.registry()]
print(pd.DataFrame(services))
dataset = 'zmays_eg_gene'
filt = 'transcript_id'
# build the XML query
s.new_query()
s.add_dataset_to_xml(dataset)
s.add_filter_to_xml(filt, value='Zm00001d000475_T001')
s.add_attribute_to_xml('go')
xmlq = s.get_xml()
print(xmlq)
# the generated XML uses the "default" virtual schema; point it at plants_mart
xmlq = xmlq.replace('virtualSchemaName = "default"',
                    'virtualSchemaName = "plants_mart"')
print(xmlq)
res = s.query(xmlq)
print(res)
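The string replacement above only changes one attribute of the BioMart XML query, so the same query can also be built directly with the virtual schema set from the start. The following is a minimal sketch using only the standard library. The function names are hypothetical, and the /biomart/martservice path and the formatter/header/uniqueRows values are assumptions based on the usual BioMart query layout rather than anything confirmed for this server. It only builds the request and does not send it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_query_xml(dataset, filters, attributes, virtual_schema="plants_mart"):
    """Return a BioMart XML query string with an explicit virtual schema."""
    query = ET.Element("Query", {
        "virtualSchemaName": virtual_schema,
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "0",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


def martservice_url(host, xml_query):
    """Build the request URL (assumed endpoint: /biomart/martservice)."""
    return "http://%s/biomart/martservice?%s" % (host, urlencode({"query": xml_query}))


xml_query = build_query_xml(
    dataset="zmays_eg_gene",
    filters={"transcript_id": "Zm00001d000475_T001"},
    attributes=["go"],
)
url = martservice_url("plants.ensembl.org", xml_query)
print(xml_query)
print(url)
```

The resulting URL could then be fetched with urllib.request.urlopen or requests, and the XML string could equally be passed to s.query() from the bioservices example.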
Thanks a lot, worked perfectly with no complications.
Thanks