I am trying to retrieve protein information from Ensembl Bacteria using InterPro IDs. I have written some R code that connects to the public MySQL server, but I can't figure out how to get from the IDs to the protein information programmatically. I have attached a picture of approximately what I want:
This is my code so far:
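library(tidyverse)
library(RMySQL)
# Connect anonymously to the public Ensembl Genomes MySQL server
con <- dbConnect(MySQL(), host = "mysql-eg-publicsql.ebi.ac.uk",
                 user = "anonymous", password = "",
                 port = 4157)
# List all databases on the server and keep the bacteria ones
ds <- dbGetQuery(con, "SHOW DATABASES")
dim(ds)
bacteria <- grep("bacteria", ds$Database, value = TRUE)
# Switch to one collection database and list its tables
dbExecute(con, "USE bacteria_0_collection_core_47_100_1")
dbGetQuery(con, "SHOW TABLES")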
You may need to ask Ensembl support whether the MySQL database contains the information you are looking for.
This sort of query may be what you need: http://bacteria.ensembl.org/Multi/Search/Results?species=all;idx=;q=IPR000562;site=ensemblunit
As per my comment to @GenoMax below, it is unclear what information you are trying to fetch, and from what starting point.
For instance, if you start from an Ensembl ID, say SAMN02982918_2340, the approach suggested by Aleena looks sensible. If you are instead starting from an InterPro ID, as @GenoMax suggested, a REST API endpoint would not be enough, and you would be better off with SQL or the API. The SQL statement below might be an (early) starting point.
Finally, if you want to go down the database route, as in the R code above, please bear in mind that there are currently (Ensembl 109/56) 128 bacteria collection databases, each hosting a number of species/strains to look into.
Happy to support and advise, but I would need some clarification about the issue you are dealing with.
Thank you all for your help. I was able to figure it out on my own. If anyone is curious, do let me know. The REST API is not suited to my purpose.
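The statement referred to above is not reproduced in this thread. As a stand-in, here is a minimal sketch in R of what such a query could look like, assuming the standard Ensembl core schema (tables `interpro`, `protein_feature`, `translation`, `transcript`, `gene`) and that `interpro.id` matches `protein_feature.hit_name`. The helper name `build_interpro_query` is made up for illustration, and the column selection is only a guess at what "protein information" means here.

```r
# Hypothetical helper (not from the thread): build a query that lists the
# genes and translations annotated with one InterPro accession.
# Assumes the standard Ensembl core schema; check the table and column
# names against SHOW TABLES / DESCRIBE on the server before relying on it.
build_interpro_query <- function(interpro_id) {
  # The accession is pasted into the SQL text, so validate its shape first
  # ("IPR" followed by six digits) to rule out injected SQL.
  if (!grepl("^IPR[0-9]{6}$", interpro_id)) {
    stop("Expected an InterPro accession such as IPR000562")
  }
  sprintf(paste(
    "SELECT DISTINCT g.stable_id AS gene_id,",
    "       tl.stable_id AS translation_id,",
    "       g.description, pf.hit_name, pf.seq_start, pf.seq_end",
    "FROM interpro i",
    "JOIN protein_feature pf ON pf.hit_name = i.id",
    "JOIN translation tl ON tl.translation_id = pf.translation_id",
    "JOIN transcript tr ON tr.transcript_id = tl.transcript_id",
    "JOIN gene g ON g.gene_id = tr.gene_id",
    "WHERE i.interpro_ac = '%s'",
    sep = "\n"), interpro_id)
}

# Print the generated SQL for inspection
cat(build_interpro_query("IPR000562"), "\n")

# With an open connection pointing at a collection database, it could be run as:
# hits <- dbGetQuery(con, build_interpro_query("IPR000562"))
```

The query only covers whichever database is currently selected, so the result would need to be collected per collection database.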
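To illustrate what looking into many collection databases might involve, here is a rough sketch in R that picks the collection core databases for one release out of the `SHOW DATABASES` listing. The helper `select_collection_dbs` is hypothetical, and the naming pattern is an assumption inferred from the single database name used in the question (`bacteria_0_collection_core_47_100_1`); the example names are made up to follow that pattern.

```r
# Hypothetical helper (not from the thread): keep only the bacteria
# collection core databases belonging to one release.
# Assumed name layout, inferred from "bacteria_0_collection_core_47_100_1":
#   bacteria_<n>_collection_core_<EG release>_<Ensembl release>_<version>
select_collection_dbs <- function(db_names, eg_release, ensembl_release) {
  pattern <- sprintf("^bacteria_[0-9]+_collection_core_%d_%d_[0-9]+$",
                     eg_release, ensembl_release)
  grep(pattern, db_names, value = TRUE)
}

# Offline illustration with made-up names
example_names <- c("bacteria_0_collection_core_47_100_1",
                   "bacteria_1_collection_core_47_100_1",
                   "bacteria_0_collection_core_46_99_1",
                   "bacteria_0_collection_funcgen_47_100_1")
print(select_collection_dbs(example_names, 47, 100))

# Against the live server, the loop could look roughly like this, where
# `ds` is the SHOW DATABASES result and `sql` is whatever query you settle on:
# results <- list()
# for (db in select_collection_dbs(ds$Database, 47, 100)) {
#   dbExecute(con, paste("USE", db))
#   results[[db]] <- dbGetQuery(con, sql)
# }
```

Running one query per database keeps each result tied to its collection, which may help later when mapping hits back to species or strains.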
Please post your solution as an "answer" to provide closure to this thread. If the comment by @sgiorgetti helped solve the problem then I can move that comment to an answer. Accept one or more of these answers.