I started a package for just this purpose yesterday. It is available from CRAN, though functionality is still a bit limited today:
library(rrdf)
m1 = load.rdf("one.rdf")
m2 = load.rdf("two.rdf")
m3 = combine.rdf(m1, m2)
summarize.rdf(m3)
sparql.rdf(m3, "SELECT ?s ?p { ?s ?p ?o }")
It wraps Jena and uses rJava to interface with it.
There is in fact also a Bioconductor package called Rredland.
Because the rrdf package now also supports SPARQL queries against remote databases, you can also do (following this BioStar answer):
library(rrdf)
endpoint = "http://rdf.farmbio.uu.se/chembl/sparql"
query = "
SELECT ?organism ?instance
WHERE {
?instance a <http://rdf.farmbio.uu.se/chembl/onto/#Target> ;
<http://rdf.farmbio.uu.se/chembl/onto/#organism> ?organism .
}
";
data = sparql.remote(endpoint, query)
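When trying a query against a remote endpoint it can help to cap the number of results first. Below is a minimal sketch that only builds the query string in base R and appends a standard SPARQL LIMIT clause. The helper name build_target_query and its limit argument are hypothetical and not part of rrdf.

```r
# Hypothetical helper (not part of rrdf): returns the query text from the
# example above with a LIMIT clause appended.
build_target_query <- function(limit = 10) {
  stopifnot(is.numeric(limit), length(limit) == 1, limit > 0)
  paste(
    "SELECT ?organism ?instance",
    "WHERE {",
    "  ?instance a <http://rdf.farmbio.uu.se/chembl/onto/#Target> ;",
    "    <http://rdf.farmbio.uu.se/chembl/onto/#organism> ?organism .",
    "}",
    sprintf("LIMIT %d", as.integer(limit)),
    sep = "\n"
  )
}

query <- build_target_query(5)
cat(query, "\n")

# The string can then be passed on exactly as before:
# data = sparql.remote(endpoint, query)
```

Building the string separately makes it easy to inspect the query before sending it to the endpoint.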
As of version 1.4 you can also use one of the SPARQL variables as values for the row names. For example, to get a single column with the protein names as row names, you do:
query = "
SELECT ?organism ?title
WHERE {
?instance a <http://rdf.farmbio.uu.se/chembl/onto/#Target> ;
<http://purl.org/dc/elements/1.1/title> ?title ;
<http://rdf.farmbio.uu.se/chembl/onto/#organism> ?organism .
}
";
data = sparql.remote(endpoint, query, rowvarname="title")
This results in an R matrix like:
organism
Maltase-glucoamylase "Homo sapiens"
Sulfonylurea receptor 2 "Homo sapiens"
Voltage-gated T-type calcium channel alpha-1H subunit "Homo sapiens"
Dihydrofolate reductase "Escherichia coli (strain K12)"
Tyrosine-protein kinase ABL "Homo sapiens"
DNA-directed RNA polymerase beta chain "Escherichia coli (strain K12)"
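Since the result is an ordinary character matrix with row names, plain base R indexing works on it. The sketch below does not call rrdf at all: it rebuilds a small stand-in matrix by hand from three of the rows shown above (an assumption about the exact shape of the returned object) and shows a lookup by row name and a count per organism.

```r
# Hand-built stand-in for the matrix returned by
# sparql.remote(endpoint, query, rowvarname="title");
# values are copied from the sample output above.
data <- matrix(
  c("Homo sapiens",
    "Homo sapiens",
    "Escherichia coli (strain K12)"),
  ncol = 1,
  dimnames = list(
    c("Maltase-glucoamylase",
      "Sulfonylurea receptor 2",
      "Dihydrofolate reductase"),
    "organism"
  )
)

# Look up the organism of one protein by its row name
data["Dihydrofolate reductase", "organism"]

# Count targets per organism
counts <- table(data[, "organism"])
counts

# Names of all human targets
human <- rownames(data)[data[, "organism"] == "Homo sapiens"]
human
```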
Your first link connects to the Swedish version of Wikipedia. For the English version: http://en.wikipedia.org/wiki/Resource_Description_Framework
Nice, we all speak Swedish RDF :D http://www.youtube.com/watch?v=9OfsABOGw3c&feature=related
Sorry, you lost me... Swedish RDF?
Oh, crap... OK, fixing... stupid, we're-so-smart-we-know-where-you-live websites... :(
Ah! Sorry about that; fixed now.