Problem To Transform An OWL File Related To Cell Line Ontology Using An XSLT
13.4 years ago

I am trying to parse the owl file called "CLO_v2.0.27.owl" available at the bottom of this webpage http://bioportal.bioontology.org/ontologies/45376.

I want to extract the names of cell lines that are subClassOf CLO_0000019 and that are embedded in this kind of element:

<owl:Class rdf:about="&obo;CLO_0008089">
    <rdfs:label>NCI-H460</rdfs:label>
    <rdfs:subClassOf rdf:resource="&obo;CLO_0000019"/>
    <rdfs:seeAlso>ATCC: HTB-177</rdfs:seeAlso>
</owl:Class>

To do that, I have this XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
xmlns:owl="http://www.w3.org/2002/07/owl#" 
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" 
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/">

<xsl:output method="text"/>
<xsl:template match="/">
    <xsl:apply-templates select="//owl:Class[rdfs:subClassOf[@rdf:resource = '*CLO_0000019']]"/>
</xsl:template>

<xsl:template match="owl:Class">
    <xsl:value-of select="./rdfs:label"/>
        <xsl:if test="not(position()=last())">
            <xsl:text>
</xsl:text>
        </xsl:if>
</xsl:template>

</xsl:stylesheet>

But unfortunately my text file is empty. And if I change a single line (see below), I get a text file that is filled in, but with items I am not interested in.

<xsl:apply-templates select="//owl:Class[rdfs:subClassOf[@rdf:resource]]" />

So if there are XSLT masters who could correct my newbie mistake, I would appreciate it very much.

I am using the evaluation version of the XMLSpy software.

xml • 3.6k views
ADD COMMENT
1

BTW: the range of rdfs:seeAlso is a resource, not a literal. So (I know that's not your fault) the 'rdfs:seeAlso' statement in your example seems wrong to me.

ADD REPLY
0

Thanks Mélanie !

ADD REPLY
3
13.4 years ago

Apologies for not being able to resist giving the SPARQL version, as done in R with rrdf...

library(rrdf)

clo = load.rdf(file="CLO.owl")
# never mind the warnings
summarize.rdf(clo)

sparql = "
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  PREFIX obo: <http://purl.obolibrary.org/obo/>

  SELECT ?line ?label WHERE {
    ?line rdfs:subClassOf obo:CLO_0000019 ;
          rdfs:label ?label .
  }
"

lines = sparql.rdf(clo, sparql)

Output looks like:

> lines[1:3,]
      line              label         
 [1,] "obo:CLO_0002076" "C1.18.4"     
 [2,] "obo:CLO_0006202" "IGF104/90"   
 [3,] "obo:CLO_0002345" "CDR2"
ADD COMMENT
0

Egon, your answer is very welcome. Thanks a lot.

ADD REPLY
2
13.4 years ago

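Is this what you need? I'm not sure what you want for your output. Your select statement was wrong: AFAIK, there are no wildcards in XSLT string comparisons, so '*CLO_0000019' is compared literally and never matches. The &obo; entity expands to http://purl.obolibrary.org/obo/, so the full IRI has to be used.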

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:owl="http://www.w3.org/2002/07/owl#" 
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">

<xsl:output method="text"/>
<xsl:template match="/">
    <xsl:for-each select="//owl:Class[rdfs:subClassOf/@rdf:resource='http://purl.obolibrary.org/obo/CLO_0000019']">
        <xsl:value-of select="@rdf:about"/>
        <xsl:text>  </xsl:text>
        <xsl:value-of select="rdfs:label"/>
        <xsl:text>  </xsl:text>
        <xsl:value-of select="rdfs:seeAlso"/>
        <xsl:text>
</xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
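
If the intent behind the '*CLO_0000019' pattern was "any resource IRI ending in CLO_0000019", a suffix test can be emulated in XPath 1.0, which has no ends-with() function. The following is only a sketch under assumptions: the `suffix` variable name is made up for illustration, and it assumes the expanded rdf:resource values end in /CLO_0000019 as in the example element from the question. It prints one label per line.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

<xsl:output method="text"/>

<!-- Hypothetical helper variable: the IRI suffix to look for. -->
<xsl:variable name="suffix" select="'/CLO_0000019'"/>

<xsl:template match="/">
    <!-- Emulated ends-with(): take the last string-length($suffix) characters
         of @rdf:resource and compare them to $suffix. A subClassOf element
         without an rdf:resource attribute yields an empty string and is skipped. -->
    <xsl:for-each select="//owl:Class[rdfs:subClassOf[
        substring(@rdf:resource,
                  string-length(@rdf:resource) - string-length($suffix) + 1) = $suffix]]">
        <xsl:value-of select="rdfs:label"/>
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

A looser test such as contains(@rdf:resource, 'CLO_0000019') would also work, but it could match longer identifiers that merely contain that string, so the exact IRI comparison above remains the safer choice when the namespace is known.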
ADD COMMENT
1
13.4 years ago

I am wondering why you prefer to use XSLT to get the subclasses? Wouldn't it be easier and more reliable to load the file and query it (using Jena or the OWLAPI), or even try and see if the resource is available via a SPARQL endpoint? This would for example allow you to load resources imported by the CLO.

ADD COMMENT
2

Well my answer was you should try and avoid XSLT, as this will most probably get you in trouble in the long run :) For example, you are using [?] as a template, but it would be perfectly valid OWL to write [?] [?] [?]

The CLO developers use Protege to edit their file, and AFAIK there is no way to predict which serialization it will produce.

ADD REPLY
2

You may also want to consider that CLO_0000019 is a defined class, which means a reasoner should be run before trying to get subclasses if one wishes to get asserted and inferred subclasses. Using XSLT is your choice of course, I was just trying to point to some limitations and other options available.

ADD REPLY
1

Hello Melanie. I use XSLT because it is very convenient, even if I am a newbie and I got a bug. By the way, since you submitted a comment and not an answer, could you please cut/paste it into the comment section just below my answer? Thanks in advance.

ADD REPLY
1

Mélanie, I do understand that SPARQL is the solution of choice to handle an RDF graph. But here the statements were all identically structured, so I think XSLT is a solution of choice here, especially to transform the XML into something else.

ADD REPLY
0

We all agree. That's why I said "here, the statements were all identically structured" :-)

ADD REPLY
