John Graybeal edited this page Jul 29, 2014 · 36 revisions

Modeling

(See initial modeling discussion here)

We plan to implement the conversion approach described below. Remaining questions are listed at the end.

Unit class

An instance of the Unit class captures one main unit entry in the XML. It may have a primary name (via the hasName property) and zero or more aliases (via the hasAlias property).

hasDefinition property

This is a functional property to capture the <def> element from the XML description.

hasName property

This is a functional property that indicates the primary name of a given Unit instance. Note that a unit is not required to have a primary name; some units have none.

hasAlias property

Indicates an alternate name for a given Unit instance. A unit can have zero or more aliases associated.

hasSymbol property

Indicates a symbol associated with a given Unit instance. A unit can have zero or more symbols associated.

UnitName class

Instances of the UnitName class capture a name or alias associated with an instance of the Unit class.

namesUnit property

With UnitName as domain, this functional property indicates the Unit instance associated with the name.

hasCardinality property

With UnitName as domain, this functional property indicates whether the name is "singular" or "plural".

Remarks:

  • The approach clearly separates the concept of unit from any associated specific names/aliases.
  • Those names/aliases will have their own URIs (so they could be self-resolvable).
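As a rough illustration of the model described above, the two classes and their properties could be mirrored in plain Python. This is only a sketch: the Python class and field names are made up for illustration, and only the cardinalities (functional vs. zero or more) come from the descriptions above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UnitName:
    """A name or alias of a unit (illustrative mirror of the UnitName class)."""
    name: str
    # namesUnit: functional, points back at exactly one Unit.
    # Excluded from repr/compare to avoid recursing through the back-reference.
    names_unit: "Unit" = field(repr=False, compare=False)
    # hasCardinality: functional, either "singular" or "plural".
    cardinality: str = "singular"


@dataclass
class Unit:
    """One main unit entry from the XML (illustrative mirror of the Unit class)."""
    definition: str                                        # hasDefinition (functional)
    name: Optional[UnitName] = None                        # hasName (functional, may be absent)
    aliases: List[UnitName] = field(default_factory=list)  # hasAlias (zero or more)
    symbols: List[str] = field(default_factory=list)       # hasSymbol (zero or more)


# A unit is a concept of its own; names are separate objects that refer to it.
unit = Unit(definition="'/60", symbols=['"', "\u2033"])
unit.name = UnitName("arc_second", unit)
unit.aliases.append(UnitName("arcsec", unit))
```

Keeping the names as separate objects that point back to the unit is what allows each name to get its own identifier later.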

Unit and UnitName identification

Both Unit and UnitName instance URIs will share the same namespace. The id part for each UnitName instance will simply be the associated name itself; this is appropriate because these names are both user- and web-friendly.

The Unit case needs some extra handling. First, all units must have <def> strings, so we will use these strings as the basis for identification. Although primary names would be good candidates for identification, some units lack them. Moreover, even if such names were always available, we would need to avoid collisions with the corresponding UnitName instances, which would use the same names for their own identification, unless we put them in a different namespace.

So, for the generation of Unit instances, the conversion tool will apply an arbitrary but deterministic translation (an MD5 or SHA hash) to the <def> string and use the last 6 characters of the result as the id. If a collision results, the tool will append a '+' to the string, re-hash, and repeat until a unique id results.
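The id generation just described can be sketched as follows. This is a minimal sketch, not the conversion tool's actual code: it assumes MD5 (the text allows MD5 or SHA), the hexadecimal form of the digest, and UTF-8 encoding of the <def> string; the function name unit_id and the taken set are hypothetical.

```python
import hashlib


def unit_id(definition: str, taken: set) -> str:
    """Return a 6-character id for a <def> string that is not yet in `taken`.

    Assumptions (not stated in the page): MD5, hex digest, UTF-8 input.
    """
    candidate = definition
    while True:
        digest = hashlib.md5(candidate.encode("utf-8")).hexdigest()
        short = digest[-6:]      # last 6 characters of the hash
        if short not in taken:
            taken.add(short)     # remember the id so later units cannot reuse it
            return short
        candidate += "+"         # collision: append '+' and re-hash


taken_ids = set()
first = unit_id("'/60", taken_ids)
second = unit_id("'/60", taken_ids)  # same input again, forces the collision path
```

One consequence of this scheme: when a collision does occur, the resulting id depends on which unit was processed first, so the ids are only reproducible if the units are always processed in the same order.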

Example

The following entry from the original vocabulary:

        <unit>
            <def>'/60</def>
            <name><singular>arc_second</singular></name>
            <symbol>"</symbol>
            <symbol>&#x2033;</symbol>           <!-- DOUBLE PRIME -->
            <aliases>
                <name><singular>angular_second</singular></name>
                <name><singular>arcsecond</singular></name>
                <name><singular>arcsec</singular></name>
            </aliases>
        </unit>

will result in the RDF representation:

@prefix :        <http://mmisw.org/ont/mmitest/udunits2-accepted/> .
@prefix prop:    <http://mmisw.org/ont/mmitest/udunits2-prop/> .

:2a1369
      a                      :Unit ;
      prop:hasDefinition     "'/60" ;
      prop:hasName           :arc_second ;
      prop:hasAlias          :arcsec, :angular_second, :arcsecond ;
      prop:hasSymbol         "\"", "″" .

:arc_second
      a                      :UnitName ;
      prop:namesUnit         :2a1369 ;
      prop:hasCardinality    "singular" .

:arcsec
      a                      :UnitName ;
      prop:namesUnit         :2a1369 ;
      prop:hasCardinality    "singular" .

:angular_second
      a                      :UnitName ;
      prop:namesUnit         :2a1369 ;
      prop:hasCardinality    "singular" .

:arcsecond
      a                      :UnitName ;
      prop:namesUnit         :2a1369 ;
      prop:hasCardinality    "singular" .
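To make the mapping in this example more concrete, here is a minimal Python sketch that walks a <unit> entry and lists the statements it would produce as (subject, property, object) tuples. The function names are hypothetical, the unit id is passed in rather than computed, and only <singular> forms are handled, because this page does not say how a <plural> form would be attached.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the entry above (one symbol, two aliases).
ENTRY = """
<unit>
    <def>'/60</def>
    <name><singular>arc_second</singular></name>
    <symbol>"</symbol>
    <aliases>
        <name><singular>angular_second</singular></name>
        <name><singular>arcsecond</singular></name>
    </aliases>
</unit>
"""


def singular_names(parent):
    """Yield the text of each <name>/<singular> directly under `parent`."""
    for name_el in parent.findall("name"):
        text = name_el.findtext("singular")
        if text is not None:
            yield text


def unit_triples(unit_el, unit_id):
    """Return (subject, property, object) tuples for one <unit> element."""
    triples = [(unit_id, "a", "Unit"),
               (unit_id, "hasDefinition", unit_el.findtext("def"))]
    triples += [(unit_id, "hasSymbol", s.text) for s in unit_el.findall("symbol")]

    # The primary name sits directly under <unit>; aliases sit under <aliases>.
    groups = [("hasName", unit_el)]
    aliases_el = unit_el.find("aliases")
    if aliases_el is not None:
        groups.append(("hasAlias", aliases_el))

    for prop, parent in groups:
        for name in singular_names(parent):
            triples += [(unit_id, prop, name),
                        (name, "a", "UnitName"),
                        (name, "namesUnit", unit_id),
                        (name, "hasCardinality", "singular")]
    return triples


triples = unit_triples(ET.fromstring(ENTRY), "2a1369")
```

Each name produces one statement on the unit side (hasName or hasAlias) and three on the name side, which matches the shape of the RDF above.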

Any other questions?

(2) How will Units and UnitNames be seen in ORR?

A(prototype): As plaintext values in the (first) Name column.

(3) How will URIs be generated for the two class types?

A: The concepts in a single file (udunits2-accepted.xml in this case) all have the same URL base. The class type for a particular concept is not visible in its URI.

It seems your concern here is still with respect to what you call "vocabularies", which, it seems to me, you mean in the sense of the voc2rdf tool. First, the URLs for these two classes will, in principle, just share the same namespace as the Unit and UnitName instances. Second, we are capturing a complete XML file (say, the "accepted" one) in a single ontology.

The question is just about what the URL looks like in the end.
