Embedded References

Background

For some datasets it will be useful to provide links to additional information.

Ideally the user interface should pick up these links from a clear syntactical structure.

The CF standard does not at present allow for this, though there are proposals under discussion.

There is a chicken-and-egg problem: we can't clarify the requirements until we have embedded references to work from, but the design of embedded references is held up by a lack of clear use cases.

Here we describe an ad hoc scheme for embedding references which will enable software development, with the aim of adapting the software to another syntax when one becomes available.

Proposal

Add an "extension" attribute which consists of a string of extension items, separated by ';'.

An extension item will take the form {<URI>}[{URI category[:format]}]<item list> where the URI category is optional. E.g.

extension: "{proj.badc.rl.ac.uk/qesdi/wiki/EmbeddedReferences}{text/html}*extension;"

This self referential example says that there is a web page explaining what the attribute "extension" means at http://proj.badc.rl.ac.uk/qesdi/wiki/EmbeddedReferences.

The item list is a list of comma-separated words. The "*" at the start of a word indicates that the word is an attribute name.
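The item form described above can be split mechanically. The following is a minimal sketch, not a reference implementation: the function name parse_extension and the dictionary keys are hypothetical, and it assumes that ';' never appears inside a URI and that braces are not nested.

```python
import re

# One extension item: {URI}, an optional {category[:format]}, then the item list.
ITEM_PATTERN = re.compile(r"^\{([^{}]*)\}(?:\{([^{}]*)\})?(.*)$", re.DOTALL)


def parse_extension(value):
    """Split an extension attribute value into a list of dicts, one per item."""
    parsed = []
    for chunk in value.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece left by a trailing ';'
        match = ITEM_PATTERN.match(chunk)
        if match is None:
            raise ValueError("malformed extension item: %r" % chunk)
        uri, category_spec, item_list = match.groups()
        category, fmt = None, None
        if category_spec:
            # "vocab/xml:skos" -> category "vocab/xml", format "skos"
            category, _, fmt = category_spec.partition(":")
            fmt = fmt or None
        items = [word.strip() for word in item_list.split(",") if word.strip()]
        parsed.append(
            {"uri": uri, "category": category, "format": fmt, "items": items}
        )
    return parsed


example = "{proj.badc.rl.ac.uk/qesdi/wiki/EmbeddedReferences}{text/html}*extension;"
print(parse_extension(example))
```

Keeping the leading "*" on each word at this stage leaves the interpretation of the item list to a later step.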

A second example: extension:"{<wdcc>/<vocab>}{text/html}*wdcc_name, $wdcc_name, air_temperature_2m; {www-pcmdi.llnl.gov}{text/html}*model, *scenario;"

This says that the string "air_temperature_2m", the usage of "wdcc_name" and the value of the wdcc_name attribute are explained at the given URL. There is clearly redundancy here, but that gives flexibility. If the above attribute is included at the head of a file, the attribute wdcc_name can be used on variables and identified by informed clients. It also says that the attributes "model" and "scenario" are as defined by PCMDI.

A third example: extension:"{<ndg vocab server>}{vocab/xml:skos}@[cf]flag_meanings;"

This means that the value of an element of the flag_meanings array, where the interpretation of flag_meanings as an array is as defined in the CF convention, is defined by the NDG vocab server, which returns a SKOS document. Here flag_meanings is a CF attribute. To associate a vocab server with elements of a character array stored as a netCDF variable, an item of the form "@[v]my_character_array" should be included.

The syntax mimics the CF common_concept proposal (CF #24), but there are differences:

  1. referring to a page rather than a domain;
  2. adding a specification of what is at the domain;
  3. defining an attribute or other element rather than providing a string which acts as an alias;
  4. the common_concept attribute is positional (the name it contains refers to the variable containing the attribute) whereas this extension would be a global attribute guiding interpretation of variable attributes;
  5. governance issues are not addressed: the contents of a URL are the responsibility of the domain owner.

Item 3 is a particularly significant extension. Item 2 is a detail which will facilitate use by informed clients.

The proposal differs from the CF namespace tags discussion, CF #27, in avoiding any modification of the attribute names. This may be advantageous in preserving functionality of existing software.

Scope

A global extension attribute will apply to all the file contents; if it occurs as a variable attribute, it will only apply to that variable. Note that whereas

(Ex. 1) common_concept:"{wdcc.dkrz.de}air_temperature_2m"

implies that "air_temperature_2m" is an alias for the variable containing this attribute,

(Ex. 2) extension:"{wdcc.dkrz.de}air_temperature_2m;"

only implies that there is a definition of "air_temperature_2m" available at wdcc. To associate this with the variable we need another attribute, e.g. wdcc_name. An alternative form,

(Ex. 3) extension:"{wdcc.dkrz.de}*wdcc_name;"
        wdcc_name:"air_temperature_2m"

might be preferable, because (Ex. 2) does not tell the client which attribute might contain "air_temperature_2m". Or perhaps:

(Ex. 4) extension:"{wdcc.dkrz.de}air_temperature_2m(alias);"

in which the qualifier "alias" indicates the significance of a plain string within its current scope.
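The scoping rule and the form used in (Ex. 3) can be sketched as follows. This is an illustration under stated assumptions only: attributes are represented as plain dictionaries rather than read from a real file, the function name references_for_variable is hypothetical, and only "*" items are handled.

```python
import re

# Matches "{URI}", skips an optional "{category}", and captures the item list.
ITEM = re.compile(r"\{([^{}]*)\}(?:\{[^{}]*\})?([^;{]*)")


def references_for_variable(global_attrs, variable_attrs):
    """Yield (attribute, value, uri) for attributes named by '*' items in scope."""
    # A global extension applies to the whole file; a variable-level
    # extension applies only to the variable that carries it.
    in_scope = [
        global_attrs.get("extension", ""),
        variable_attrs.get("extension", ""),
    ]
    for extension in in_scope:
        for uri, item_list in ITEM.findall(extension):
            for word in item_list.split(","):
                word = word.strip()
                if word.startswith("*") and word[1:] in variable_attrs:
                    yield word[1:], variable_attrs[word[1:]], uri


# Dictionaries standing in for the attributes of (Ex. 3).
file_attrs = {"extension": "{wdcc.dkrz.de}*wdcc_name;"}
var_attrs = {"wdcc_name": "air_temperature_2m"}
print(list(references_for_variable(file_attrs, var_attrs)))
```

With this form the client learns both which attribute to read and where its value is defined, which is the advantage claimed for (Ex. 3) over (Ex. 2).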

Review

The proposal splits the extension into several components:

item list

This is the list of things defined in an external namespace. The things defined can be:

  1. strings: a string not starting with "*", "@" or "$";
  2. attribute names: a string starting with "*", followed by the attribute name;
  3. attribute values: a string starting with "$", followed by the attribute name;
  4. components of an attribute interpreted as an array: a string starting with "@", followed by the attribute name; "@[cf]" is used in place of "@" if the attribute value is a string which is to be interpreted as an array of elements delimited by blanks;
  5. components of a character array: a string starting with "@[v]", followed by the variable name.

In all cases, the string cannot contain blanks or commas.
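The five kinds of item can be told apart by prefix alone. A minimal sketch follows; the function name classify_item and the kind labels it returns are hypothetical names chosen for illustration, not part of the proposal.

```python
def classify_item(item):
    """Return (kind, name) for one word taken from an item list."""
    if " " in item or "," in item:
        raise ValueError("items cannot contain blanks or commas: %r" % item)
    # Test the longer prefixes first so "@[v]" and "@[cf]" are not
    # mistaken for a bare "@".
    if item.startswith("@[v]"):
        return ("character_array_component", item[4:])
    if item.startswith("@[cf]"):
        return ("cf_array_attribute_component", item[5:])
    if item.startswith("@"):
        return ("array_attribute_component", item[1:])
    if item.startswith("*"):
        return ("attribute_name", item[1:])
    if item.startswith("$"):
        return ("attribute_value", item[1:])
    return ("string", item)


for word in ["*wdcc_name", "$wdcc_name", "air_temperature_2m", "@[cf]flag_meanings"]:
    print(word, "->", classify_item(word))
```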

resource

There is a resource URL, which might be a vocab server returning a structured document, or a web page or PDF designed for human readability.

resource definition

In order to facilitate some automated processing, there is an option to add information about the resource. This should enable an intelligent client to process some of the information available at the resource.

The URI category should be one of:

  1. text/html: a descriptive text resource
  2. vocab/xml: a vocabulary server returning XML.

Usage

Consider the third example, with a file being displayed by a WMS user interface. The user moves the cursor over the map and clicks: the user interface requests information about the point under the cursor. The WMS returns a short document giving the value of the field and the flag_meanings element associated with it. The WMS also recognises that there is a vocab server associated with flag_meanings and returns the URL of the associated namespace, together with information about the format which will be returned by the vocab server.

The user interface can either display the URL or retrieve the associated information and display that. This will depend on whether the format specification corresponds to something the user interface can handle. E.g.

<featureInfo resource="http://vocab.ndg.nerc.ac.uk/term/wwf1/current/NT0108" resourceType="skos/xml">
   Some default text to display if client can't or won't contact vocab server
</featureInfo>
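The choice the user interface makes on receiving such a document could be sketched as below. The function name plan_display, the set SUPPORTED_TYPES and the returned action labels are assumptions made for illustration; only the featureInfo element and its two attributes come from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the resource types this particular client knows how to parse.
SUPPORTED_TYPES = {"skos/xml"}


def plan_display(feature_info_xml, supported_types=SUPPORTED_TYPES):
    """Decide whether to fetch the resource or just show its URL.

    Returns (action, url, fallback_text), where action is "fetch" when the
    client can handle the resource type and "link" otherwise.
    """
    element = ET.fromstring(feature_info_xml)
    url = element.get("resource")
    fallback = (element.text or "").strip()
    if element.get("resourceType") in supported_types:
        return ("fetch", url, fallback)
    return ("link", url, fallback)


response = (
    '<featureInfo resource="http://vocab.ndg.nerc.ac.uk/term/wwf1/current/NT0108" '
    'resourceType="skos/xml">Some default text</featureInfo>'
)
print(plan_display(response))
```

The fallback text is returned in both cases so that a client which fails to reach the vocab server still has something to show.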

The vocab server would return something like:

<skos:Concept rdf:about="http://vocab.ndg.nerc.ac.uk/term/wwf1/current/NT0108">
  <skos:externalID>NT0108</skos:externalID>
  <skos:prefLabel>Catatumbo moist forests</skos:prefLabel>
  <skos:altLabel />
  <skos:definition>
     <abstract> Blah blah blah </abstract>
     <resource url="http://wwf.....">Follow this link for further information.</resource>
  </skos:definition>
</skos:Concept>

which would have to be parsed by the user interface.
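A rough sketch of that parsing step, using the Python standard library, is given below. It assumes that the real response declares the skos and rdf namespace prefixes (the excerpt above omits the declarations, so it would not parse as it stands) and that the abstract and resource elements carry no namespace. The function name summarise_concept and the keys of the returned dictionary are hypothetical, and the sample omits the resource element.

```python
import xml.etree.ElementTree as ET

NS = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def summarise_concept(xml_text):
    """Extract the fields a user interface might display from a skos:Concept."""
    root = ET.fromstring(xml_text)
    concept_tag = "{%s}Concept" % NS["skos"]
    # The concept may be the root element or nested inside a wrapper element.
    concept = root if root.tag == concept_tag else root.find(".//skos:Concept", NS)
    if concept is None:
        raise ValueError("no skos:Concept element found")
    link = concept.find("skos:definition/resource", NS)
    label = concept.findtext("skos:prefLabel", default="", namespaces=NS)
    abstract = concept.findtext("skos:definition/abstract", default="", namespaces=NS)
    return {
        "about": concept.get("{%s}about" % NS["rdf"]),
        "label": label.strip(),
        "abstract": abstract.strip(),
        "link": link.get("url") if link is not None else None,
    }


# Assumed form of the response, with namespace declarations added.
SAMPLE = """<skos:Concept xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    rdf:about="http://vocab.ndg.nerc.ac.uk/term/wwf1/current/NT0108">
  <skos:prefLabel>Catatumbo moist forests</skos:prefLabel>
  <skos:definition>
    <abstract> Blah blah blah </abstract>
  </skos:definition>
</skos:Concept>"""

print(summarise_concept(SAMPLE))
```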

Alternatively, we could get the WMS to call the vocab server and return HTML to the user interface.