Version 11 (modified by mjuckes, 12 years ago)

--

BADC Interfaces: how will the IS-ENES and QESDI portals interact with BADC?

See newer page at CMIP5 in MOLES.

(see ticket 21274)

The IS-ENES and QESDI portals will advertise data and metadata from the BADC archive.

As far as possible, both portals will be steps in the development of a generic CEDA portal, though project-dependent requirements are likely to introduce some features that are not required for CEDA.

Metadata should be extracted from the MOLES catalogue, which is stored in an eXist database. The building blocks within the catalogue are as follows (copied from the new (Sept. 2009) MOLES Editor help pages):

-- Data Entity

a general, high level metadata document allowing the grouping of data granules and deployments data together in a logical manner - e.g. bringing together a data granule describing a particular data file with the observation stations, data production tools and activities involved in creating it.

-- Data Granule

acts as a 'wrapper' to specific data files, providing summary metadata in order to aid discovery. NB, data granules can be created or augmented by ingesting data directly into the system

-- Activity

a metadata document describing an activity used to create a dataset.

-- Data Production Tool (DPT)

a metadata document describing an instrument involved in the creation of a dataset.

-- Observation Station

a metadata document describing the observation station used to create a dataset.

-- Deployment

a simple metadata document allowing the grouping of Activity, Data Production Tool and Observation Station documents in a logical manner - i.e. to pull together the why/how/where of data production. NB, any number of activities, data production tools and observation stations can be specified in a single deployment and these will all share the same temporal and spatial coverage summary information. Also note, data entities do not specify references to activities, DPTs or observation stations metadata document directly; rather they reference Deployment documents which, in turn, reference these documents. Note, a Deployment atom is just a specialisation of the 'Activity' atom type - but with a subtype of, 'Deployment'.

End quote.

A data granule can be associated with (i.e. created from) a CSML file. The amount of metadata it holds is too restrictive for the granule to be the association point for a Web Map Service (WMS). The granule atom (catalogue entry) will contain contact info, spatial coverage, and parameter names.

A data entity can contain an unlimited number of data granules. It has a "Summary" which can be used to provide context in the web portal. It can also be associated with WMSs.

An activity or data entity can have an "extended metadata" option, allowing more specialised information (e.g. METAFOR CIM) to be attached.

There is no means of linking data granules to deployments: data granules link to data entities and vice versa. A data entity can have multiple deployments, in principle with multiple activities and observation stations.

To make sense of this, I want to restrict data entities to having a single observation station and a single data production tool. Is this consistent with the philosophy of MOLES?
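The linkage rules and the proposed restriction can be sketched in code. The following is a minimal Python illustration, not the MOLES schema: the class names, field names and the helper has_single_station_and_tool are all hypothetical, and documents are represented by plain string identifiers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Deployment:
    """Groups the why/how/where of data production (hypothetical field names)."""
    activities: List[str] = field(default_factory=list)
    data_production_tools: List[str] = field(default_factory=list)
    observation_stations: List[str] = field(default_factory=list)


@dataclass
class DataEntity:
    """Links to granules and deployments only, never directly to
    activities, DPTs or observation stations."""
    summary: str
    granules: List[str] = field(default_factory=list)
    deployments: List[Deployment] = field(default_factory=list)


def has_single_station_and_tool(entity: DataEntity) -> bool:
    """True if the entity's deployments together name exactly one
    observation station and exactly one data production tool."""
    stations = {s for d in entity.deployments for s in d.observation_stations}
    tools = {t for d in entity.deployments for t in d.data_production_tools}
    return len(stations) == 1 and len(tools) == 1
```

Under this sketch, an entity with several deployments still passes the check as long as they all point at the same station and tool, so the restriction does not force one deployment per entity.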

The scope of a data entity is constrained by what we want to associate with a WMS, which is in turn constrained by what we expect the WMS UI to be able to handle. The CMIP5 data reference syntax has the concept of an atomic dataset, defined by "activity, institute, model, experiment/scenario, data frequency, modeling-realm, variable name, local ensemble member, and version". Does it make sense to have a data entity defined by "institute, model, experiment/scenario, data frequency" and perhaps version, and allow the choices of modeling-realm, variable and local ensemble member to be held within a single WMS?
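To make the proposed split concrete, the sketch below groups atomic datasets into candidate data entities keyed by (institute, model, experiment/scenario, data frequency), leaving realm, variable and ensemble member as choices within each group. It is only an illustration: the AtomicDataset type, its field names and the function names are assumptions, and version is left out of the key because the text treats it as optional.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class AtomicDataset(NamedTuple):
    """The nine components of a CMIP5 atomic dataset (hypothetical field names)."""
    activity: str
    institute: str
    model: str
    experiment: str
    frequency: str
    realm: str
    variable: str
    ensemble_member: str
    version: str


EntityKey = Tuple[str, str, str, str]


def entity_key(ds: AtomicDataset) -> EntityKey:
    """Components that would define one data entity under the proposal."""
    return (ds.institute, ds.model, ds.experiment, ds.frequency)


def group_into_entities(
    datasets: Iterable[AtomicDataset],
) -> Dict[EntityKey, List[AtomicDataset]]:
    """Collect atomic datasets that would share a data entity (and a WMS)."""
    groups: Dict[EntityKey, List[AtomicDataset]] = defaultdict(list)
    for ds in datasets:
        groups[entity_key(ds)].append(ds)
    return dict(groups)
```

The size of each group gives a rough idea of how many realm/variable/ensemble combinations a single WMS user interface would have to offer.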

Alternatively, we create a supplementary XML file for every portal-viewable data entity. This file contains one element per data granule, specifying all the information that cannot be placed in a machine-identifiable way within the existing MOLES structure.
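One possible shape for such a supplementary file is sketched below using Python's standard xml.etree.ElementTree module. No schema has been defined, so the element and attribute names (portalSupplement, dataEntity, granule, id) and the function build_supplement are purely hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_supplement(entity_id: str, granules: Dict[str, Dict[str, str]]) -> str:
    """Return an XML string with one <granule> element per data granule.

    `granules` maps a granule identifier to the extra name/value pairs
    that have no machine-identifiable home in MOLES (names are assumptions).
    """
    root = ET.Element("portalSupplement", {"dataEntity": entity_id})
    for granule_id, extra in granules.items():
        granule = ET.SubElement(root, "granule", {"id": granule_id})
        for name, value in extra.items():
            # Each extra item becomes a child element holding its value as text.
            ET.SubElement(granule, name).text = value
    return ET.tostring(root, encoding="unicode")
```

The portal would then read this file alongside the data entity atom, so the MOLES catalogue itself stays unchanged.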

See also