
Version 6 (modified by rkl, 12 years ago)

Service binding to data provider metadata

NDG Service Binding

DIF

Introduction

How should NDG DIF producers make use of the DIF Related_URL group?

Examples on the web include the following:

Group: Related_URL
   URL_Content_Type: GET DATA > OPENDAP DIRECTORY (DODS) 
   URL: http://ferret.wrc.noaa.gov/cgi-bin/nc/data/coads_climatology.nc
   Group: Description
      The following dataset is available on this DODS Server:
      COADS Global Ocean Climatology SST, Air Temp and Winds
   End_Group
End_Group 
Group: Related_URL
   URL_Content_Type: GET SERVICE > GET WEB MAP SERVICE (WMS)  
   URL: http://www.kgis.scar.org/cgi-bin/kgis_wms?coads_climatology.nc
   Group: Description
       OGC WMS service for SCAR KGIS data 
   End_Group
End_Group 

The key point to note is that the URL_Content_Type should be taken from a keyword list. Regrettably, this list is not version controlled and appears volatile, so we may have to take our own copy and control it properly. (Not all members have full definitions, but some are defined on this page, which is itself from a previous version of the controlled list!)
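As a minimal sketch of what "taking our own copy" could look like, the snippet below checks a URL_Content_Type against a locally pinned snapshot of the keyword list. The snapshot contents, the version label and the function names are assumptions for illustration; only the terms that appear on this page are included, and the real GCMD list is longer.

```python
# Hypothetical, locally controlled snapshot of the GCMD URL_Content_Type keywords.
# Only the terms mentioned on this page are listed; this is not the full GCMD list.
PINNED_CONTENT_TYPES = {
    "version": "local-copy-1",  # assumed label for our own version control
    "terms": {
        "GET DATA",
        "GET DATA > OPENDAP DIRECTORY (DODS)",
        "GET SERVICE > GET WEB MAP SERVICE (WMS)",
        "GET SERVICE > GET WEB FEATURE SERVICE (WFS)",
        "GET SERVICE > ACCESS WEB SERVICE",
        "GET RELATED DATA SET METADATA (DIF)",
        "VIEW EXTENDED METADATA",
        "VIEW PROJECT HOME PAGE",
    },
}


def normalise_content_type(raw):
    """Collapse runs of whitespace and upper-case the keyword."""
    return " ".join(raw.split()).upper()


def is_known_content_type(raw, pinned=PINNED_CONTENT_TYPES):
    """Return True if the keyword is in the pinned snapshot."""
    return normalise_content_type(raw) in pinned["terms"]
```

Pinning the list this way means a DIF can be validated against a known version even if the upstream list changes underneath us.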

There are two classes of problem. The first is what we want to happen with NDG-produced DIFs: that is, what do we expect to produce in OUR MOLES documents, and where should it go in the DIF? The second is what we should do with third-party DIFs and mini-MOLES.

NDG

From an NDG point of view, five content types are of significant interest, the first two of which are fairly obvious.

  • VIEW EXTENDED METADATA - which should point to an NDG-B browse instance of the corresponding data entity.
  • GET RELATED DATA SET METADATA (DIF) - which could point to a related dataset (DIF).

The next two are less obvious. We have a choice between

  • GET DATA, and
  • GET SERVICE > GET WEB FEATURE SERVICE (WFS)

I think we should begin by using GET DATA pointing at our proto-WFS as it is built, and change over to the second when we have a properly functioning WFS. In any case, I think all the CSML granule endpoints should be exposed, which will require populating the description rather carefully: probably one sentence which defines what we are pointing at, and one sentence which is taken from the CSML granule title.

We may also want to use

  • GET SERVICE > ACCESS WEB SERVICE, configured for the DX, with the same comments as above for granules.

Of slightly lesser interest, we may also wish to use

  • VIEW PROJECT HOME PAGE.

RKL comment: Another useful thing to have here would be access to 'Extra' metadata held by the data provider (such as plain language documentation on a document server): 'VIEW DATA PROVIDER METADATA', perhaps?

So for example, for one dataset we might have:

<Related_URL>
 <URL_Content_Type> GET DATA </URL_Content_Type>
 <URL> http://badc.rl.ac.uk/ndgWFS?uri=badc.nerc.ac.uk__CSML__granule0123</URL>
 <Description> CSML: The dataset is available at this URL via the NDG WFS. This link
 is to granule "Seasonal Mean Model Levels" </Description>
</Related_URL> 
<Related_URL>
 <URL_Content_Type> GET DATA </URL_Content_Type>
 <URL> http://badc.rl.ac.uk/ndgWFS?uri=badc.nerc.ac.uk__CSML__granule0124</URL>
 <Description> CSML: The dataset is available at this URL via the NDG WFS. This link
 is to granule "Monthly Mean Model Levels" </Description>
</Related_URL> 
<Related_URL>
 <URL_Content_Type> GET DATA </URL_Content_Type>
 <URL> http://badc.rl.ac.uk/ndgWFS?uri=badc.nerc.ac.uk__CSML__granule0125</URL>
 <Description> CSML: The dataset is available at this URL via the NDG WFS. This link
 is to granule "Daily Data Model Levels" </Description>
</Related_URL>
<Related_URL>
 <URL_Content_Type> GET DATA </URL_Content_Type>
 <URL> http://badc.rl.ac.uk/ndgWFS?uri=badc.nerc.ac.uk__MOLES-B0__datasetA</URL>
 <Description> NDGA: The dataset is available at this URL via the NDG WFS. This link
 is to all data granules within the dataset. </Description>
</Related_URL>
<Related_URL>
 <URL_Content_Type> VIEW EXTENDED METADATA </URL_Content_Type>
 <URL> http://badc.rl.ac.uk/browse?uri=badc.nerc.ac.uk__MOLES-B1__datasetA</URL>
 <Description> NDGB: NDG browse metadata can be used to understand more about the data,
and its relationship to other datasets </Description>
</Related_URL>

Note that this proposal is suggesting an internal controlled vocabulary for the Description entry, which is invoked if the first word, prior to a colon, is one of NDGA, NDGB, or CSML.

Why not use the GCMD URL_Content_Type for this? Because we can do it today, without reference to them.

Why not just harvest MOLES? Because we said we wouldn't in NDG2, and our security model doesn't allow us to do so!

These three signifiers will allow us to know that these are NDG-related URLs, which means we can do something a bit more clever with them! For the moment, we will only care in detail about cases where CSML appears; in that case, the first sentence should be preserved into and out of MOLES, and the second should be preserved into and out of MOLES as a granule title (the "into" direction will only occur for the production of mini-MOLES).
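The Description convention could be picked apart along the following lines. This is only a sketch: the function name is made up, and it assumes the first sentence ends at the first full stop followed by a space and that the granule title is the quoted string in the second sentence, as in the examples on this page.

```python
import re

# The three signifiers proposed on this page.
NDG_SIGNIFIERS = ("NDGA", "NDGB", "CSML")


def parse_description(text):
    """Split a Related_URL Description into (signifier, first sentence, granule title).

    Returns (None, None, None) when the text before the first colon is not one
    of the NDG signifiers, i.e. for non-NDG records. The title is only
    extracted for CSML entries and is None otherwise.
    """
    flat = " ".join(text.split())  # undo line wrapping inside the element
    prefix, sep, rest = flat.partition(":")
    if not sep or prefix not in NDG_SIGNIFIERS:
        return None, None, None
    rest = rest.strip()
    # Assumption: the first sentence ends at the first ". " in the text.
    first, dot, second = rest.partition(". ")
    first_sentence = first + "." if dot else first
    title = None
    if prefix == "CSML":
        match = re.search(r'"([^"]*)"', second)
        if match:
            title = match.group(1).strip()
    return prefix, first_sentence, title
```

For a CSML entry the second and third values are the pieces that would be preserved into and out of MOLES; for NDGA and NDGB only the signifier is really of interest for now.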

The appropriate entries will appear in MOLES as follows:

<dgMetadata>
    <dgMetadatRecord>
        <dgMetadataID>
            <repository>badc.nerc.ac.uk</repository>
            <schema>MOLES-B0</schema>
            <identifier>datasetA</identifier>
        </dgMetadataID>
        <dgMetadataDescription>
            <metadataDescriptionID>?Kev</metadataDescriptionID>
            <metadataDescriptionLastUpdated>...</metadataDescriptionLastUpdated>
            <abstract>..stuff.</abstract>
            <descriptionSection>
                <descriptionOnlineReference>
                    <dgSimpleLink>http://badc.rl.ac.uk/browse?uri=badc.nerc.ac.uk__MOLES-B1__datasetA</dgSimpleLink>
                    <dgReferenceClass>
                        <dgValidTerm>VIEW EXTENDED METADATA</dgValidTerm>
                        <dgValidTermID>
                            <ParentListID>GCMD URL Content Type Keywords</ParentListID>
                            <TermID>?Kev</TermID>
                        </dgValidTermID>
                        <Definition> NDGB: NDG browse metadata can be used to understand more about
                            the data, and its relationship to other datasets</Definition>
                    </dgReferenceClass>
                    <dgReferenceName>Related_URL</dgReferenceName>
                </descriptionOnlineReference>
                <descriptionOnlineReference>
                    <dgSimpleLink>http://badc.rl.ac.uk/ndgWFS?uri=badc.nerc.ac.uk__MOLES-B0__datasetA</dgSimpleLink>
                    <dgReferenceClass>
                        <dgValidTerm>GET DATA</dgValidTerm>
                        <dgValidTermID>
                            <ParentListID>GCMD URL Content Type Keywords</ParentListID>
                            <TermID>?Kev</TermID>
                        </dgValidTermID>
                        <Definition> NDGA: The dataset is available at this URL via the NDG WFS.
                            This link is to all data granules within the dataset.</Definition>
                    </dgReferenceClass>
                    <dgReferenceName>Related_URL</dgReferenceName>
                </descriptionOnlineReference>
            </descriptionSection>
        </dgMetadataDescription>
        <dgDataEntity>
            <dgDataSetType/>
            <dgDataGranule>
                <dataModelID>
                    <repository>badc.nerc.ac.uk</repository>
                    <schema>CSML</schema>
                    <identifier>granule0123</identifier>
                </dataModelID>
                <instance>
                    <uri>http://badc.rl.ac.uk/ndgWFS?uri=badc.nerc.ac.uk__CSML__granule0123</uri>
                    <format>CSML</format>
                    <instanceComment> CSML: The dataset is available at this URL via the NDG
                        WFS. This link is to granule </instanceComment>
                </instance>
                <dgGranuleSummary>
                    <dgGranuleName> Seasonal Mean Model Levels </dgGranuleName>
                    <dgParameterSummary>Stuff</dgParameterSummary>
                </dgGranuleSummary>
            </dgDataGranule>
        </dgDataEntity>
        <dgDataEntity>
            <dgDataSetType/>
            <dgDataGranule>
                <dataModelID>
                    <repository>badc.nerc.ac.uk</repository>
                    <schema>CSML</schema>
                    <identifier>granule0124</identifier>
                </dataModelID>
                <instance>
                    <uri>http://badc.rl.ac.uk/ndgWFS?uri=badc.nerc.ac.uk__CSML__granule0124</uri>
                    <format>CSML</format>
                    <instanceComment> CSML: The dataset is available at this URL via the NDG
                        WFS. This link is to granule </instanceComment>
                </instance>
                <dgGranuleSummary>
                    <dgGranuleName> Monthly Mean Model Levels </dgGranuleName>
                    <dgParameterSummary>Stuff</dgParameterSummary>
                </dgGranuleSummary>
            </dgDataGranule>
        </dgDataEntity>
        <dgDataEntity>
            <dgDataSetType/>
            <dgDataGranule>
                <dataModelID>
                    <repository>badc.nerc.ac.uk</repository>
                    <schema>CSML</schema>
                    <identifier>granule0125</identifier>
                </dataModelID>
                <instance>
                    <uri>http://badc.rl.ac.uk/ndgWFS?uri=badc.nerc.ac.uk__CSML__granule0125</uri>
                    <format>CSML</format>
                    <instanceComment> CSML: The dataset is available at this URL via the NDG
                        WFS. This link is to granule </instanceComment>
                </instance>
                <dgGranuleSummary>
                    <dgGranuleName> Daily Data Model Levels </dgGranuleName>
                    <dgParameterSummary>Stuff</dgParameterSummary>
                </dgGranuleSummary>
            </dgDataGranule>
        </dgDataEntity>
    </dgMetadatRecord>
</dgMetadata>
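Going the other way, from MOLES back out to DIF, might look something like the sketch below. It assumes the layout shown above (the granule URL in instance/uri, the start of the description in instanceComment, the title in dgGranuleName) and always emits GET DATA as the content type; the function name is hypothetical and this is not the actual NDG export code.

```python
import xml.etree.ElementTree as ET


def granules_to_related_urls(moles_xml):
    """Build one DIF Related_URL element per dgDataGranule in a MOLES document."""
    root = ET.fromstring(moles_xml)
    related = []
    for granule in root.iter("dgDataGranule"):
        uri = granule.findtext("instance/uri", default="").strip()
        # Collapse the line wrapping inside instanceComment to single spaces.
        comment = " ".join(
            granule.findtext("instance/instanceComment", default="").split()
        )
        name = granule.findtext("dgGranuleSummary/dgGranuleName", default="").strip()

        url_el = ET.Element("Related_URL")
        ET.SubElement(url_el, "URL_Content_Type").text = "GET DATA"
        ET.SubElement(url_el, "URL").text = uri
        # Re-attach the granule title, quoted, to the end of the stored comment.
        ET.SubElement(url_el, "Description").text = f'{comment} "{name}"'
        related.append(url_el)
    return related
```

Each returned element can be serialised with ET.tostring(element, encoding="unicode") and dropped into the DIF.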

Non-NDG Records

We should simply parse all non-NDG records directly into a related URL, and parse them back out again preserving their content identically.
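A sketch of what "preserving their content identically" could mean in practice: treat the three Related_URL fields as opaque strings, store them untouched, and write them back untouched. The function names are made up for illustration, and the sketch assumes the XML form of the DIF with only these three child elements.

```python
import xml.etree.ElementTree as ET

# The child elements of Related_URL used in the examples on this page.
FIELDS = ("URL_Content_Type", "URL", "Description")


def read_related_url(xml_text):
    """Read a Related_URL element into a dict, keeping the text exactly as found."""
    el = ET.fromstring(xml_text)
    return {name: el.findtext(name) for name in FIELDS if el.find(name) is not None}


def write_related_url(record):
    """Write the dict back out as a Related_URL element, with no normalisation."""
    el = ET.Element("Related_URL")
    for name in FIELDS:
        if name in record:
            ET.SubElement(el, name).text = record[name]
    return ET.tostring(el, encoding="unicode")
```

Note that no whitespace is stripped and no keyword is checked here, in contrast to the NDG-specific handling of the Description prefix.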

ISO19139

Then the question is: how should we do this in ISO19139?

In terms of the content model (ISO19115), the relevant pieces are:

  • DIF's entryID maps onto MD_Metadata/dataSetURI (not really relevant to ServiceBinding, included for completeness)
  • All metadata records map onto between 0 and 1 MD_Distribution entities.
    • Which has 0 to many MD_DigitalTransferOptions
      • Which has 0 to many CI_OnlineResources
        • Which consists of a compulsory linkage and optional protocol, applicationProfile, name, description, function
          • the last of which (function) could be download, information, offlineAccess
            • defined as "instructions for transferring data", "information about "
    • We may want to make use of the MD_Format
      • Which has compulsory name and version character strings plus an optional specification
  • Further, all metadata records have MD_Identification elements, and these include
    • Aggregations of MD_ServiceIdentification records (which takes us to ISO19119, and the OGC profile of ISO19115+ISO19119 for CSW2.0).
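To make the CI_OnlineResource option concrete, here is a rough sketch of turning one Related_URL into an ISO19139 fragment. Nothing here has been decided: putting the URL_Content_Type into name, the mapping from content types to function codes, and the codeList URI are all assumptions for illustration (the codeList value is a placeholder, not a real register).

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)

# Placeholder only: a real record would point at a published code list.
CODELIST = "urn:example:codelists#CI_OnLineFunctionCode"

# Assumed mapping from GCMD content types to ISO function codes.
FUNCTION_FOR_CONTENT_TYPE = {
    "GET DATA": "download",
    "VIEW EXTENDED METADATA": "information",
}


def _char_string(parent, tag, value):
    """Append <gmd:tag><gco:CharacterString>value</...></...> to parent."""
    wrapper = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(wrapper, f"{{{GCO}}}CharacterString").text = value


def to_online_resource(content_type, url, description):
    """Build a gmd:CI_OnlineResource element from the three Related_URL fields."""
    content_type = content_type.strip()
    res = ET.Element(f"{{{GMD}}}CI_OnlineResource")
    linkage = ET.SubElement(res, f"{{{GMD}}}linkage")
    ET.SubElement(linkage, f"{{{GMD}}}URL").text = url.strip()
    _char_string(res, "name", content_type)  # assumption: content type goes in name
    _char_string(res, "description", " ".join(description.split()))
    function = FUNCTION_FOR_CONTENT_TYPE.get(content_type)
    if function:  # omit the optional function element when we have no mapping
        fn = ET.SubElement(res, f"{{{GMD}}}function")
        code = ET.SubElement(fn, f"{{{GMD}}}CI_OnLineFunctionCode")
        code.set("codeList", CODELIST)
        code.set("codeListValue", function)
        code.text = function
    return res
```

The NDGA/NDGB/CSML prefix travels along unchanged inside description, so the convention proposed for DIF would survive the move to ISO19139 under this sketch.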

Attachments

  • Draft Service Binding.xml (4.9 KB) - added by lawrence 12 years ago. This is the MOLES XML used in the example, available as a file to ease editing with an XML editor.