Version 13 (modified by selatham, 15 years ago)


Combined NDG Use Case with issues

What DataProviders want from NDG2

  • Increase discovery and usage of RSDAS data, logged per user.
  • Allow visualisation and analysis of RSDAS data using generic tools.
  • Expand NDG-enabled datasets to include long time-series and near-real time data.
  • Provide a data discovery system for NOCS data that is “community supported”
  • Provide external access, especially visualisation, to NOCS data using “community” tools
  • Provide framework for development of (meta)data systems for NOCS that are compatible with NERC data centres
  • Gain access to support in developing tools for (semi-)automating metadata creation, and benefit from NDG tools for expected future developments (conversion to ISO standard, etc.)
  • Provide an incentive for NOCS scientists to submit metadata & data to data management systems
  • Take advantage of NDG developments such as term servers to simplify generation of metadata
  • Possibility to use NDG as an “internal” tool for data discovery / transfer – to BODC!

What Data Users want from NDG

  • Ability to access NOCS data using NDG extraction and visualisation tools
  • Single point access to data at all NERC data centres (esp BODC, BADC, NEODC) using NOCS credentials

Use case of interaction with RSDAS data

  • User discovers RSDAS data on NDG portal and says "wow, I want to analyse that".
  • User logs in to another DP to get NDG credentials for accessing RSDAS data.
  • User browses the metadata: DIF, MOLES, CSML.
  • User visualises time-series of satellite data using GeoSPLAT; compares with other datasets.
  • Peter at PML notices what data this user has accessed, as it appears in the log.
  • Scientist user writes program using Client Package (Python/Java?/IDL) to analyse RSDAS data.
  • User may contact PML for additional access permissions, via RSDAS application form.

Data provider procedure

Describe dataset metadata (MOLES & DIF creation)

  • Decide on scope of each discoverable dataset.
  • Assume we are generating MOLES first then CSML... (discuss).
  • Decide on granularity of DataGranule objects, i.e. how many CSML records per MOLES record? (BADC's current datasets need to be discoverable, but creating CSML for them may be difficult.)
  • Ensure all this is recorded somewhere accessible in the DP's back-end metadata. 'Somewhere accessible' probably means a database. (Not yet for BADC - big job)
  • Ensure all related metadata are recorded somewhere accessible in the DP's back-end metadata, e.g. sensors, units, vocabulary keywords, activities, etc. (Not yet for BADC - big job)
  • Write/adapt software for automating output of MOLES from the DP's back-end metadata (DB or wherever). Note this includes all MOLES object types with deployments.
  • Place MOLES records in ndg_B_metadata collection in an eXist db which is accessible to a MOLES Browse web service.
  • Automatically/dynamically generate DIF records from MOLES.
  • Place DIF records in Dlese OAI provider.
  • Review DIF/MOLES accuracy and iterate, automatically maintaining the OAI record history. (Current Dlese software seems to have problems when records are updated or deleted.)
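
As a rough illustration of the automated-output steps above, the following Python sketch builds a minimal DIF-like record from a back-end metadata entry. It is only a sketch under stated assumptions: the dictionary keys (id, title, abstract) and the function name are hypothetical, only three DIF elements (Entry_ID, Entry_Title, Summary) are written, and the result is not a complete or validated DIF. A real implementation would generate the DIF from the MOLES records themselves.

```python
import xml.etree.ElementTree as ET


def dif_from_record(record):
    """Build a minimal DIF-like XML string from a back-end metadata dict.

    The dict keys used here are hypothetical; a real DP would map its own
    database columns or MOLES fields onto the DIF elements.
    """
    root = ET.Element("DIF")
    ET.SubElement(root, "Entry_ID").text = record["id"]
    ET.SubElement(root, "Entry_Title").text = record["title"]
    ET.SubElement(root, "Summary").text = record.get("abstract", "")
    return ET.tostring(root, encoding="unicode")


# Illustrative record only; not a real dataset.
example = {
    "id": "example-dataset-1",
    "title": "Example sea surface temperature series",
    "abstract": "Illustrative record only.",
}
print(dif_from_record(example))
```

Generating the record from one function call per dataset makes it straightforward to regenerate every DIF whenever the back-end metadata changes, which is what the review-and-iterate step relies on.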

Describe dataset data (CSML creation)

  • Write CSML scanner/templates for new datasets. (Potentially we will all have to do this for non-netCDF formats.)
  • For future datasets, consider original data formats. Ideally they can be converted to a suitable format on-the-fly, e.g. netCDF.
  • Generate CSML records dynamically, from netCDF files, database, etc.
  • CSML records will need to use standard names. What if there aren't any available? Big problem for BADC back catalogue.
  • Connect CSML to MOLES records via 'S' summary metadata which appears in both. Preferably do this in the DP's back-end metadata.
  • Store CSML records somewhere accessible to other Web Services. eXist db?
  • Test CSML accuracy using NDG portal data browser.
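
The standard-name question above can be checked before any CSML is generated. The Python sketch below separates the variables that have a standard name from those that do not. The lookup table and the local variable names (sst, chl, k490) are hypothetical examples, not taken from any DP's catalogue.

```python
# Hypothetical lookup from local variable names to standard names.
STANDARD_NAMES = {
    "sst": "sea_surface_temperature",
    "chl": "mass_concentration_of_chlorophyll_a_in_sea_water",
}


def split_by_standard_name(variables, lookup=STANDARD_NAMES):
    """Return (mapped, unmapped): variables with and without a standard name."""
    mapped = {v: lookup[v] for v in variables if v in lookup}
    unmapped = [v for v in variables if v not in lookup]
    return mapped, unmapped


mapped, unmapped = split_by_standard_name(["sst", "k490"])
print(mapped)
print(unmapped)
```

Running this over a back catalogue gives a list of variables that need a decision before CSML can be written for them.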

NDG Procedure

NDG Discovery

  • New DataProvider tells NDG about their OAI records in NDG compliant format (currently DIF, but should be ISO).
  • NDG sets them up as an automatic harvest.
  • Harvested records are automatically pre-processed to tidy them up: OAI-style filenames and any namespaces (which cause problems in eXist XQuery) are removed.
  • Pre-processed records are ingested into 'dif' collection in NDG Discovery eXist db. (Currently these only go into dev/glue. Should we have completely separate production OAI & ingest on superglue? Is an editorial process required? If not, Discovery portal must cope with any content.)
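
The pre-processing step above might look something like the following Python sketch, which strips namespaces from a harvested record and derives a plain filename from an OAI identifier. This is not the NDG ingest code. Both function names and the example identifier format are assumptions, and the sketch uses only the Python standard library.

```python
import re
import xml.etree.ElementTree as ET


def strip_namespaces(xml_text):
    """Return the XML with namespace prefixes removed from tags and attributes."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # ElementTree stores namespaced names as "{uri}local".
        if isinstance(elem.tag, str) and "}" in elem.tag:
            elem.tag = elem.tag.split("}", 1)[1]
        for key in list(elem.attrib):
            if "}" in key:
                elem.attrib[key.split("}", 1)[1]] = elem.attrib.pop(key)
    return ET.tostring(root, encoding="unicode")


def tidy_filename(oai_identifier):
    """Turn an OAI-style identifier into a simple filename (assumed convention)."""
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", oai_identifier).strip("_")
    return safe + ".xml"


record = '<a:DIF xmlns:a="http://example.org/dif"><a:Entry_ID>x</a:Entry_ID></a:DIF>'
print(strip_namespaces(record))
print(tidy_filename("oai:example.org:DIF:dataset/1"))
```

Dropping namespaces loses information if two vocabularies reuse the same local element name, so this approach only suits records drawn from a single schema such as DIF.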

NDG Browse

Security and logging

  • Install NDG security software.
  • Generate role mappings with other NDG DPs.
  • Assign RSDAS users into external NDG roles.
  • Assign datasets to appropriate access role, e.g. any NDG user.
  • Interface NDG security with RSDAS data browser and authentication system.
  • Ensure NDG access to RSDAS data is logged: e.g. name, date, data granule id or filename.
  • Install NDG data delivery services and DX?
  • Test access to RSDAS data.
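
The role-mapping and logging steps above can be sketched in Python as follows. The provider name, role names, mapping table and tab-separated log format are all hypothetical. The real NDG security software defines its own interfaces, which are not shown here.

```python
from datetime import datetime, timezone

# Hypothetical mapping from (remote provider, remote role) to a local role.
ROLE_MAP = {("OtherDP", "staff"): "ndg_user"}


def local_role(provider, remote_role, role_map=ROLE_MAP):
    """Return the local role for a remote role, or None if no mapping exists."""
    return role_map.get((provider, remote_role))


def access_log_line(user, granule_id, when=None):
    """Format one access record: date, user name, data granule id or filename."""
    when = when or datetime.now(timezone.utc)
    return f"{when.isoformat()}\t{user}\t{granule_id}"


role = local_role("OtherDP", "staff")
if role == "ndg_user":
    # Access is allowed for this role, so record it.
    print(access_log_line("example_user", "granule-1"))
```

Writing one line per access, with the granule id or filename included, is enough to support the earlier use case in which the data provider sees what a user has accessed.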

Data Delivery

  • Ensure there is a system for real time access to data held in archives. (Not currently at NOCS or BODC)


  • Install local GeoSPLAT?
  • Test delivery of netCDF files from RSDAS data.
  • Test visualisation of RSDAS data in GeoSPLAT.
