= Combined NDG Use Case with issues =

== What !DataProviders want from NDG2 ==
 * Increase discovery and usage of RSDAS data, logged per user.
 * Allow visualisation and analysis of RSDAS data using generic tools.
 * Expand NDG-enabled datasets to include long time-series and near-real-time data.
 * Provide a data discovery system for NOCS data that is “community supported”.
 * Provide external access, especially visualisation, to NOCS data using “community” tools.
 * Provide a framework for development of (meta)data systems for NOCS that are compatible with NERC data centres.
 * Gain access to support in developing tools for (semi-)automating metadata creation, and benefit from NDG tools for expected future developments (conversion to ISO standard etc).
 * Provide an incentive for NOCS scientists to submit metadata and data to data management systems.
 * Take advantage of NDG developments such as term servers to simplify generation of metadata.
 * Possibility to use NDG as an “internal” tool for data discovery / transfer – to BODC!

== What Data Users want from NDG ==
 * Ability to access NOCS data using NDG extraction and visualisation tools.
 * Single-point access to data at all NERC data centres (esp. BODC, BADC, NEODC) using NOCS credentials.

== Use case of interaction with RSDAS data ==
 * User discovers RSDAS data on the NDG portal and says "wow, I want to analyse that".
 * User logs in to another DP to get NDG credentials for accessing RSDAS data.
 * User browses around the metadata: DIF, MOLES, CSML.
 * User visualises time-series of satellite data using GeoSPLAT and compares with other datasets.
 * Peter at PML notices what data this user has accessed, as it appears in the log.
 * Scientist user writes a program using the Client Package (Python/Java/IDL) to analyse RSDAS data.
 * User may contact PML for additional access permissions, via the RSDAS application form.

== Data provider procedure ==

=== Describe dataset metadata (MOLES & DIF creation) ===
 * Decide on scope of each discoverable dataset.
 * Assume we are generating MOLES first, then CSML... (discuss).
 * Decide on granularity of !DataGranule objects, i.e. how many CSML records per MOLES record? ('''BADC current datasets need to be discoverable, but may have difficulty creating CSML''')
 * Ensure all this is recorded somewhere accessible in the DP's back-end metadata. 'Somewhere accessible' probably means a database. ('''Not yet for BADC - big job''')
 * Ensure all related metadata are recorded somewhere accessible in the DP's back-end metadata, e.g. sensors, units, vocabulary keywords, activities, etc. ('''Not yet for BADC - big job''')
 * Write/adapt software for automating output of MOLES from the DP's back-end metadata (DB or wherever). '''Note this includes all MOLES object types with deployments.'''
 * Place MOLES records in the ndg_B_metadata collection in an eXist db which is accessible to a MOLES Browse web service.
 * Automatically/dynamically generate DIF records from MOLES.
 * Place DIF records in the Dlese OAI provider.
 * Review DIF/MOLES accuracy and iterate, automatically maintaining the OAI record history. ('''Dlese current software seems to have problems when records are updated or deleted''')

=== Describe dataset data (CSML creation) ===
 * Write CSML scanner/templates for new datasets. ('''Potentially we will all have to do this for non-netCDF formats''')
 * For future datasets, consider the original data formats. Ideally they can be converted to a suitable format on-the-fly, e.g. netCDF.
 * Generate CSML records dynamically, from netCDF files, database, etc.
 * CSML records will need to use standard names. '''What if there aren't any available? Big problem for BADC back catalogue'''
 * Connect CSML to MOLES records via the 'S' summary metadata which appears in both. Preferably do this in the DP's back-end metadata.
 * Store CSML records somewhere accessible to other Web Services. '''eXist db?'''
 * Test CSML accuracy using the NDG portal data browser.
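
The standard-names concern above could be assessed before any CSML is written. The following is only a sketch: it assumes standard names are carried as a `standard_name` attribute on each variable, as in CF-convention netCDF files, and that variable attributes have already been read into plain dictionaries. The function name `missing_standard_names` and the example variables are hypothetical and not part of any NDG tool.

```python
def missing_standard_names(variables):
    """Return the sorted names of variables lacking a usable standard_name.

    `variables` maps a variable name to a dict of its attributes
    (hypothetical structure, e.g. as read from a netCDF header).
    """
    missing = []
    for name, attrs in variables.items():
        value = attrs.get("standard_name")
        # Treat absent, non-string and blank values as "no standard name".
        if not isinstance(value, str) or not value.strip():
            missing.append(name)
    return sorted(missing)


# Hypothetical attributes for three variables in one dataset.
example_variables = {
    "sst": {"standard_name": "sea_surface_temperature", "units": "K"},
    "chl": {"units": "mg m-3"},
    "flag": {"standard_name": "  "},
}

print(missing_standard_names(example_variables))
```

Running a check like this across a back catalogue would give a rough count of how many variables need a standard name assigned before CSML generation can rely on them.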

== NDG Procedure ==

=== NDG Discovery ===
 * New !DataProvider tells NDG about their OAI records in NDG-compliant format ('''currently DIF, but should be ISO''')
 * NDG sets them up as an automatic harvest.
 * Harvested records are automatically pre-processed to tidy them: OAI-style filenames and any namespaces are removed ('''which cause problems in eXist/XQuery''').
 * Pre-processed records are ingested into the 'dif' collection in the NDG Discovery eXist db. ('''currently only go into dev/glue. Should we have completely separate production OAI & ingest on superglue? Is an editorial process required? Can we check links work? If not, Discovery portal must cope with any content.''')
 * NDG Discovery Web Service and maybe NDG GUI are used to discover datasets. ('''NDG Discovery Web Service is broken! Currently using non-WS service to an old db''')
 * Indication of access constraints can be seen at this point. ('''If access constraints have been populated correctly. A common problem is that people say access_constraints = none, like an empty tag.''')

=== NDG Browse ===
 * Users select discovered datasets for browsing.
 * The Browse Web Service retrieves a MOLES stub-B record. Possibly displayed by the NDG Browse GUI. ('''GUI needs extension''')
 * User can browse the links to other MOLES objects (abiding by security constraints), back to DIFs, and to other URLs.
 * User can view or download XML documents.
 * User can select data granules.
 * User history is collected.
 * '''Where does Browse software ''need'' to be installed? And what?'''

=== Security and logging ===
 * Install NDG security software.
 * Generate role mappings with other NDG DPs.
 * Assign the DP's users into external NDG roles.
 * Assign datasets to the appropriate access role, e.g. any NDG user.
 * Interface NDG security with the DP's data browser and authentication system.
 * Ensure NDG access to the DP's data is logged: e.g. name, date, data granule id or filename.
 * Test access to the DP's data.
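
The logging requirement above (name, date, data granule id or filename) could be met with a very simple record format. The sketch below is an assumption for illustration, not NDG security software: the field names, the CSV layout, the functions `log_access` and `accesses_by_user`, and the user and granule identifiers are all hypothetical.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical log layout: one row per access.
LOG_FIELDS = ["user", "timestamp", "granule_id"]


def log_access(stream, user, granule_id, when=None):
    """Append one access record (name, date, granule id) to a text stream."""
    when = when or datetime.now(timezone.utc)
    writer = csv.DictWriter(stream, fieldnames=LOG_FIELDS)
    writer.writerow(
        {"user": user, "timestamp": when.isoformat(), "granule_id": granule_id}
    )


def accesses_by_user(stream):
    """Group logged granule ids by user, in the order they were accessed."""
    grouped = {}
    for row in csv.DictReader(stream, fieldnames=LOG_FIELDS):
        grouped.setdefault(row["user"], []).append(row["granule_id"])
    return grouped


# Usage with an in-memory log and made-up users and granules.
log = io.StringIO()
stamp = datetime(2006, 1, 1, tzinfo=timezone.utc)
log_access(log, "user_a", "rsdas/granule-001", stamp)
log_access(log, "user_b", "rsdas/granule-001", stamp)
log_access(log, "user_a", "rsdas/granule-002", stamp)

log.seek(0)
summary = accesses_by_user(log)
print(summary)
```

A per-user summary of this kind is what would let a data provider see which granules a given user has accessed, as in the RSDAS use case.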

=== Data Extractor (Data Browse) ===
 * '''Where do NDG DX services ''need'' to be installed?'''

=== Data Delivery ===
 * '''Where do NDG data delivery services ''need'' to be installed?'''
 * Ensure there is a system for real-time access to data held in archives. ('''Not currently at NOCS or BODC''')

=== Visualisation ===
 * Install local GeoSPLAT?
 * Test delivery of netCDF files from RSDAS data.
 * Test visualisation of RSDAS data in GeoSPLAT.

Back to CompleteUseCases.