Version 5 (modified by mpritcha, 9 years ago)

--

Discovery Components (DRAFT)

Introduction

This guide aims to describe in simple terms the components of the discovery service.

Last updated 2010/04/08 by Matt Pritchard

Components

  • Discovery ingest
  • Discovery index database
  • Discovery web service (API)
  • Discovery front end (portal)
  • Vocab server
  • Data providers OAI info editor

(Diagram TODO)

Overview

Data providers create documents describing data resources. These documents are metadata records, and each data provider "publishes" them to make them available for others to access. An automatic process gathers these documents from each data provider and puts them into a database, where they are stored alongside those from other data providers. A web service searches this database in response to requests received from a search interface and returns the results to that interface for presentation to the user. Search tools included in the search interface help the user construct search requests based on time periods, geographic areas and text terms from standard vocabularies.

Definitions

Data provider
Organisation (e.g. NERC data centre) that produces metadata records and publishes them via OAI.
Data resources
Things described by metadata records
Publishing
The act of putting metadata records in a system that exposes them for external access over the internet. This is done using OAI, a software toolkit installed at each data provider site. The "OAI Provider" function of this software exposes a collection of metadata records in a standard way, ready for harvesting. Each data provider controls its own OAI Provider software and should register the details of its "node" using the "OAI Providers Interface".
OAI Providers Interface
A web-based tool in which data providers enter the details of their "node" (URL plus some other configuration options), so that the automated harvesting process knows where to go to harvest metadata records.
Harvesting
A process by which metadata records are collected centrally from all participating data providers via OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting).
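As a rough illustration of what a harvest request looks like, the sketch below builds an OAI-PMH ListRecords URL and reads record identifiers and the resumption token out of a response. The verb, the metadataPrefix and resumptionToken arguments and the XML namespace come from the OAI-PMH 2.0 specification. The node URL, the sample response and the function names are invented for illustration and are not taken from the discovery service's own harvester.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a provider node (hypothetical helper)."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Made-up response, trimmed to the elements the parser reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:rec1</identifier></header></record>
    <record><header><identifier>oai:example.org:rec2</identifier></header></record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://provider.example.org/oai"))
    print(parse_list_records(SAMPLE_RESPONSE))
```

A real harvester would fetch each URL over HTTP and keep requesting with the returned resumption token until the provider sends none, repeating this for every registered node.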
Discovery Index Database
Harvested metadata records are processed centrally and added to the discovery index database. The database stores the documents in their entirety (to enable full-text searching) and also extracts pre-defined fields from them to enable specific types of searches (e.g. spatial extent, time periods). It is held as a set of relational database tables within a database server.
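The idea of keeping the whole document alongside extracted search fields can be sketched as follows. This is only a minimal illustration: the table layout, column names and sample records are assumptions, an in-memory SQLite database stands in for the real database server, and a simple LIKE match stands in for proper full-text searching.

```python
import sqlite3


def create_index(conn):
    # One row per harvested record: the full document plus extracted fields.
    conn.execute(
        """CREATE TABLE records (
               id TEXT PRIMARY KEY,
               provider TEXT,
               document TEXT,
               west REAL, south REAL, east REAL, north REAL,
               start_date TEXT, end_date TEXT
           )"""
    )


def add_record(conn, rec):
    conn.execute(
        "INSERT OR REPLACE INTO records VALUES (:id, :provider, :document, "
        ":west, :south, :east, :north, :start_date, :end_date)",
        rec,
    )


def search(conn, text=None, bbox=None, period=None):
    """Return ids of records matching every criterion that is supplied."""
    clauses, args = [], []
    if text:
        clauses.append("document LIKE ?")
        args.append(f"%{text}%")
    if bbox:
        # Bounding boxes overlap (ignores boxes that cross the antimeridian).
        west, south, east, north = bbox
        clauses.append("west <= ? AND east >= ? AND south <= ? AND north >= ?")
        args += [east, west, north, south]
    if period:
        # ISO dates compare correctly as strings.
        start, end = period
        clauses.append("start_date <= ? AND end_date >= ?")
        args += [end, start]
    where = " AND ".join(clauses) or "1"
    rows = conn.execute(f"SELECT id FROM records WHERE {where} ORDER BY id", args)
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
create_index(conn)
# Invented sample records for illustration only.
add_record(conn, {"id": "rec-a", "provider": "centre-1",
                  "document": "Sea surface temperature observations, North Atlantic",
                  "west": -60, "south": 30, "east": -10, "north": 60,
                  "start_date": "2001-01-01", "end_date": "2005-12-31"})
add_record(conn, {"id": "rec-b", "provider": "centre-2",
                  "document": "Rainfall radar composite, United Kingdom",
                  "west": -11, "south": 49, "east": 2, "north": 61,
                  "start_date": "2008-01-01", "end_date": "2009-12-31"})
```

With these two sample records, a text search for "temperature" matches only rec-a, while a bounding box of (-5, 50, 0, 55) matches only rec-b. Combining criteria narrows the result, mirroring how the search interface builds requests from text terms, geographic areas and time periods.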
Web service