
Discovery

(See the OldDiscovery page for NDG2 era information.)

The main aim of the NDG3 discovery activity is to improve the functionality of the NDG discovery service, and to fix any bugs introduced by the speed and reliability improvements made under the NERC portals project funding.

Meetings

October 01, 2008

The discovery work will be done by Steve D.

Notes on new CEDA services are available here.

Expect 60 days' work between now and the end of March, to be confirmed (Matt/Steve?).

The following activities are in scope, but prioritisation and job quantification have yet to occur (it is unlikely that all will be doable).

  • Implementing spatial proximity for bounding box searches as the default.
  • Client support for all the options provided by the server.
  • Introducing logging:
    • All searches to be properly logged.
    • All outbound links to be modified so that they are logged before redirection.
      • Consider whether this information could be used for search ranking (this would involve some code to parse the logs and update the dataset entries with some sort of link hit count, presumably daily; robots would need to be ignored).
  • Improve vocabulary server calls for multiple parameters.
  • Migrate to NERC ISO as a/the main format (depends on Andrew's ISO)
  • Implement an OGC CSW and/or an OpenSearch interface.
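
The link hit count idea in the logging item above could be sketched roughly as follows. This is a minimal illustration only, not the NDG implementation: the tab-separated log layout (timestamp, dataset id, target URL, user agent), the names count_link_hits and is_robot, and the ROBOT_MARKERS list are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical substrings marking crawler user agents; a real deployment
# would need a maintained list.
ROBOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_robot(user_agent):
    """Return True if the user-agent string looks like a crawler."""
    agent = user_agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def count_link_hits(log_lines):
    """Count outbound link hits per dataset, ignoring robots.

    Each line is assumed (for this sketch) to be tab-separated:
    timestamp, dataset id, target URL, user agent.
    """
    hits = Counter()
    for line in log_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4:
            continue  # skip malformed lines rather than fail the daily run
        _timestamp, dataset_id, _url, user_agent = fields
        if not is_robot(user_agent):
            hits[dataset_id] += 1
    return hits


# Made-up sample entries in the assumed log layout.
sample_log = [
    "2008-10-01T09:00:00\tdataset-a\thttp://example.org/a\tMozilla/5.0",
    "2008-10-01T09:05:00\tdataset-a\thttp://example.org/a\tGooglebot/2.1",
    "2008-10-01T09:10:00\tdataset-b\thttp://example.org/b\tMozilla/5.0",
    "2008-10-01T09:15:00\tdataset-a\thttp://example.org/a\tMozilla/5.0",
]
hit_counts = count_link_hits(sample_log)
print(hit_counts.most_common())
```

A daily job could run something like this over the previous day's redirect log and write the counts back to the dataset entries, where the search ranking could pick them up.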

Work plan

The tasks needed to complete MSI Discovery have been classified as follows. The main guide is a Microsoft Project Plan (available here). I have classified existing tickets according to this plan, which will govern the sequencing of tasks needed to complete the Discovery upgrades.

DI-0: MSI Project Preparation

  • DI-0-1: Upgrade proGlue to baseline
  • DI-0-2: MSI Work planning and preparation

DI-1: Improve Logging Framework

  • DI-1-1: Understand current logging system
  • DI-1-2: Create Logging database
  • DI-1-3: System to monitor links followed
  • DI-1-4: Create interface for database
  • DI-1-5: Create usage logging for monitoring

DI-2: Upgrade Discovery backend functionality

  • DI-2-1: Incorporate info from logs into searches
  • DI-2-2: Add/improve vocab interaction to searches
  • DI-2-3: Improve geospatial functionality
  • DI-2-4: Transfer to ISO as workhorse format
  • DI-2-5: Update to MOLES v2/v3
  • DI-2-6: Improve message handling (SOAP etc) to clients
  • DI-2-7: Handle ranking within resultset
  • DI-2-8: Check and update Xqueries
  • DI-2-9: Develop other service interface layers

DI-3: Upgrade Metadata Ingestion Functionality

  • DI-3-1: Upgrade ISO support
  • DI-3-2: Upgrade MOLES2/3 support
  • DI-3-3: Update Ingest scripts to allow increased functionality
  • DI-3-4: Allow providers greater control of harvesting

DI-4: Upgrade Discovery Portal front-end Functionality

  • DI-4-1: Discovery Search Upgrade
  • DI-4-2: Discovery Resultset Upgrade
  • DI-4-3: Liaise with BODC over "look and feel" of portal once functionality finalised

Current Tickets

NDG3 Tickets

#94
[M] (DI-4-) Guidelines on how to use Web services to add discovery to anyone's portal
#137
(DI-4-) (DI-2-3) Allow more complex spatial queries by using multiple coordinate systems.
#139
(DI-3-4) Have a ‘harvest now’ web page to run OAI harvest then NDG ingest records from a DP.
#336
[M] [WG] (DI-4-1) Service Binding Metadata (creation)
#442
[M] (DI-3-3) Link checker for harvested material
#443
[M] (DI-3-3) Parse access constraints a bit more carefully
#444
[M] [WG] (DI-4-) Better branding for data providers (icons in browse etc)
#563
[M] (DI-4-1) (DI-2-) Discovery - doFullTextSearch (and others) can do "order-by"
#598
[M] (DI-3-3) Handle deletions in OAI harvesting process
#600
[M] (DI-3-3) Automatically notify DPs when their OAI has failed.
#607
[M] (DI-2-) Ensure tape backups are correct for glue and superglue.
#653
[M] (DI-3-3) Cope with URLs etc in discovery identifiers
#655
[M] (DI-2-8) Moles-to-MDIP transform needs extending
#659
[M] (DI-2-8) Arbitrary online resources should not be mapped to NDG-B service URLs
#662
[M] (DI-2-6) Discovery - doPresent handling of missing / wrong namespaces
#677
[M] (DI-3-2) provenance entry in MOLES output
#697
[M] (DI-2-3) (DI-4-1) Revisit the pgsphere postgis issue
#734
[WG] (DI-4-1) Discovery Improvements - Sort by options on the result set page
#739
[M] (DI-3-1) Author search bugs
#746
[M] (DI-2-4) ISO19115/19139 support in discovery ingest
#762
(DI-4-1) Display popup box instead of tooltips on discovery/browse icons
#797
[WG] (DI-4-1) Discovery Improvements - Results page improvements 2
#798
[WG] (DI-4-1) Discovery improvements - Catalogue (D) page
#822
[D] (DI-2-3) Handle multiple bounding boxes /coverages in Discovery
#841
(DI-2-3) Bounding box display inconsistency
#889
[M] (DI-3-3) Pre-ingest validation of harvested records
#899
(DI-4-) Discovery review - Web page title, keywords and description
#900
(DI-4-1) Discovery review - Search page help text
#903
(DI-4-2) Discovery review - results page help
#917
(DI-4-) Orphaned Pylons templates
#919
[WG] (DI-4-1) (DI-2-) Confirm that temporal searching does what we think it does
#921
[WG] (DI-4-2) (DI-3-) Portal needs to summarise the sites from which harvests have occurred
#934
(DI-2-) (DI-3-) (DI-4-) semantic search returning 'server error'
#935
(DI-3-3) (DI-2-8) moles2dif xquery is still constructing a browse type URL
#947
(DI-4-1) Temporal search without Stop_date isn't supported in the DiscoveryService
#948
(DI-4-1) Search default setting on 'Source data provider'
#949
(DI-4-2) Refine search - settings not retained
#950
(DI-4-1) Search temporal coverage, improvement for usability
#952
(DI-2-) (DI-4-1) Displaying the semantic search options
#954
(DI-3-3) NERC DDS exist backup warnings
#959
(DI-4-2) Unable to view the 31st result via the next/previous links
#962
[M] (DI-3-3) Discovery ingest can't cope with vertical bar character
#971
(DI-3-3) DIF records visible through ndgbeta contain invalid tag <End_Date>
#972
(DI-2-8) DIF records contain incorrectly capitalised <Address> elements
#976
(DI-3-3) OAI Harvesting script not compatible with python2.5 installation on Glue
#1012
(DI-3-3) Problem with MDIP/DIF conversion code
#1016
(DI-1-2) Create logging database
#1018
(DI-1-4) Create interface to allow DiscoveryBE to query logging DB
#1019
(DI-1-3) Design system to report URLs followed to logging DB.
#1020
(DI-1-5) Implement discovery service stats
#1021
(DI-2-1) Add logging info into DiscoveryBE searches
#1022
(DI-2-2) Add/improve vocab interaction in DiscoveryBE searches
#1024
(DI-2-4) Update to ISO as workhorse format in DiscoveryBE
#1025
(DI-2-4) Adjust ingest to handle ISO
#1026
(DI-2-5) Update to MOLES v2/v3
#1027
(DI-2-6) Design and finalise SOAP for increased functionality
#1028
(DI-2-6) Update WSDL to reflect updated backend
#1030
(DI-2-8) Check and update all xqueries used
#1031
(DI-2-9) Produce OpenSearch interface for the DiscoveryBE
#1032
(DI-2-9) Implement a GeoNetworks/CSW interface to the DiscoveryBE.
#1033
(DI-3-1) Upgrade ISO support in Ingest
#1034
(DI-3-1) Upgrade ISO handling in BE Database.
#1035
(DI-3-2) Upgrade MOLES v2/v3 support
#1036
(DI-3-3) Update ingest scripts to allow increased functionality
#1037
(DI-3-4) Allow OAI providers to synchronise records
#1038
(DI-4-1) Update Discovery Portal with increased functionality (place-holder)
#1039
(DI-4-2) Discovery resultset upgrade (place-holder)
#1069
Make sure orderBy functionality matches required use cases
#1073
Enable OAI harvesting of MEDIN format documents
#1080
Support for structured search
#1081
Facilitate search by geographic name : option 1
#1082
Support fuzzy spelling
#1083
Support ontological matching
#1086
Produce updated WSDL and documentation
#1104
OAI Info Editor App Broken Security Plugin

