Ticket #1074 (closed task: fixed)

Opened 10 years ago

Last modified 9 years ago

MEDIN format document ingestion

Reported by: mpritcha Owned by: sdonegan
Priority: blocker Milestone: MEDIN
Component: discovery Version:
Keywords: discovery medin Cc: steve.donegan@…

Description (last modified by mpritcha)

Support for the MEDIN ISO format is required in the code that ingests harvested documents into the discovery index system. This requires:

  • Identify the elements in the MEDIN format that are equivalent to those in a DIF.
  • Map these to columns of the discovery index.
  • Update the ingest code to implement the SQL that populates these columns.
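
The three steps above can be sketched roughly as follows. This is only an illustration, not the actual ingest code: the table name `discovery_index`, the column names, the `FIELD_MAP` dictionary, and the helper functions are all hypothetical, and `sqlite3` stands in for whatever database the discovery index really uses. The XPaths assume the record follows the ISO 19139 `gmd`/`gco` encoding.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Namespaces of the ISO 19139 XML encoding.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical mapping: discovery index column -> element path in the ISO record.
_IDENT = "gmd:identificationInfo/gmd:MD_DataIdentification/"
FIELD_MAP = {
    "dataset_id": "gmd:fileIdentifier/gco:CharacterString",
    "title": _IDENT + "gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString",
    "abstract": _IDENT + "gmd:abstract/gco:CharacterString",
}

# Tiny made-up record, used only to exercise the sketch.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier><gco:CharacterString>example-001</gco:CharacterString></gmd:fileIdentifier>
  <gmd:identificationInfo><gmd:MD_DataIdentification>
    <gmd:citation><gmd:CI_Citation>
      <gmd:title><gco:CharacterString>Example survey</gco:CharacterString></gmd:title>
    </gmd:CI_Citation></gmd:citation>
    <gmd:abstract><gco:CharacterString>Example abstract text.</gco:CharacterString></gmd:abstract>
  </gmd:MD_DataIdentification></gmd:identificationInfo>
</gmd:MD_Metadata>"""


def extract_fields(xml_text):
    """Return {column: text or None} for every column in FIELD_MAP."""
    root = ET.fromstring(xml_text)
    row = {}
    for column, path in FIELD_MAP.items():
        node = root.find(path, NS)
        has_text = node is not None and node.text
        row[column] = node.text.strip() if has_text else None
    return row


def ingest(conn, xml_text):
    """Insert the mapped fields of one record using a parameterised statement."""
    row = extract_fields(xml_text)
    columns = list(FIELD_MAP)
    sql = "INSERT INTO discovery_index ({}) VALUES ({})".format(
        ", ".join(columns), ", ".join("?" for _ in columns)
    )
    conn.execute(sql, [row[c] for c in columns])
    return row
```

Keeping the mapping in one dictionary means that a new searchable field only needs a new entry (plus the matching column), and the SQL is built from trusted column names while the values themselves are passed as parameters.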

Change History

comment:1 Changed 10 years ago by mpritcha

  • Owner changed from awoolf to sdonegan
  • Component changed from CSML to discovery

comment:2 Changed 10 years ago by mpritcha

  • Description modified (diff)

comment:3 follow-up: ↓ 4 Changed 10 years ago by jdoughty

Steve, Matt, does this imply that the DDS will only be populated with the fields that match between DIF and ISO? I assumed that the entire metadata record would be loaded into the DDS in some format so that the record can be searched in its entirety by the MEDIN Portal. James

comment:4 in reply to: ↑ 3 Changed 10 years ago by mpritcha

  • Cc steve.donegan@… added

Replying to jdoughty:

Steve, Matt, does this imply that the DDS will only be populated with the fields that match between DIF and ISO? I assumed that the entire metadata record would be loaded into the DDS in some format so that the record can be searched in its entirety by the MEDIN Portal. James

The entire metadata record will be loaded into the discovery index database, enabling a full-text search of the document. But we need to know which specific fields MEDIN need to be able to search against, so that we can index them and enable targeted searches on those fields. Since the ISO format (with MEDIN as the prime candidate) will be the main format in use (at least in due course), the mapping between MEDIN ISO fields and the discovery index db needs to be worked out, just as we previously worked out a mapping between DIF and this (internal) db format. That's my understanding anyway. Matt.
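
To make the distinction concrete, here is a minimal sketch of keeping the whole document for full-text search alongside a separately indexed field. Everything in it is an assumption for illustration: the `discovery_index` table, its columns, and the function names are hypothetical, `sqlite3` is only a stand-in for the real database, and a plain `LIKE` match stands in for a proper full-text index.

```python
import sqlite3


def make_index():
    """Create an in-memory stand-in for the discovery index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE discovery_index ("
        " dataset_id TEXT PRIMARY KEY,"
        " title TEXT,"              # specific field extracted via the mapping
        " original_document TEXT)"  # the entire harvested record
    )
    return conn


def add_record(conn, dataset_id, title, document):
    """Store the mapped field together with the complete document."""
    conn.execute(
        "INSERT INTO discovery_index VALUES (?, ?, ?)",
        (dataset_id, title, document),
    )


def search_full_text(conn, term):
    """Match the term anywhere in the stored document."""
    rows = conn.execute(
        "SELECT dataset_id FROM discovery_index"
        " WHERE original_document LIKE ? ORDER BY dataset_id",
        ("%" + term + "%",),
    )
    return [r[0] for r in rows]


def search_title(conn, term):
    """Match the term only in the title field."""
    rows = conn.execute(
        "SELECT dataset_id FROM discovery_index"
        " WHERE title LIKE ? ORDER BY dataset_id",
        ("%" + term + "%",),
    )
    return [r[0] for r in rows]
```

A term that appears only in the body of a record is found by `search_full_text` but not by `search_title`, which is why the fields MEDIN want to search on specifically have to be identified and mapped up front.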

comment:5 Changed 9 years ago by sdonegan

  • Status changed from new to closed
  • Resolution set to fixed

MEDIN-format ISO documents can now be ingested into the MEDIN discovery database. A list of extra fields to add (forming the basis of new targeted searches) has been established, and these fields have been added to the database. We have ensured that these searchable fields can also be returned as text, to support the additional information requested in API results.
