
Proposal #1: custom metadata mapping

Emanuele Tajariol edited this page Apr 28, 2014 · 3 revisions

Intro

The CKAN CSW harvester (implemented in the ckanext-spatial extension) extracts information from ISO19139 records using a wide but fixed set of XPath expressions.

Site developers and admins may need to extract other information that is not already mapped. It should be possible to add new field mappings without editing the Python code.

Existing implementation notes

The existing implementation that extracts data from the ISO records is split into two steps:

Proposal

The harvester configuration will optionally contain an extra_mappings field.
It will be a map in which each entry has:

  • key: the name of the extra field that will be created
  • value: the XPath expression selecting the data that will be extracted
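
As an illustration, a configuration carrying this field might be read as in the sketch below. The field names (`purpose`, `dataset_uri`) and the XPath expressions are hypothetical examples, not mappings defined by this proposal, and the sketch assumes that the configuration is supplied as a JSON object and that the `gmd` and `gco` prefixes are bound by the harvester.

```python
import json

# Hypothetical configuration: the field names and XPath expressions below
# are illustrative only and are not part of the proposal.
config_json = """
{
  "extra_mappings": {
    "purpose": "gmd:identificationInfo/gmd:MD_DataIdentification/gmd:purpose/gco:CharacterString/text()",
    "dataset_uri": "gmd:dataSetURI/gco:CharacterString/text()"
  }
}
"""

config = json.loads(config_json)

# The field is optional, so fall back to an empty map when it is absent.
extra_mappings = config.get("extra_mappings", {})

for field_name, xpath in extra_mappings.items():
    print(field_name, "<-", xpath)
```

Falling back to an empty map keeps configurations without extra_mappings working unchanged.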

Note that an XPath expression selects a node set from an XML document, so the following cases need to be handled:

  • Text nodes only:
    We only want to handle XPath expressions that select text nodes. If the XPath does not select a text node, the harvester may throw an error.
  • Multiple values:
    If more than one text node is selected by a single XPath, the corresponding extra field will contain a list of strings encoded as a JSON array.
  • Empty values:
    If the XPath does not select anything, the extra field will be created as an empty string.
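
The three rules above could be sketched as a small helper that converts the result of one XPath evaluation into the value of the extra field. This is only a sketch under stated assumptions: the function name `encode_extra_value` is hypothetical, the XPath engine is assumed to return a list in which text nodes are strings (as lxml does for `text()` selections), and a single selected text node is assumed to be stored as a plain string, which the proposal does not state explicitly.

```python
import json


def encode_extra_value(selected):
    """Convert the nodes selected by one XPath into an extra field value.

    `selected` is assumed to be the list returned by the XPath engine,
    with text nodes represented as strings (or str subclasses).
    """
    # Text nodes only: reject anything that is not a string.
    for node in selected:
        if not isinstance(node, str):
            raise ValueError("XPath selected a non-text node: %r" % (node,))

    # Empty values: nothing selected yields an empty string.
    if not selected:
        return ""

    # Assumption: a single text node is stored as a plain string.
    if len(selected) == 1:
        return str(selected[0])

    # Multiple values: a list of strings encoded as a JSON array.
    return json.dumps([str(node) for node in selected])


print(encode_extra_value([]))                  # empty string
print(encode_extra_value(["Roads"]))           # plain string
print(encode_extra_value(["Roads", "Rails"]))  # JSON array of strings
```

Raising an error for non-text nodes is one possible reading of "may throw an error"; skipping such mappings with a logged warning would be an equally valid choice.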