Proposal #1: custom metadata mapping
Emanuele Tajariol edited this page Apr 28, 2014 · 3 revisions
The CKAN CSW harvester (implemented in the ckanext-spatial extension) extracts information from ISO19139 records using a wide but fixed set of XPath expressions.
Site developers/admins may need to extract other information that is not already mapped. It should be possible to add new field mappings without editing the Python code.
The existing implementation that extracts data from the ISO records is split into two steps:
- The first step extracts the bare information from the record. You can find it here:
  http://github.com/ckan/ckanext-spatial/blob/master/ckanext/spatial/model/harvested_metadata.py
  At the end of the same file there are some infer_** methods that try to extract single-valued data from a set of nodes.
- The second step is performed in the import stage. The method get_package_dict() in the file base.py
  http://github.com/ckan/ckanext-spatial/blob/master/ckanext/spatial/harvesters/base.py#L154
  maps the extracted values into static and extra fields in the CKAN dataset.
The configuration will optionally contain the extra_mappings field.
It will be a map with these contents:
- key: the name of the extra field that will be created
- value: the xpath of the data that will be extracted
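As an illustration of the key/value map described above, here is a minimal sketch of how such a configuration could look and be read. It assumes the configuration is supplied as a JSON object; the field names `lineage` and `topic_category`, the XPath expressions, and the helper `load_extra_mappings` are hypothetical and only show the shape of the data.

```python
import json

# Hypothetical configuration: the keys and XPaths below are examples only,
# they are not taken from the existing harvester code.
config_text = """
{
  "extra_mappings": {
    "lineage": "gmd:dataQualityInfo/gmd:DQ_DataQuality/gmd:lineage/gmd:LI_Lineage/gmd:statement/gco:CharacterString/text()",
    "topic_category": "gmd:identificationInfo/gmd:MD_DataIdentification/gmd:topicCategory/gmd:MD_TopicCategoryCode/text()"
  }
}
"""


def load_extra_mappings(text):
    """Return the extra_mappings map, or an empty dict when it is absent."""
    config = json.loads(text) if text else {}
    mappings = config.get("extra_mappings", {})
    if not isinstance(mappings, dict):
        raise ValueError("extra_mappings must map extra field names to XPaths")
    return mappings


extra_mappings = load_extra_mappings(config_text)
for field_name, xpath in extra_mappings.items():
    # key: name of the extra field, value: XPath of the data to extract
    print(field_name, "<-", xpath)
```

Because the field is optional, the sketch treats a missing `extra_mappings` entry as an empty map, so existing configurations would keep working unchanged.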
Note that an XPath extracts a nodeset from an XML document.
- Text nodes only: We only want to handle XPath expressions that select text nodes. If the XPath does not select a text node, the harvester may throw an error.
- Multiple values: If a single XPath selects more than one text node, the corresponding extra field will contain a list of strings encoded as a JSON array.
- Empty values: If the XPath does not select anything, the extra field will be created as an empty string.
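The three rules above could be sketched as a single conversion function. This is only an illustration, not the harvester's code: the name `nodeset_to_extra_value` is hypothetical, the input is assumed to be the list returned by evaluating one XPath, and returning a lone match as a plain string is an assumption, since the proposal only spells out the multiple and empty cases.

```python
import json


def nodeset_to_extra_value(nodes):
    """Turn the result list of one XPath evaluation into an extra field value."""
    # Text nodes only: reject anything that is not a string
    # (for example an element node selected by mistake).
    for node in nodes:
        if not isinstance(node, str):
            raise ValueError("XPath must select text nodes, got %r" % (node,))

    # Empty values: nothing selected, so the extra is an empty string.
    if not nodes:
        return ""

    # Assumption: a single match is stored as a plain string.
    if len(nodes) == 1:
        return str(nodes[0])

    # Multiple values: a list of strings encoded as a JSON array.
    return json.dumps([str(node) for node in nodes])


print(nodeset_to_extra_value([]))                    # prints an empty line
print(nodeset_to_extra_value(["Roads"]))             # prints: Roads
print(nodeset_to_extra_value(["Roads", "Rivers"]))   # prints: ["Roads", "Rivers"]
```

Raising an error for non-text nodes is one possible reading of "may throw an error"; an implementation could equally log a warning and skip the mapping.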