How to store and serve metadata #34
Comments
A) MVP: Each metadata file (related to one or more downloads) is saved as name-of-data.md in the _dataset folder. Adding a copy of the .md file to the corresponding dataset zip at upload time is therefore not optimal, as the file could change later, and collating the .md into the zip at download time is not easily achievable on GH-JKAN. See also GFDRR/rdl-jkan#7
See how our general attributes can be matched with Dublin Core: https://en.wikipedia.org/wiki/Dublin_Core
The contribution table now matches 13 of the 15 Dublin Core (DC) basic attributes, although the RDL field names differ from the DC ones.
Two DC attributes are not yet used, which is not a problem since DC defines all attributes as optional.
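A field-name mapping like the one described above could be checked with a small script. This is only a sketch: the 15 element names are those of the Dublin Core element set, but the RDL field names in `RDL_TO_DC` are hypothetical placeholders, not the actual contribution table.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}

# Hypothetical RDL field name -> Dublin Core element (illustrative only,
# not the real RDL contribution table).
RDL_TO_DC = {
    "dataset_name": "title",
    "summary": "description",
    "license": "rights",
    "provider": "publisher",
}


def unmapped_dc_elements(mapping):
    """Return the Dublin Core elements that no RDL field maps to."""
    unknown = set(mapping.values()) - DC_ELEMENTS
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {sorted(unknown)}")
    return sorted(DC_ELEMENTS - set(mapping.values()))


def to_dublin_core(record, mapping):
    """Rename the keys of an RDL record to their Dublin Core equivalents."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}
```

Running `unmapped_dc_elements` on the real mapping would list the DC attributes still unused, and `to_dublin_core` shows that differing field names are not an obstacle to exporting DC-style records.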
This whole discussion is superseded by the work of @ldodds and Jean on aligning metadata to DCAT.
Also see previous discussion: https://github.com/GFDRR/rdl-website/issues/80
This is a critical point for RDL, and has been discussed at length on several occasions. I thought it would be useful to summarise the current status and the options, with pros and cons, for further discussion with Leigh.
We need two types of solutions:
A) One quick solution for JKAN MVP (already prototyped)
Because there is no DB and the catalogue just links to zip files stored on S3, the metadata is stored in a text file within the zip. The file is linked to the JKAN schema, which has been modified to mirror the RDL attributes; users can read the same file with a text editor, but have to consult the documentation for an explanation of the fields.
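As a rough illustration of option A, the metadata text file could be read straight out of a downloaded zip. This sketch assumes the file is named `metadata.md` and holds flat `key: value` lines between `---` markers (Jekyll-style front matter); both the file name and the field names are assumptions, and nested YAML is not handled.

```python
import io
import zipfile


def read_metadata_from_zip(zip_source, metadata_name="metadata.md"):
    """Return flat key/value pairs from the front matter of a file in a zip."""
    with zipfile.ZipFile(zip_source) as archive:
        text = archive.read(metadata_name).decode("utf-8")

    fields = {}
    inside = False
    for line in text.splitlines():
        if line.strip() == "---":
            if inside:
                break  # closing marker: front matter is complete
            inside = True
            continue
        if inside and ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


def build_example_zip():
    """Build an in-memory zip with a hypothetical metadata file and data file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "metadata.md",
            "---\ntitle: Example flood hazard layer\nlicense: CC-BY-4.0\n---\n",
        )
        archive.writestr("data.csv", "id,value\n1,0.5\n")
    buffer.seek(0)
    return buffer
```

Calling `read_metadata_from_zip(build_example_zip())` returns the two fields as a dictionary. The parsed keys still need the documentation to be interpreted, which is the limitation noted above.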
B) One optimal solution for next redesign
Points of discussions:
1. What is data, what is metadata
Not a trivial question; in the PostgreSQL, csv-based implementation of RDL, both schema attributes and data tables were stored and indexed together; there is no clear separation between data and metadata.
Moving towards a file-based approach with a dynamic DB, we are separating what is data (stored in S3) from what is metadata (stored in the DB, indexed and searchable).
The metadata would strictly mirror the attributes of the RDL schema.
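The split between data and metadata could look roughly like this. It is a minimal sketch only: SQLite stands in for whatever DB is eventually chosen, the table and column names are hypothetical rather than the RDL schema, and the `s3://` URLs are placeholders.

```python
import sqlite3


def create_catalogue():
    """Create an in-memory metadata catalogue; the data itself stays in S3."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dataset_metadata ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " description TEXT,"
        " data_url TEXT NOT NULL)"  # pointer to the zip stored in S3
    )
    return conn


def add_dataset(conn, title, description, data_url):
    """Register one dataset: searchable metadata plus a link to its data."""
    conn.execute(
        "INSERT INTO dataset_metadata (title, description, data_url)"
        " VALUES (?, ?, ?)",
        (title, description, data_url),
    )


def search(conn, term):
    """Return (title, data_url) for datasets whose metadata mentions term."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT title, data_url FROM dataset_metadata"
        " WHERE title LIKE ? OR description LIKE ? ORDER BY title",
        (pattern, pattern),
    )
    return rows.fetchall()
```

Only the metadata rows are queried; the search result hands back the location of the data file without the DB ever holding the data.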
2. Adoption of standards for creating and exchanging metadata
There are several examples of standard metadata profiles, and there's actually a standard on how to create them (ISO 19106). These standards can cover various levels and types of information. Here are also some examples for geo and non-geo metadata profiles/extensions in different domains: https://rd-alliance.github.io/metadata-directory/extensions/
It has been proposed to create our own ISO-based metadata profile for RDL as it was done for INSPIRE.
See section 4.1 in this paper for some examples including NAP and INSPIRE https://www.mdpi.com/2220-9964/8/6/280
Stu has produced a review of existing standards:
https://drive.google.com/file/d/1ksCfm4OVgwKUn50e2eQ56lgqQ1yx7vjq/view?usp=sharing
and tried to match the existing ISO metadata schema to the key attributes of RDL:
https://drive.google.com/file/d/1Mtn2SRl8hSfemm1_-mjja89KSbnSE13p/view
Most GIS software will be able to read the "core" metadata, which has machine-oriented but also human-friendly elements (abstract, point of contact, license, etc.). GFDRR invested quite a bit in improving how metadata is handled in GeoNode and QGIS, to make reading and editing more human-friendly - yet we are far from an optimal implementation.
The preferred format of exchange is xml, which comes in the same zip as the shp or tif.
Example of ISO metadata file
Example of DCAT metadata file
More often than not, risk layers from WB projects come without any -interesting- xml, meaning that only basic GIS info is stored. Only a fraction of this information is human-understandable when the file is opened in a browser.
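One way to spot such "empty" metadata is to check whether the xml carries a human-readable abstract. The sketch below assumes the file follows the ISO 19139 XML encoding with the usual `gmd` and `gco` namespaces; the sample documents used with it would be made up, not taken from a real layer.

```python
import xml.etree.ElementTree as ET

# Namespaces of the ISO 19139 XML encoding (assumed to be the one in use).
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def read_abstract(xml_text):
    """Return the abstract text, or None if it is missing or blank."""
    root = ET.fromstring(xml_text)
    node = root.find(".//gmd:abstract/gco:CharacterString", NS)
    if node is None or not (node.text or "").strip():
        return None
    return node.text.strip()
```

A `None` result flags a layer whose metadata would need to be completed by hand before publication; the same pattern could be repeated for other human-friendly elements such as the point of contact or license.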
--WIP--