Some common problems:
* Whitespace, tabs, line feeds and plus signs: should be handled by
float()
* Text: log error and skip creation of spatial extra
* Same set of 2 coords for extent: create point instead of polygon
Note that the bbox values are stored as they are in the bbox-xx-yy
extras
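The handling listed above could be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the function name `extent_geometry` and the GeoJSON-like return shape are assumptions made for the example.

```python
import logging

log = logging.getLogger(__name__)


def extent_geometry(west, south, east, north):
    """Hypothetical helper: build a geometry dict from four bbox values.

    Returns None when any value is not numeric, so the caller can skip
    creating the spatial extra.
    """
    try:
        # float() tolerates surrounding whitespace, tabs, line feeds
        # and a leading plus sign, so no extra cleaning is needed.
        xmin, ymin, xmax, ymax = (float(v) for v in (west, south, east, north))
    except (TypeError, ValueError) as e:
        log.error("Bounding box is not numeric, skipping spatial extra: %s", e)
        return None

    if xmin == xmax and ymin == ymax:
        # Both corners coincide, so the extent is a point, not a polygon.
        return {"type": "Point", "coordinates": [xmin, ymin]}

    # Closed ring: the first and last positions are the same.
    ring = [[xmin, ymin], [xmin, ymax], [xmax, ymax], [xmax, ymin], [xmin, ymin]]
    return {"type": "Polygon", "coordinates": [ring]}
```

Returning None rather than raising keeps the "log error and skip" behaviour in one place, at the cost of the caller having to check the result.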
The 'guess_resource_format' function looks for common patterns in popular
geospatial services and file extensions. It only looks at the provided URL
and does not attempt any remote check. By default, if no match is found it
falls back to the mimetypes module to try to guess the format.
In the previous version, all resources in documents of type 'service' were
queried to see if they were actually WMS. This is no longer the case,
but services flagged as 'wms' can still be verified if the following
setting is set to True: ckanext.spatial.harvest.validate_wms
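A rough sketch of the URL-only approach described above. The function name `guess_format_sketch`, the `use_mimetypes` flag and the two pattern tables are illustrative assumptions; the patterns that the real function checks are not listed here.

```python
import mimetypes

# Illustrative patterns only (assumed for this example).
SERVICE_PATTERNS = [
    ("wms", ("service=wms", "geoserver/wms")),
    ("wfs", ("service=wfs", "geoserver/wfs")),
]
FILE_EXTENSIONS = [
    ("kml", (".kml",)),
    ("kmz", (".kmz",)),
    ("gml", (".gml",)),
]


def guess_format_sketch(url, use_mimetypes=True):
    """Guess a resource format from the URL alone; no request is made."""
    url = url.lower().strip()

    # 1. Substrings typical of geospatial service endpoints.
    for fmt, parts in SERVICE_PATTERNS:
        if any(part in url for part in parts):
            return fmt

    # 2. Well-known geospatial file extensions.
    for fmt, extensions in FILE_EXTENSIONS:
        if url.endswith(extensions):
            return fmt

    # 3. Optional fallback to the standard library's mimetypes module.
    if use_mimetypes:
        mime_type, _encoding = mimetypes.guess_type(url)
        if mime_type:
            return mime_type

    return None
```

Because nothing is fetched, a URL that merely looks like a WMS endpoint is reported as 'wms'; that is the case the validate_wms setting is meant to cover.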
Some improvements on the endpoints that return the contents of the
harvest objects:
* Nicer URLs with redirects to the old ones
* Returning the raw harvest object content is available on the main
harvest extension, so just redirect there
* Support for showing the original document of a harvest object, if
present
* Support for defining a custom XSLT for the HTML view, via
ckanext.spatial.harvest.xslt_html_content and
ckanext.spatial.harvest.xslt_html_content_original
The import CLI reruns the import stage for the last current objects, so
when running it, the previous objects don't need to be changed. Any
date check is overridden to force the update of the package.
This can be set instance wide on the ini file with
ckanext.spatial.harvest.continue_on_validation_errors
or per source, adding continue_on_validation_errors=true to the source
config.
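As an illustration of the two places the option can live, the sketch below reads both. It assumes that the source config is a JSON string and that either setting being true is enough to continue; the text above does not spell out the precedence, and the function name is hypothetical.

```python
import json

INI_KEY = "ckanext.spatial.harvest.continue_on_validation_errors"


def continue_on_validation_errors(ini_config, source_config_json):
    """Hypothetical helper combining the instance-wide and per-source settings."""
    # Instance-wide value, as it would appear in the ini file (a string).
    ini_value = str(ini_config.get(INI_KEY, "false"))
    instance_wide = ini_value.strip().lower() in ("true", "yes", "on", "1")

    # Per-source value, taken from the harvest source's JSON config.
    source_config = json.loads(source_config_json) if source_config_json else {}
    per_source = bool(source_config.get("continue_on_validation_errors", False))

    # Assumption: either setting enables the behaviour.
    return instance_wide or per_source
```

For example, a source config of `{"continue_on_validation_errors": true}` enables the behaviour for that source even when the ini option is absent.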
You don't need to create a 'harvest' sysadmin user any more.
By default this will be the internal site admin user. This is the
recommended setting, but if necessary it can be overridden by
the `ckanext.spatial.harvest.user_name` config option, e.g. to
support the old hardcoded 'harvest' user.
This can be overridden by custom harvesters that want to support non-ISO
formats (like FGDC). It is called when the original_document and
original_format harvest object extras are present. Custom harvesters are
responsible for transforming the original document to ISO.
Prior to the merging of the new spatial harvesters, the existing ones
based on Gemini and UKLP have been moved to their own namespace
(ckanext.spatial.harvesters.gemini). The plugin points have been updated
so users currently using these harvesters will still be able to use them
as normal.
The base harvester (SpatialHarvester) has been updated with new methods,
most significantly '_get_package_dict' and 'import_stage'. Note that
SpatialHarvester now extends HarvesterBase on ckanext-harvest, which had
some of its methods updated.
TODO: still some geo.data.gov specific bits!
Changes in multiplicity to support the ISO 19115 spec rather than just
the Gemini 2 one. Thanks to @dread for his help on this.
Summary of the changes:
* dataset-reference-date: Set to 1..*
Note that there was a bug with multiple values allowed per date.
Returned object should now be like:
"dataset-reference-date": [{"type": "creation", "value": "2004-02"},
{"type": "revision", "value": "2006-07-03"}]
* metadata-language: Set to 0..1
* resource-type: Set to *. That means that a list is now returned
* bbox: Set to *. Note that bboxes are now returned as objects such as:
[{"north":xxx, "south":xxx, "east":xxx, "west":xxx}, {"north":xxx,
"south":xxx, "east":xxx, "west":xxx}]
The existing Gemini based harvesters and validators have been adapted,
all tests pass.
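Code that consumes the parsed values has to treat these elements as lists after the multiplicity changes listed above. A small sketch, with made-up sample values for resource-type and bbox, and hypothetical helper names:

```python
# Sample values shaped like the structures described above.
# The resource-type and bbox values are invented for the example.
iso_values = {
    "dataset-reference-date": [
        {"type": "creation", "value": "2004-02"},
        {"type": "revision", "value": "2006-07-03"},
    ],
    "resource-type": ["dataset"],
    "bbox": [
        {"north": "60.0", "south": "50.0", "east": "2.0", "west": "-8.0"},
    ],
}


def dates_by_type(values):
    """Group reference dates by type; a type may carry several values."""
    grouped = {}
    for date in values.get("dataset-reference-date", []):
        grouped.setdefault(date["type"], []).append(date["value"])
    return grouped


def first_bbox(values):
    """Return the first bounding box, or None when the document has none."""
    bboxes = values.get("bbox") or []
    return bboxes[0] if bboxes else None
```

Callers that previously expected a single bbox dict or a single resource-type string would need a similar adjustment.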
Add new mandatory fields when creating sources, status dict has new
keys, CKAN lower cases formats, take into account harvest source
datasets.
Added a local getcapabilities response to avoid remote 404s.
Note that the TestValidation tests need to be fixed, as 27c4ee81e
removed the validation from the gather stage.
- Added a config option ('ckanext.spatial.use_postgis_sorting') to
activate this as this behaviour will be deprecated in the future
in favour of Solr 4 spatial sorting capabilities.
Also fixed the tests
Conflicts:
ckanext/spatial/plugin.py