Commit Graph

67 Commits

Author SHA1 Message Date
amercader b428c33ff6 [#69] Add config option to keep old behaviour (not reindex) 2014-06-10 18:08:38 +01:00
amercader dbf139e732 [#63] Added extension point for defining custom validators 2014-05-13 18:07:14 +01:00
amercader 6c55aad223 [#63] Add extra stuff to the get_package_dict extension point
Moved the call to get_site_user higher on base.py so it's available to
extensions. Also added the parsed XML etree so it does not need to be
parsed from the string again.
2014-05-13 18:03:12 +01:00
amercader 5461bebb62 Merge branch 'master' into 63-extend-spatial-harvesters 2014-05-13 17:54:44 +01:00
amercader fa4161af87 [#70] Fix deletion of harvested CSW records
The object id was pushed to the list returned by gather_stage before
being saved on the db, so None was added, causing an exception in the
Redis queue
2014-05-13 17:53:18 +01:00
amercader b25a01029a Merge branch 'master' into 63-extend-spatial-harvesters
Conflicts:
	ckanext/spatial/harvesters/base.py
2014-05-13 15:33:04 +01:00
amercader ab241d2530 Pass defer_commit in context on get_site_user calls
See ckan/ckan#1714. Until that is fixed properly, the `defer_commit`
flag avoids some `DetachedInstanceErrors` happening during the
harvesting.
2014-05-13 15:30:34 +01:00
amercader 57b7e51e5a Merge branch '69-reindex-dataset-no-object-change' into 63-extend-spatial-harvesters 2014-04-30 18:02:15 +01:00
amercader e979d08e77 [#69] Reindex dataset if harvest object did not change
We replace the old harvest object with the new one, and if we don't
reindex the reference to the old harvest object will remain in the
dataset dict
2014-04-30 18:01:42 +01:00
amercader 0513e360e9 [#63] Add previous_object check
In rare cases (eg if there was a previous error of two objects sharing
a guid) we can have a "changed" state and no previous_object
2014-03-19 12:46:01 +00:00
amercader 119c0fd40c [#63] Add user to delete context to avoid exception 2014-03-19 12:45:49 +00:00
amercader 13f03878e2 Merge branch 'master' into 63-extend-spatial-harvesters 2014-03-14 12:36:36 +00:00
amercader badd723259 [#63] Add new ISpatialHarvest interface
Two extension points: ``get_package_dict`` and ``transform_to_iso``,
with the same expected behaviour as the old hooks meant to be overridden.

For ``get_package_dict`` we now pass, apart from the generated
package_dict, the parsed iso_values and the harvest object.

Updated docs and added autodocs.
2014-03-14 11:30:26 +00:00
etj 9116a6fd1f [#55] Allow CSW harvesters to define CQL filters (2nd try) 2014-03-02 23:12:54 +01:00
fxia a19010d8e5 for progress, use gemini_values not iso_values 2013-09-10 10:57:15 -04:00
fxia c1fe37647f change progress multiplicity to * 2013-09-09 14:47:14 -04:00
fxia a9414e755d add progress into the iso values 2013-08-29 00:05:24 -04:00
amercader c6e29ee25f [#35] Ignore auth when using site_user 2013-08-14 12:23:00 +01:00
kindly add78d5931 allow csw to fetch different output schema 2013-05-18 18:28:25 +01:00
amercader 27521221d6 [#20] Flag datasets created via the spatial harvesters via a generic extra 2013-05-15 16:58:12 +01:00
amercader 8e81d1bd69 [#19] Extract thumbnail from ISO documents 2013-05-15 16:41:36 +01:00
amercader b4a7cf2289 [#15] Reenable the Solr backend on master
It can be used against CKAN core master (eventually 2.1)
2013-05-14 14:34:10 +01:00
amercader ce8747198f Merge branch 'release-v2.0'
Conflicts:
	README.rst
	ckanext/spatial/commands/validation.py
	ckanext/spatial/controllers/api.py
	ckanext/spatial/harvesters/gemini.py
	ckanext/spatial/plugin.py
	ckanext/spatial/tests/lib/test_spatial.py
	ckanext/spatial/tests/model/test_harvested_metadata.py
	ckanext/spatial/tests/test_harvest.py
2013-05-14 14:02:28 +01:00
amercader 83d903f84f Revert "[#15] Ensure that bounding boxes are defined counter-clockwise"
Reverting #15 as CKAN 2.0 does not include the necessary changes.

This reverts commit fede0b0831.
2013-05-13 18:55:27 +01:00
amercader 45f4f4da57 [#16] Ignore time zones when parsing harvest object modified date
Otherwise you get this exception when the date on the document has time
zone information, as dates are stored without it in the database:

TypeError: can't compare offset-naive and offset-aware datetimes
2013-04-25 17:13:03 +01:00
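
The failure described in 45f4f4da57 can be reproduced with the standard library alone. The following is a minimal sketch, not the harvester's actual code: the helper name `to_naive` and the sample dates are assumptions made for illustration. It drops the time zone information (it does not convert to UTC), which matches the "ignore time zones" wording of the commit.

```python
from datetime import datetime, timedelta, timezone


def to_naive(dt):
    """Hypothetical helper: discard tzinfo so the value can be compared
    with the naive datetimes stored in the database."""
    return dt.replace(tzinfo=None)


# Naive value, as it would come back from the database (assumed sample).
stored = datetime(2013, 4, 25, 12, 0, 0)
# Aware value, as parsed from a document that carries an offset (assumed sample).
document = datetime(2013, 4, 25, 17, 13, 3,
                    tzinfo=timezone(timedelta(hours=1)))

raised = False
try:
    document > stored  # mixing aware and naive values is not allowed
except TypeError:
    raised = True

# After dropping the offset, the comparison works.
is_newer = to_naive(document) > stored
```

Ignoring the offset keeps the comparison simple, at the cost of being off by the offset amount when the document and the database disagree on the zone.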
amercader 51a2b20501 Merge branch '15-solr-based-spatial-search' into release-v2.0 2013-04-12 10:56:21 +01:00
amercader 822ddbb1b5 [#8] Don't add object id to error so it can be aggregated 2013-04-12 10:54:02 +01:00
amercader 970dfd1b68 Merge branch 'release-v2.0' into 15-solr-based-spatial-search 2013-04-11 12:51:08 +01:00
kindly d1594b3790 do not use kwarg for unicode errors 2013-04-09 12:06:38 +01:00
amercader 65e056d519 Merge branch 'release-v2.0' into 15-solr-based-spatial-search 2013-04-03 12:14:20 +01:00
amercader ff25ff2f2b [#8] Abort import stage if get_package_dict returns nothing 2013-04-02 18:40:03 +01:00
amercader 0c98e6ec4c [#8] Minor fix in single doc harvester 2013-03-27 17:38:42 +00:00
amercader fede0b0831 [#15] Ensure that bounding boxes are defined counter-clockwise
To return correct results on a spatial query, rectangle geometries must
be defined in counter-clockwise order [1]. This changeset adds a small
sanity check to before_index when we are dealing with a Polygon geometry
that has 5 coordinate pairs. Shapely is used to generate a LinearRing
from the polygon coordinates and check if they are ccw. If not, they are
reordered and a new polygon is generated so the WKT sent to Solr is
properly ordered.

The GeoJSON template used for extents in the base spatial harvester has
also been updated to define the coordinates counter-clockwise.

[1]
http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4#JTS_.2BAC8_WKT_.2BAC8_Polygon_notes
2013-03-23 19:28:31 +00:00
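
The orientation check described in fede0b0831 can be sketched as follows. The commit uses Shapely's LinearRing for the test; this sketch swaps in the shoelace formula (signed area) so that it runs without dependencies. All function names here are hypothetical, and the 5-pair guard simply mirrors the rectangle case mentioned in the commit message.

```python
def signed_area(ring):
    """Shoelace formula over a closed ring of (x, y) pairs.
    The result is positive when the ring is counter-clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def ensure_ccw(ring):
    """Return the ring in counter-clockwise order.
    Only rings with 5 coordinate pairs (closed rectangles) are touched."""
    if len(ring) == 5 and signed_area(ring) < 0:
        return list(reversed(ring))
    return list(ring)


def to_wkt(ring):
    """Serialize a single closed ring as a WKT polygon."""
    coords = ", ".join("%s %s" % (x, y) for x, y in ring)
    return "POLYGON ((%s))" % coords


# A rectangle listed clockwise: up, right, down, back to the start.
clockwise = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
wkt = to_wkt(ensure_ccw(clockwise))
```

Reversing a closed ring keeps it closed, since the first and last pairs are identical, so the reordered polygon stays valid.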
amercader 40967385b0 [#8] Fix typo in WMS format detection 2013-03-21 17:53:38 +00:00
amercader 0e0b5a2cc2 [#8] Fix bug that prevented setting a default resource name 2013-03-21 14:34:42 +00:00
amercader 627c4c58e0 [#8] Fix bug that prevented setting a default resource name 2013-03-21 14:34:23 +00:00
amercader a7fc19768b [#8] Remove print commands from WAF harvester 2013-03-14 17:45:34 +00:00
amercader eb201e1759 [#8] Waf harvester: improve exception and return empty list if no records 2013-03-14 17:35:51 +00:00
amercader 0aafffc8dc [#8] Capture exceptions during request in WAF harvester 2013-03-14 14:56:42 +00:00
amercader d2723c3020 [#8] resource-type not always present 2013-03-14 14:30:16 +00:00
amercader a76a8d2ca7 [#8] Don't use object id so messages can be grouped 2013-03-14 12:37:35 +00:00
amercader 4638b3899f Revert "[#8] Don't use object id so messages can be grouped"
This reverts commit 032cc4d961.
2013-03-14 12:36:39 +00:00
amercader 032cc4d961 [#8] Don't use object id so messages can be grouped 2013-03-14 12:34:20 +00:00
amercader da1dc02c7e [#8] Improve fields returned in the package dict
Make them less uklp specific and more parse friendly. Helper functions
should be used in the UI to format them nicely.
2013-03-08 18:57:30 +00:00
amercader 724ef6ed7c [#8] Fix gemini harvester after change in spatial field 2013-03-08 18:56:03 +00:00
amercader e7f70c4f85 [#8] Fix KeyError in point template 2013-03-05 18:38:55 +00:00
amercader 7c5071bfc2 [#8] Sanitize bbox before creating spatial extra
Some common problems:
* Whitespace, tabs, line feeds and plus signs: should be handled by
  float()
* Text: log error and skip creation of spatial extra
* Same set of 2 coords for extent: create point instead of polygon

Note that the bbox values are stored as they are in the bbox-xx-yy
extras
2013-03-05 18:31:49 +00:00
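
The three cases listed in 7c5071bfc2 can be sketched as below. This is an illustration under stated assumptions, not the harvester's code: the function name `build_extent`, its argument order and the returned GeoJSON-like dicts are all hypothetical.

```python
import logging

log = logging.getLogger(__name__)


def build_extent(west, south, east, north):
    """Hypothetical helper: turn raw bbox strings into a geometry dict,
    or return None when the values cannot be parsed."""
    try:
        # float() tolerates surrounding whitespace, tabs, line feeds
        # and a leading plus sign.
        minx, miny, maxx, maxy = (
            float(value) for value in (west, south, east, north)
        )
    except (TypeError, ValueError):
        # Free text in a coordinate: log it and skip the spatial extra.
        log.error("Invalid bbox values: %r", (west, south, east, north))
        return None

    if minx == maxx and miny == maxy:
        # Both corners are the same: a point, not a polygon.
        return {"type": "Point", "coordinates": [minx, miny]}

    return {
        "type": "Polygon",
        "coordinates": [[
            [minx, miny], [maxx, miny], [maxx, maxy],
            [minx, maxy], [minx, miny],
        ]],
    }


polygon = build_extent(" +1.5\n", "\t50", "2.5 ", "51")
point = build_extent("1.5", "50", "1.5", "50")
skipped = build_extent("unknown", "50", "2.5", "51")
```

The raw strings are left untouched by this helper, in line with the note that the bbox values are stored as they are in the bbox-xx-yy extras.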
amercader d158a6c684 [#8] Change 'Resource locator' string for unnamed resources
'Resource locator' was confusing, has been replaced by 'Unnamed
resource' and made translatable. Also don't set description if not
present, set name.
2013-03-04 17:55:34 +00:00
amercader d43cbb8800 [#8] Improve resource format detection
The 'guess_resource_format' function looks for common patterns in popular
geospatial services and file extensions. It only looks at the provided URL
and does not attempt any remote check. By default, if no pattern matches, it
falls back to the mimetypes module to guess the format.

In the previous version, all resources in documents of type 'service' were
queried to see if they were actually WMS. This is no longer the case,
but services flagged as 'wms' can be verified if the following setting
is set to True: ckanext.spatial.harvest.validate_wms
2013-03-04 17:44:18 +00:00
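
The URL-only detection described in d43cbb8800 can be sketched along these lines. The function name `guess_format_sketch`, the pattern tables and the `use_mimetypes` parameter are assumptions made for illustration; the real 'guess_resource_format' function may use different patterns and a different signature.

```python
import mimetypes

# Hypothetical pattern tables, not the extension's actual lists.
SERVICE_PATTERNS = {
    "wms": ("service=wms", "/wms?"),
    "wfs": ("service=wfs", "/wfs?"),
}
FILE_EXTENSIONS = {
    "kml": (".kml",),
    "gml": (".gml",),
}


def guess_format_sketch(url, use_mimetypes=True):
    """Guess a resource format from the URL alone; no request is made."""
    lowered = url.lower().strip()

    # Service endpoints are recognised by substrings of the URL.
    for fmt, patterns in SERVICE_PATTERNS.items():
        if any(pattern in lowered for pattern in patterns):
            return fmt

    # Files are recognised by their extension.
    for fmt, extensions in FILE_EXTENSIONS.items():
        if lowered.endswith(extensions):
            return fmt

    # Fallback: let the standard library guess a MIME type.
    if use_mimetypes:
        mime_type, _encoding = mimetypes.guess_type(lowered)
        return mime_type
    return None


service = guess_format_sketch(
    "http://example.com/ows?SERVICE=WMS&request=GetCapabilities")
document = guess_format_sketch("http://example.com/report.pdf")
```

Because nothing is fetched, a URL that merely looks like a WMS endpoint is accepted at face value, which is why the commit keeps remote verification behind the separate ckanext.spatial.harvest.validate_wms setting.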
amercader da5b37bc45 [#8] Fix typo in GeoJSON template 2013-03-01 17:42:51 +00:00