Commit Graph

576 Commits

Author SHA1 Message Date
David Read 1a680f3fd3 [#158] Fix spaces encoding broken in previous merge. Tested with data.gov.uk. 2015-10-29 17:31:04 +00:00
David Read e2ab9e58e7 Merge remote-tracking branch 'origin/master' into 157-version-three-apify
Conflicts:
	ckanext/harvest/harvesters/ckanharvester.py
2015-10-28 14:34:27 +00:00
David Read 3f74c29c99 Merge branch 'master' into 157-version-three-apify 2015-10-27 17:45:27 +00:00
David Read 55245b5091 [#158] PEP8/formatting. 2015-10-27 17:43:11 +00:00
David Read 2a79873855 [#158] Use package search to get all datasets. Add paging search results. Store pkg_dict from search in the object rather than request it again in fetch_stage. 2015-10-27 17:33:22 +00:00
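A minimal sketch of the paging idea in the commit above, not the harvester's actual code. It assumes a caller-supplied `search` callable (hypothetical) that behaves like CKAN's `package_search` action, taking `start` and `rows` and returning a dict with a `results` list. The name `iter_all_datasets` and the page size are illustrative.

```python
def iter_all_datasets(search, page_size=100):
    """Yield every dataset dict by requesting one page of results at a time.

    `search` is a hypothetical callable standing in for a package_search
    request; it must accept `start` and `rows` keyword arguments.
    """
    start = 0
    while True:
        page = search(start=start, rows=page_size)
        results = page['results']
        if not results:
            # An empty page means the previous page was the last one.
            break
        for pkg_dict in results:
            yield pkg_dict
        start += len(results)
```

Yielding whole dataset dicts mirrors the commit's point that the dict from the search can be kept rather than requested again later.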
amercader 86630adab7 Merge branch 'include-exclude-org' 2015-10-27 15:52:55 +00:00
David Read b56fae8aed Fixes and tests
* Fix extras as a list of dicts
* Fix SOLR dates syntax - needed a Z
* Basic tests for this updated ckan harvester
* Now require CKAN 2.0 to be able to save these packages in package_show form. Take advantage of this now that we are sure various imports are definitely available, such as munge_tag.
* Add back compatibility for other harvesters supplying restful-like package_dicts to _create_or_update_package

TODO add back in the ability to harvest pre 2.0 CKANs with the RESTful calls (fallback or maybe configurable)
2015-10-23 17:30:28 +00:00
amercader 24574f485b Setup harvest model in harvester tests 2015-10-23 15:43:01 +01:00
David Read caeeace8dc Merge branch 'master' into 157-version-three-apify 2015-10-23 14:39:48 +01:00
David Read bc49149d5e Merge branch 'master' into include-exclude-org 2015-10-23 14:36:53 +01:00
amercader 2f4adfb338 Merge branch 'tests' 2015-10-23 13:18:15 +01:00
amercader 3c6cc55be0 Only flush keys on the current Redis database 2015-10-23 11:52:22 +01:00
amercader fdbade465f Merge branch 'master' into purge 2015-10-23 11:33:43 +01:00
amercader d950b13400 Merge branch 'unique-names-improved' 2015-10-23 11:02:49 +01:00
amercader 501edffe2d Merge branch 'master' into migration-states 2015-10-23 10:59:04 +01:00
David Read 3e4a9933ce Remove prints. 2015-10-21 16:52:19 +00:00
David Read dc7af5d150 Remove prints. 2015-10-21 16:38:03 +00:00
David Read eb9aa17862 Include/exclude orgs functionality based on work by memaldi and ross. 2015-10-21 16:33:16 +00:00
David Read f70c16bce7 Add framework for testing harvesters. Modernize existing tests. 2015-10-21 16:26:57 +00:00
David Read d1f84295f8 purge_queues command now has warning about impact of Redis flushall, plus add some (log) output when you run a purge. 2015-10-21 16:12:40 +00:00
David Read 6360681a8f [#105] Fix order of deletes, as agreed with @florianm. 2015-10-12 15:57:27 +01:00
David Read 82bdff2f34 Add tests 2015-10-01 17:59:17 +01:00
David Read be3e88086a Generating unique names improved
* Harvesters that change the name when the title changes have had a
  problem when the change is small and a number was unnecessarily
  appended. e.g. "Trees "->"Trees" meant _gen_new_name("Trees") returned
  "trees1". Now you can specify the existing value and it will return
  that if it still holds.
* Maximum dataset name length is now adhered to.
* To make a name unique, a sequential number is now added, since for
  users that is more understandable and pleasant. However hex digits are
  still an option, for those that want to harvest concurrently.
2015-10-01 17:53:03 +01:00
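The naming rules listed in commit be3e88086a can be sketched roughly as follows. This is an illustration under assumptions, not the harvester's `_gen_new_name`: the function name `gen_unique_name`, the `taken_names` set, the simplified slug rule and the 100-character limit are all hypothetical, and only the sequential-number option is shown.

```python
import re

MAX_NAME_LENGTH = 100  # assumed limit, for illustration only


def gen_unique_name(title, taken_names, existing_name=None,
                    max_length=MAX_NAME_LENGTH):
    """Return a name for `title` that does not clash with `taken_names`."""
    # Simplified slug: lowercase, runs of other characters become '-'.
    ideal = re.sub(r'[^a-z0-9]+', '-', title.strip().lower()).strip('-')
    ideal = ideal[:max_length]

    # Keep the existing name if it still holds, i.e. it is the ideal name,
    # optionally followed by a number that was appended earlier.
    if existing_name is not None and re.fullmatch(
            re.escape(ideal) + r'\d*', existing_name):
        return existing_name

    if ideal not in taken_names:
        return ideal

    # Append a sequential number, trimming so the limit is respected.
    counter = 1
    while True:
        suffix = str(counter)
        candidate = ideal[:max_length - len(suffix)] + suffix
        if candidate not in taken_names:
            return candidate
        counter += 1
```

With this sketch, `gen_unique_name("Trees", {"trees"}, existing_name="trees")` keeps `"trees"`, whereas omitting `existing_name` gives `"trees1"`, which is the situation the commit describes.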
Ross Jones 6dd40bfcf9 Changes the gather state to use v3 API
Rather than using the revisions in v2 API this now uses the
package_search API so that we can extend it with proper filters in
future.
2015-09-10 18:53:16 +01:00
Florian Mayer a6cdda0a14 set max version to 2.4.99 2015-08-19 08:41:42 +00:00
florianm 1905caa961 upgrade harvest_source_clear to not delete from authz models removed in migration 078 2015-08-19 10:25:20 +08:00
David Read 46f7b32b04 Merge branch 'master' of github.com:okfn/ckanext-harvest into migration-states 2015-07-22 10:13:55 +01:00
David Read 2da918c2e4 Fix migration for old harvests so that ones that errored are correctly marked. Added helpful comments in model. 2015-07-22 10:13:02 +01:00
Stefan Oderbolz ab76830e85 [#145] Throw + catch a custom exception if there are no jobs to run
If there are no harvesting jobs to run, there was always an ugly
exception message when using the paster command. This replaces the ugly
output with a proper message and uses a custom exception to allow others
to deal with this error differently.
2015-07-20 18:41:50 +02:00
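The pattern described in commit ab76830e85 might look something like this. It is a sketch only: the exception name `NoJobsToRunError`, the functions and the message text are hypothetical and are not taken from the extension's code.

```python
class NoJobsToRunError(Exception):
    """Hypothetical exception signalling that no harvest jobs are pending."""


def run_pending_jobs(jobs):
    """Run each job callable, raising a dedicated error if there are none."""
    if not jobs:
        raise NoJobsToRunError('There are no new harvesting jobs to run')
    return [job() for job in jobs]


def run_command(jobs):
    """Command-line style wrapper: turn the error into a readable message."""
    try:
        results = run_pending_jobs(jobs)
    except NoJobsToRunError as e:
        return str(e)
    return 'Ran {0} job(s)'.format(len(results))
```

Because the error has its own type, other callers of `run_pending_jobs` can catch it and react differently instead of printing a message.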
Stefan Oderbolz 83dd0b4b68 [#138] Add data attributes to support timezone conversion 2015-07-09 22:35:54 +02:00
Stefan Oderbolz 4dc2f7367d [#139] Delete package relationships when clearing a harvest source 2015-06-26 17:20:23 +02:00
amercader 88d9ba0397 [#136] Fix broken RabbitMQ queue names
The harvester command was still using the old ones.
Use specific ones for testing.
2015-06-11 13:56:22 +01:00
amercader 673dfc9882 [#127] Use site user on the CKAN harvester
Add missing call
2015-06-11 10:38:33 +01:00
amercader d3a3f09ad1 [#127] Use site user on the CKAN harvester
To avoid having to create a 'harvest' sysadmin explicitly. It will still
be used if present, but if not the site user will be used. You can also
define the user to use via a config option.
2015-06-11 10:19:07 +01:00
amercader b17c3269b5 Merge branch 'clear-command' of https://github.com/metaodi/ckanext-harvest into metaodi-clear-command 2015-06-10 15:32:37 +01:00
Stefan Oderbolz 64ff0f3a3a Use single quotes to be consistent 2015-06-10 16:22:04 +02:00
Stefan Oderbolz 2a2d85f60c Wording changes for clearsource and rmsource 2015-06-10 16:19:23 +02:00
joetsoi 92b93c53fc add some translation strings 2015-06-10 12:14:20 +01:00
Stefan Oderbolz 8ebb843052 Add documentation for clearsource command 2015-06-10 11:29:24 +02:00
Stefan Oderbolz 61bc150ae6 Expose clear harvester source as a paster command 2015-06-10 11:19:10 +02:00
amercader 9f8aae3a18 Append site id to queue name
This allows multiple CKAN sites to share the same RabbitMQ exchange
(For the Redis backend this is handled via different Redis databases)
2015-06-01 17:54:22 +01:00
amercader 3e21ea4f82 Fix tests, set up Travis
TODO: sort out the tests properly, avoiding imports from the legacy ones
2015-04-07 13:31:45 +01:00
amercader f72d6da521 Change toolkit import
Apparently on package installs this is not well supported

from ckan.plugins.toolkit import check_ckan_version

But this works:

from ckan.plugins import toolkit

toolkit.check_ckan_version(...
2015-03-19 12:48:46 +00:00
amercader 7a20e93716 Raise on startup import errors so we don't mask problems
Otherwise if there was eg an actual ImportError we just got

2015-03-19 12:30:08,430 DEBUG [ckanext.harvest.plugin] No auth module
for action "update"

on the log
2015-03-19 12:48:15 +00:00
David Read d6e9b80496 Merge pull request #118 from clementmouchet/114-remove_resource_groups
Removed ResourceGroup from query when using CKAN 2.3 or above
2015-02-24 09:56:44 +00:00
clementmouchet ead9e67a33 updated def harvest_source_clear() to delete resource views, resource revisions & resources in CKAN >= 2.3 2015-02-23 17:02:21 +00:00
David Read b3ed6cae5a Merge pull request #121 from metaodi/120-create-remote-orgs
Fetch remote organization via action api
2015-01-15 10:49:09 +00:00
Stefan Oderbolz c1bcee9684 Use str() to get the error message 2015-01-15 11:36:15 +01:00
Stefan Oderbolz 191c39ce5c Catch the more general URLError instead of HTTPError
HTTPError is a subclass of URLError, so catching URLError is enough. I
think the HTTP error code is not as important in this situation, so
catching the more generic error seems like the best solution.
2015-01-15 10:57:24 +01:00
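The reasoning in commit 191c39ce5c can be checked with a short sketch. It uses Python 3's `urllib.error`, which may differ from the module the harvester imported at the time. `ContentFetchError` is named in the commit log, but its definition here, the `fetch_content` function and the injected `opener` callable are hypothetical stand-ins.

```python
from urllib.error import HTTPError, URLError


class ContentFetchError(Exception):
    """Stand-in definition; the real class lives in the extension."""


def fetch_content(url, opener):
    """Call `opener(url)` and wrap any URL-level failure in one error type.

    `opener` is a hypothetical callable, e.g. a thin wrapper around urlopen.
    """
    try:
        return opener(url)
    except URLError as e:
        # HTTPError subclasses URLError, so this clause handles both.
        # str() gives the error message, as in commit c1bcee9684.
        raise ContentFetchError(str(e))
```

Catching only the base class keeps a single handler, at the cost of no longer branching on the HTTP status code, which is the trade-off the commit message accepts.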
Stefan Oderbolz b978c26e70 Use ContentFetchError instead of generic Exception 2015-01-15 00:49:11 +01:00