Commit Graph

787 Commits

Author SHA1 Message Date
Stefanie Taepke 5f414e2c99 add error-notifications in case of harvest-failure
requires configuration-values:
ckan.harvest.status_mail.errored = true
email_to = %comma separated list of emails to receive the email%
error_email_from = %sender of email%
ckan.site_title = %will be included in the email%
2018-06-11 13:38:38 +02:00
seitenbau-govdata 431d202d4e Add tests and documentation for dataset name suffix config 2018-06-01 21:45:29 +02:00
seitenbau-govdata 938d0322f5 Add config option for dataset name append type 2018-06-01 21:43:37 +02:00
Stefan Oderbolz 65e3163015
Merge pull request #315 from maxious/fix-solr-erase-all
Don't delete all solr documents/fail to index harvesters when harvest config blank
2018-03-03 09:55:36 +01:00
Alex Sadleir c89ddc02a3 Don't delete all solr documents when harvest config blank 2018-03-03 19:19:54 +11:00
Stefan Oderbolz 2464dd27b3 Add the CSS classes for FontAwesome 4.x
CKAN 2.7.x upgrades to FontAwesome 4.x, so the CSS classes need to be
updated in order to work. But because this extension is used in older
CKAN versions as well, we simply add the new classes, while keeping the
old ones for backwards compatibility.
2018-02-13 16:26:29 +01:00
Knud Möller 717fdb35dd move _last_error_free_job from CKANHarvester to HarvesterBase 2017-11-10 12:19:25 +01:00
etj d1dd4eb227 303: fix clean_tags with tags dict (fixes requested by review) 2017-11-08 13:46:44 +01:00
etj 41aa9c121f 303: fix clean_tags with tags dict 2017-11-07 14:14:50 +01:00
Denis Laxalde 7bb9a2b5e4 Catch sqlalchemy's DatabaseError in fetch and gather callback
I sometimes see "connection timed out" messages, which are reported as
sqlalchemy.exc.DatabaseError, so catching the latter exception prevents
the harvester from getting stuck in a "limbo" state.

As DatabaseError is a superclass of OperationalError, the latter will
still be caught.
2017-11-02 17:20:31 +01:00
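The point about the exception hierarchy can be illustrated with a small sketch. The two classes below are stand-ins that only mirror the relationship described in the commit message (the real ones live in sqlalchemy.exc), and run_callback, timed_out and lost_connection are hypothetical names, not the harvester's code.

```python
# Stand-in classes mirroring the hierarchy described in the commit message;
# the real exceptions are sqlalchemy.exc.DatabaseError and OperationalError.
class DatabaseError(Exception):
    pass


class OperationalError(DatabaseError):
    pass


def run_callback(step):
    """Run one callback step and report how it ended instead of raising."""
    try:
        step()
    except DatabaseError as exc:
        # Catching the parent class also catches every subclass.
        return "caught " + type(exc).__name__
    return "ok"


def timed_out():
    raise DatabaseError("connection timed out")


def lost_connection():
    raise OperationalError("server closed the connection")
```

With a single except clause on the parent class, both kinds of failure are handled and the callback can move on rather than leaving the job half-finished.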
Denis Laxalde cc44d03a41 Drop log.error() call redundant with prior log.exception() call
logging.exception() already logs an ERROR message with exception
information, so there's no need to call both log.exception() and
log.error().

Along the way, make messages uniform in fetch_callback() and
gather_callback().
2017-11-02 17:17:00 +01:00
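A minimal standard-library sketch of why the extra log.error() call is redundant. The logger name "harvest_sketch" and the message text are made up for the illustration.

```python
import io
import logging

# Capture log output in memory so it can be inspected.
stream = io.StringIO()
log = logging.getLogger("harvest_sketch")  # hypothetical logger name
log.setLevel(logging.DEBUG)
log.propagate = False
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
log.addHandler(handler)

try:
    raise ValueError("boom")
except ValueError:
    # One call is enough: the record is emitted at level ERROR and the
    # current traceback is appended to it.
    log.exception("Fetch failed")

output = stream.getvalue()
```

The captured output holds one ERROR record followed by the traceback, so a second log.error() with the same message would only duplicate it.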
Adrià Mercader c1661442c9
Merge pull request #298 from ckan/use-api-to-get-dataset-count
Use the package_search API for the count of datasets
2017-11-02 10:31:36 +00:00
Stefan Oderbolz 9bfd5866fb Use package_search to display package count of harvester 2017-11-01 08:24:32 +01:00
Ubuntu 6714d4de58 Move Stop Button 2017-10-26 16:06:48 +00:00
Ubuntu 0d877bf768 Abort Jobs from UI 2017-10-26 14:42:39 +00:00
amercader b589797c47 Merge branch 'master' of github.com:ckan/ckanext-harvest 2017-07-27 16:09:19 +01:00
amercader 881a9581d5 Fix url_for depending tests 2017-07-27 16:09:00 +01:00
etj 207ac81d70 [289] Fix default_extras 2017-05-29 20:11:49 +02:00
David Read cc438786de More explicit checks of the exception thrown when checking harvest config. Also the default_groups test was checking the wrong thing completely. 2017-05-02 20:39:49 +00:00
Adrià Mercader e3b4854b07 Merge pull request #270 from GovDataOfficial/improve-resolving-local-groups
Improve resolving local groups
2017-03-01 12:29:29 +00:00
Mark Gregson dc06f92ec7 Return an empty list when no CKAN datasets are gathered 2017-02-21 12:22:02 +11:00
Adrià Mercader 977589fdd6 Merge pull request #272 from 6aika/master
remove authz as they are not used and they are no longer new
2016-11-23 15:26:45 +02:00
Jari Voutilainen afee362bbc remove authz as they are not used and they are no longer new 2016-11-17 14:14:02 +02:00
rnoerenberg ff1b861f1b Update documentation
Added note with the limit of 1000 harvest sources
2016-11-16 16:02:12 +01:00
David Read e7c03855ca Avoid the "# dont use factory because it looks for the existing source" by copying the SOURCE_DICT each time, rather than letting tests edit the master copy. 2016-11-16 15:40:44 +01:00
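The copy-per-test idea can be sketched as follows. The contents of SOURCE_DICT and the helper new_source_dict are assumptions made for the example, and a deep copy is used here even though the commit message only says "copying".

```python
import copy

# Hypothetical master fixture; the real SOURCE_DICT in the test suite differs.
SOURCE_DICT = {"url": "http://example.com/harvest", "config": {"tags": []}}


def new_source_dict(**overrides):
    """Return an independent copy of the fixture with optional overrides."""
    source = copy.deepcopy(SOURCE_DICT)
    source.update(overrides)
    return source


# One "test" edits its copy, including a nested value...
first = new_source_dict(url="http://example.com/a")
first["config"]["tags"].append("edited")

# ...and the next "test" still starts from the untouched master copy.
second = new_source_dict()
```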
seitenbau-govdata 7f76f60ec3 Fixed variable name 2016-11-16 00:34:07 +01:00
seitenbau-govdata d511663038 Sort lists for assert 2016-11-16 00:25:19 +01:00
seitenbau-govdata 95d0c1ca41 Ignoring nonexistent harvest sources in harvest_sources_job_history_clear
Ignoring nonexistent harvest sources in harvest_sources_job_history_clear, because the search index may be corrupt.
2016-11-15 23:36:11 +01:00
seitenbau-govdata 0f951d9fc0 Improve resolving local groups
Improve resolving local groups by searching for group additionally by name.
2016-11-15 22:38:27 +01:00
seitenbau-govdata f68bf323f0 Using test class wide unique harvest source url
Using test class wide unique harvest source url, because objects created in one test are still present in the following tests.
2016-11-15 22:28:37 +01:00
seitenbau-govdata d01a86680e Fix creating different harvest sources
Fix creating different harvest sources. Different harvest sources can't be created with factory.
2016-11-15 21:56:57 +01:00
seitenbau-govdata 096e746c81 Fixed HarvestSourceObj argument 2016-11-15 21:23:20 +01:00
seitenbau-govdata 8d5ff4b4ef Fixed harvest_sources_job_history_clear test
Fixed harvest_sources_job_history_clear test by creating different harvest sources.
2016-11-15 21:09:42 +01:00
rnoerenberg cf1cfcca48 Fixed using property of object 2016-11-15 15:50:03 +01:00
rnoerenberg 1acab98026 Added tests for clearsource history command 2016-11-15 15:37:26 +01:00
seitenbau-govdata af0e1712b9 Changed filter query for reading harvest sources
Changed filter query for reading harvest sources according to the code in /ckanext/harvest/plugin.py.
2016-11-15 15:04:01 +01:00
Raphael Stolt e8570b9e50 Add clearsource history command 2016-11-15 15:04:01 +01:00
Andy Gross f2e1dc512c Change 'redirect' calls to 'h.redirect_to'
ckan.lib.base.redirect was removed in [1]; the guidance is to always use h.redirect_to instead. This manifested as 503 errors in the harvest UI against an install of the CKAN master branch.

[1] 34f3f18e88
2016-09-18 03:23:29 -07:00
Stefan Oderbolz 4ee772f064 Use toolkit.asbool to parse given boolean parameter 2016-08-31 09:13:36 +02:00
Stefan Oderbolz 8b081e2868 Add new parameter return_last_job_status to harvest_source_list
In order to get a quick overview of successful/failed harvesters, a
call to harvest_source_list with return_last_job_status=true can be used
to get this information.
By default return_last_job_status is False, and hence the extra
resources to grab this information are not wasted for every call, but
only used if the client requests it explicitly.
The original 'status' field stays as-is; this PR introduces a new field
called 'last_job_status' to return this information.

The returned information is gathered by a call to
harvest_source_status_show.
2016-08-25 17:56:51 +02:00
Florian Brucker 2602de9094 [#257] Purge only our own Redis data.
Previously purging the queue on the Redis backend would clear the whole
database, making it hard to share the same database with other parts of
CKAN. With this commit, only the keys that belong to ckanext-harvest and
the current CKAN instance are purged.
2016-07-20 16:24:13 +02:00
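A rough sketch of prefix-scoped purging, using a plain dict in place of a Redis database. The key layout and the function purge_own_keys are hypothetical and are not taken from the extension's code.

```python
def purge_own_keys(store, prefix):
    """Delete only the keys that start with prefix; return how many were removed."""
    own = [key for key in store if key.startswith(prefix)]
    for key in own:
        del store[key]
    return len(own)


# A plain dict stands in for the shared Redis database (assumed key names).
store = {
    "ckanext-harvest:site-a:gather": "job-1",
    "ckanext-harvest:site-a:fetch": "obj-7",
    "ckanext-harvest:site-b:gather": "job-9",
    "other-app:cache": "x",
}
removed = purge_own_keys(store, "ckanext-harvest:site-a:")
```

Keys belonging to another CKAN instance or to an unrelated application survive the purge, which is what makes sharing one database practical.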
David Read 78933fb775 [#253] Fix default_groups by saving the dicts to the config object, since saving it to the harvester object doesn't work in the real world. This is a lot more efficient than doing group_show for every dataset imported. 2016-06-27 12:01:35 +01:00
Jardel Weyrich e8f539a45e Don't let the user specify mutually exclusive configuration options:
- organizations_filter_include
- organizations_filter_exclude
2016-06-14 11:35:38 -03:00
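One way such a check could look, assuming the harvest source configuration arrives as a JSON string. The function name validate_source_config and the error message are illustrative only.

```python
import json


def validate_source_config(config_str):
    """Parse a JSON config string and reject mutually exclusive options."""
    config = json.loads(config_str) if config_str else {}
    exclusive = ("organizations_filter_include", "organizations_filter_exclude")
    if all(option in config for option in exclusive):
        raise ValueError(
            "organizations_filter_include and organizations_filter_exclude "
            "cannot be used together"
        )
    return config
```

Rejecting the combination at validation time means a user finds out when saving the source, not later when a harvest job behaves unexpectedly.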
David Read 18a506a112 [#249] Add test for default_extras. 2016-06-10 09:51:17 +00:00
David Read f1742fb51a Fix default_groups. It accepted a list of package_name/ids and was trying to add this to the package, but the package needs a dict. Added test. 2016-06-10 09:16:32 +00:00
David Read bfc9b8e0d9 [#249] Test and fix docs for default_tags. Needed to improve error handling when saving ValidationError in a HOE. 2016-06-09 22:11:03 +00:00
Adrià Mercader aeab60ece6 Merge pull request #247 from keitaroinc/change-db-logger-pagination-param-name
Change db logger pagination param name
2016-06-06 09:45:57 +01:00
Petar Efnushev 92120ca47e Change db logger pagination limit param name 2016-06-06 10:23:57 +02:00
amercader 5e1512f717 Don't reuse contexts on ckan harvester
Reusing the same context on all calls can lead to hard-to-debug failures
like

Action function organization_show did not call its auth function

In this case that was caused because the first organization/group_show
raised a NotFound so the auth audit was still in the context. When
organization/group_show was called again at the end of
organization/group_create the auth audit exception was raised.

This commit makes sure that each call has its own context.
2016-05-23 12:20:08 +01:00
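The difference between a shared and a per-call context can be sketched with a stand-in action. fake_action and make_context are hypothetical, and the "calls" key merely imitates state that an action may leave behind in its context.

```python
def make_context(user):
    """Build a fresh context dict for a single action call."""
    return {"user": user}


def fake_action(context, data_dict):
    """Stand-in for an action function that leaves state in its context."""
    leftover = context.get("calls", 0)
    context["calls"] = leftover + 1
    # Return whatever state an earlier call left behind.
    return leftover


# Reusing one context: the second call sees what the first one left in it.
shared = make_context("harvest")
reused = [fake_action(shared, {}), fake_action(shared, {})]

# A fresh context per call: every call starts clean.
fresh = [fake_action(make_context("harvest"), {}) for _ in range(2)]
```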
amercader 16a6e9fbf6 Add tests for group creation during harvesting 2016-05-23 10:20:52 +01:00