Commit Graph

838 Commits

Author SHA1 Message Date
Ross Jones fff9800266 Fix print statements to be Py3 friendly
Fixes the print statements so that they use print as a
function, and also switches to .format() rather than %.

Also contains some flake8 whitespace changes as I suspect making this
PEP8ish will take several PRs.
2018-06-08 09:53:13 +01:00
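A minimal sketch of the kind of change this commit describes. The function name and message text are hypothetical and not taken from the actual diff; the old Python 2 form appears only in a comment.

```python
def describe_source(name, count):
    """Build a status line with str.format() instead of the % operator."""
    # Python 2 only form being replaced (illustrative):
    #     print "Source %s has %d jobs" % (name, count)
    return "Source {0} has {1} jobs".format(name, count)


# print() called as a function works on Python 3, and on Python 2 when the
# module starts with `from __future__ import print_function`.
print(describe_source("example-source", 3))
```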
seitenbau-govdata 431d202d4e Add tests and documentation for dataset name suffix config 2018-06-01 21:45:29 +02:00
seitenbau-govdata 938d0322f5 Add config option for dataset name append type 2018-06-01 21:43:37 +02:00
MinhChau 0f8d448864 feat: add groups filter config 2018-04-17 13:22:26 +09:00
Stefan Oderbolz 65e3163015
Merge pull request #315 from maxious/fix-solr-erase-all
Don't delete all solr documents/fail to index harvesters when harvest config blank
2018-03-03 09:55:36 +01:00
Alex Sadleir c89ddc02a3 Don't delete all solr documents when harvest config blank 2018-03-03 19:19:54 +11:00
Stefan Oderbolz 2464dd27b3 Add the CSS classes for FontAwesome 4.x
CKAN 2.7.x upgrades to FontAwesome 4.x, so the CSS classes need to be
updated in order to work. But because this extension is used in older
CKAN versions as well, we simply add the new classes, while keeping the
old ones for backwards compatibility.
2018-02-13 16:26:29 +01:00
Knud Möller 717fdb35dd move _last_error_free_job from CKANHarvester to HarvesterBase 2017-11-10 12:19:25 +01:00
etj d1dd4eb227 303: fix clean_tags with tags dict (fixes requested by review) 2017-11-08 13:46:44 +01:00
etj 41aa9c121f 303: fix clean_tags with tags dict 2017-11-07 14:14:50 +01:00
Denis Laxalde 7bb9a2b5e4 Catch sqlalchemy's DatabaseError in fetch and gather callback
I sometimes see "connection timed out" messages, which are reported as
sqlalchemy.exc.DatabaseError, so catching the latter exception keeps the
harvester from getting stuck in a "limbo" state.

As DatabaseError is a superclass of OperationalError, the latter is
still caught.
2017-11-02 17:20:31 +01:00
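The superclass argument above can be sketched as follows. To keep the example free of dependencies, the two exception classes are stand-ins that mirror the SQLAlchemy relationship (OperationalError deriving from DatabaseError); `run_callback` and the two failing functions are hypothetical helpers, not the extension's real callbacks.

```python
# Stand-ins mirroring sqlalchemy.exc, where OperationalError is a
# subclass of DatabaseError.
class DatabaseError(Exception):
    pass


class OperationalError(DatabaseError):
    pass


def run_callback(callback):
    """Run a callback and report whether a database error was caught."""
    try:
        callback()
    except DatabaseError as exc:
        # Matches DatabaseError and every subclass, including OperationalError.
        return "caught {0}".format(type(exc).__name__)
    return "ok"


def timed_out():
    raise DatabaseError("connection timed out")


def operational_failure():
    raise OperationalError("server closed the connection")


print(run_callback(timed_out))
print(run_callback(operational_failure))
```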
Denis Laxalde cc44d03a41 Drop log.error() call redundant with prior log.exception() call
logging.exception() already logs an ERROR message with exception
information, so there's no need to call both log.exception() and
log.error().

Along the way, make messages uniform in fetch_callback() and
gather_callback().
2017-11-02 17:17:00 +01:00
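A small standard-library sketch of why the second call is redundant: a single log.exception() call already produces one ERROR record carrying the exception information. The logger name, handler class, and message are illustrative assumptions.

```python
import logging


class ListHandler(logging.Handler):
    """Collect emitted records in a list so they can be inspected."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.records = []

    def emit(self, record):
        self.records.append(record)


log = logging.getLogger("harvest_logging_sketch")  # hypothetical logger name
log.setLevel(logging.DEBUG)
log.propagate = False
handler = ListHandler()
log.addHandler(handler)

try:
    raise ValueError("bad message")
except ValueError:
    # One call is enough: the record has level ERROR and carries exc_info.
    log.exception("Could not process message")

print(handler.records[0].levelname)
```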
Adrià Mercader c1661442c9
Merge pull request #298 from ckan/use-api-to-get-dataset-count
Use the package_search API for the count of datasets
2017-11-02 10:31:36 +00:00
Stefan Oderbolz 9bfd5866fb Use package_search to display package count of harvester 2017-11-01 08:24:32 +01:00
Ubuntu 6714d4de58 Move Stop Button 2017-10-26 16:06:48 +00:00
Ubuntu 0d877bf768 Abort Jobs from UI 2017-10-26 14:42:39 +00:00
amercader b589797c47 Merge branch 'master' of github.com:ckan/ckanext-harvest 2017-07-27 16:09:19 +01:00
amercader 881a9581d5 Fix url_for depending tests 2017-07-27 16:09:00 +01:00
etj 207ac81d70 [289] Fix default_extras 2017-05-29 20:11:49 +02:00
David Read cc438786de More explicit checks of the exception thrown when checking harvest config. Also the default_groups test was checking the wrong thing completely. 2017-05-02 20:39:49 +00:00
Adrià Mercader e3b4854b07 Merge pull request #270 from GovDataOfficial/improve-resolving-local-groups
Improve resolving local groups
2017-03-01 12:29:29 +00:00
Mark Gregson dc06f92ec7 Return an empty list when no CKAN datasets are gathered 2017-02-21 12:22:02 +11:00
Adrià Mercader 977589fdd6 Merge pull request #272 from 6aika/master
remove authz as they are not used and they are no longer new
2016-11-23 15:26:45 +02:00
Jari Voutilainen afee362bbc remove authz as they are not used and they are no longer new 2016-11-17 14:14:02 +02:00
rnoerenberg ff1b861f1b Update documentation
Added note with the limit of 1000 harvest sources
2016-11-16 16:02:12 +01:00
David Read e7c03855ca Avoid the "# dont use factory because it looks for the existing source" by copying the SOURCE_DICT each time, rather than letting tests edit the master copy. 2016-11-16 15:40:44 +01:00
seitenbau-govdata 7f76f60ec3 Fixed variable name 2016-11-16 00:34:07 +01:00
seitenbau-govdata d511663038 Sort lists for assert 2016-11-16 00:25:19 +01:00
seitenbau-govdata 95d0c1ca41 Ignoring nonexistent harvest sources in harvest_sources_job_history_clear
Ignoring nonexistent harvest sources in harvest_sources_job_history_clear because the search index may be corrupt.
2016-11-15 23:36:11 +01:00
seitenbau-govdata 0f951d9fc0 Improve resolving local groups
Improve resolving local groups by searching for group additionally by name.
2016-11-15 22:38:27 +01:00
seitenbau-govdata f68bf323f0 Using test class wide unique harvest source url
Using a test-class-wide unique harvest source URL, because objects created in one test are still present in the following tests.
2016-11-15 22:28:37 +01:00
seitenbau-govdata d01a86680e Fix creating different harvest sources
Fix creating different harvest sources. Different harvest sources can't be created with factory.
2016-11-15 21:56:57 +01:00
seitenbau-govdata 096e746c81 Fixed HarvestSourceObj argument 2016-11-15 21:23:20 +01:00
seitenbau-govdata 8d5ff4b4ef Fixed harvest_sources_job_history_clear test
Fixed harvest_sources_job_history_clear test by creating different harvest sources.
2016-11-15 21:09:42 +01:00
rnoerenberg cf1cfcca48 Fixed using property of object 2016-11-15 15:50:03 +01:00
rnoerenberg 1acab98026 Added tests for clearsource history command 2016-11-15 15:37:26 +01:00
seitenbau-govdata af0e1712b9 Changed filter query for reading harvest sources
Changed filter query for reading harvest sources in accordance with the code in /ckanext/harvest/plugin.py.
2016-11-15 15:04:01 +01:00
Raphael Stolt e8570b9e50 Add clearsource history command 2016-11-15 15:04:01 +01:00
Andy Gross f2e1dc512c Change 'redirect' calls to 'h.redirect_to'
ckan.lib.base.redirect was removed in [1]; guidance is to always use h.redirect_to instead. Manifested as 503 errors in the harvest UI against an install of CKAN master branch.

[1] 34f3f18e88
2016-09-18 03:23:29 -07:00
Stefan Oderbolz 4ee772f064 Use toolkit.asbool to parse given boolean parameter 2016-08-31 09:13:36 +02:00
Stefan Oderbolz 8b081e2868 Add new parameter return_last_job_status to harvest_source_list
In order to get a quick overview over successful/failed harvesters, a
call to harvest_source_list with return_last_job_status=true can be used
to get this information.
By default return_last_job_status is False, and hence the extra
resources to grab this information are not wasted for every call, but
only if the client requests it explicitly.
The original 'status' field stays as-is, this PR introduces a new field
called 'last_job_status' to return this information.

The returned information is gathered by a call to
harvest_source_status_show.
2016-08-25 17:56:51 +02:00
Florian Brucker 2602de9094 [#257] Purge only our own Redis data.
Previously purging the queue on the Redis backend would clear the whole
database, making it hard to share the same database with other parts of
CKAN. With this commit, only the keys that belong to ckanext-harvest and
the current CKAN instance are purged.
2016-07-20 16:24:13 +02:00
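The idea of purging only the extension's own keys can be sketched as below. This is an illustration under stated assumptions: a plain dict stands in for the Redis database, and the `ckanext-harvest:<site>:` key prefix layout and the `purge_own_keys` helper are hypothetical rather than the extension's actual naming.

```python
def purge_own_keys(store, prefix):
    """Delete only the keys starting with `prefix`; return how many were removed."""
    own_keys = [key for key in store if key.startswith(prefix)]
    for key in own_keys:
        del store[key]
    return len(own_keys)


# Plain dict standing in for a Redis database shared by several components.
store = {
    "ckanext-harvest:site-a:gather": "job-1",
    "ckanext-harvest:site-a:fetch": "object-7",
    "ckanext-harvest:site-b:gather": "job-9",
    "other-app:session": "abc",
}

removed = purge_own_keys(store, "ckanext-harvest:site-a:")
print(removed, sorted(store))
```

Keys belonging to another CKAN instance or another application survive the purge, in contrast to flushing the whole database.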
David Read 78933fb775 [#253] Fix default_groups by saving the dicts to the config object, since saving it to the harvester object doesn't work in the real world. This is a lot more efficient than doing group_show for every dataset imported. 2016-06-27 12:01:35 +01:00
Jardel Weyrich e8f539a45e Don't let the user specify mutually exclusive configuration options:
- organizations_filter_include
- organizations_filter_exclude
2016-06-14 11:35:38 -03:00
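A check of this kind might look like the sketch below. It assumes the harvest source configuration arrives as a JSON string; `validate_config` and the error text are hypothetical and do not reproduce the extension's actual validation code.

```python
import json


def validate_config(config_str):
    """Reject a JSON harvest config that sets both organization filters."""
    config = json.loads(config_str)
    has_include = "organizations_filter_include" in config
    has_exclude = "organizations_filter_exclude" in config
    if has_include and has_exclude:
        raise ValueError(
            "organizations_filter_include and organizations_filter_exclude "
            "cannot be used together"
        )
    return config


# A config with only one of the two options passes through unchanged.
print(validate_config('{"organizations_filter_include": ["org-a"]}'))
```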
David Read 18a506a112 [#249] Add test for default_extras. 2016-06-10 09:51:17 +00:00
David Read f1742fb51a Fix default_groups. It accepted a list of package_name/ids and was trying to add this to the package, but the package needs a dict. Added test. 2016-06-10 09:16:32 +00:00
David Read bfc9b8e0d9 [#249] Test and fix docs for default_tags. Needed to improve error handling when saving ValidationError in a HOE. 2016-06-09 22:11:03 +00:00
Adrià Mercader aeab60ece6 Merge pull request #247 from keitaroinc/change-db-logger-pagination-param-name
Change db logger pagination param name
2016-06-06 09:45:57 +01:00
Petar Efnushev 92120ca47e Change db logger pagination limit param name 2016-06-06 10:23:57 +02:00
amercader 5e1512f717 Don't reuse contexts on ckan harvester
Reusing the same context on all calls can lead to hard to debug failures
like

Action function organization_show did not call its auth function

In this case that was caused because the first organization/group_show
raised a NotFound so the auth audit was still in the context. When
organization/group_show was called again at the end of
organization/group_create the auth audit exception was raised.

This commit makes sure that each call has its own context.
2016-05-23 12:20:08 +01:00
amercader 16a6e9fbf6 Add tests for group creation during harvesting 2016-05-23 10:20:52 +01:00
amercader 314be8bc33 Merge branch '240-fix-groups-import' of https://github.com/keitaroinc/ckanext-harvest into keitaroinc-240-fix-groups-import 2016-05-23 09:59:32 +01:00
Petar Efnushev c16ecea7f0 reverted change in default groups validation 2016-05-20 20:15:54 +02:00
Petar Efnushev c154365371 Fixed creation/import of groups and organizations when harvesting from remote ckan instance 2016-05-20 16:38:48 +02:00
Adrià Mercader 1ec2af0590 Merge pull request #230 from keitaroinc/logging-module
Logging module
2016-05-17 14:12:03 +01:00
Petar Efnushev cc6cb3e389 Changed default config params for the database logger
Added database logger test case
README updates
2016-05-16 13:15:12 +02:00
Petar Efnushev 0be2c868cb README updates
DBLogHandler updates
Added harvest_log table migration for existing users
Implemented database log scoping
2016-05-11 13:29:53 +02:00
David Read 623fca5f80 New syntax for pysolr connection 2016-05-10 11:14:35 +00:00
Jari Voutilainen 41c329c6d4 Fix syntax error in template 2016-05-09 14:36:21 +03:00
Petar Efnushev 009cc57e09 Added clean-up mechanism for the harvest log 2016-05-06 18:44:02 +02:00
David Read d372f112f0 Convert deprecated helper. 2016-04-27 15:41:17 +00:00
Jari Voutilainen 633a32075e create index to harvest_object 2016-04-27 09:27:12 +03:00
Petar Efnushev 3d519ce0b2 Partial fixes 2016-04-25 19:53:49 +02:00
Petar Efnushev a1968e4c63 Check if harvest_log table is populated on source creation 2016-04-12 19:28:43 +02:00
Petar Efnushev 97cd64b172 Added harvest_log_list get action 2016-04-05 23:53:14 +02:00
Petar Efnushev a79ad2e325 Added basic DBLogHandler 2016-04-05 20:21:04 +02:00
Jari Voutilainen afbf0f0dfe fix unicode encode error in facet filters with Scandinavian alphabets 2016-03-23 10:38:04 +02:00
amercader 15592dab6c Merge branch '227-get-rid-of-action-not-found-error' of https://github.com/smotornyuk/ckanext-harvest into smotornyuk-227-get-rid-of-action-not-found-error 2016-03-21 13:20:42 +00:00
Motornyuk Sergey e1ebde7030 [#227] 'harvest_source_show_status' not found
Added a try block and an attempt to clear the action's cache in case an exception is raised
2016-03-21 14:35:09 +02:00
Denis Laxalde a8732553c9 Replace deprecated nav_named_link by nav_link 2016-03-02 11:44:21 +01:00
amercader 7f506913f8 Merge branch '157-version-three-apify' 2016-02-17 10:08:27 +00:00
amercader 9dfeb154eb [#158] Tone down log message 2016-02-17 10:05:57 +00:00
amercader d8fb2ed7f6 [#220] Simplify check 2016-02-17 09:30:19 +00:00
amercader 566939a655 Merge branch 'master' of https://github.com/LondonAppDev/ckanext-harvest into LondonAppDev-master 2016-02-17 09:28:30 +00:00
David Read 84b0462979 No need to go back twice 2016-02-15 15:36:02 +00:00
David Read 794fc93230 Maintain compatibility with rest-style updates 2016-02-15 15:23:39 +00:00
David Read f22100e6c2 Merge remote-tracking branch 'origin' into 157-version-three-apify 2016-02-15 15:20:33 +00:00
David Read bf0d1fd779 Fix name error 2016-02-15 13:54:58 +00:00
David Read 4516bfe44e PEP8 and lint, extracted from PR158 2016-02-15 13:50:18 +00:00
David Read 49faa0ae6c Tests for CKANHarvester._last_error_free_job 2016-02-15 13:30:28 +00:00
David Read 385b369148 Error-free jobs now include ones where an object was not modified. 2016-02-15 13:16:23 +00:00
David Read f63140354d Fix logic error in previous commit 2016-02-15 12:28:46 +00:00
David Read 52c071dbe9 Improved error handling. e.g. if the site it harvests just returns errors. 2016-02-15 12:10:44 +00:00
David Read 331ad84272 Deal with worry about datasets on the remote CKAN being added/removed during harvest. 2016-02-12 18:00:00 +00:00
David Read 7096b7ddf2 Merge branch 'master' of github.com:ckan/ckanext-harvest into 157-version-three-apify 2016-02-12 16:51:26 +00:00
London App Developer 69ea33647e Update plugin.py
Prevent the 'harvest' datasets being shown in the group and dataset search results.
2016-02-12 13:15:09 +00:00
amercader 6354ad5656 Fix source clean command on CKAN > 2.5, as related don't exist any more 2016-02-04 13:40:02 +00:00
amercader b58cd8b38f Merge branch 'master' into 219-missing-form
Conflicts:
	ckanext/harvest/templates_new/source/edit.html
	ckanext/harvest/templates_new/source/new.html
	test-core.ini
2016-01-14 11:43:12 +00:00
amercader 73196b942b Merge branch '214-remove-genshi' 2016-01-14 11:28:55 +00:00
amercader 5d23fab03f [#219] Support CKAN < 2.3 2016-01-14 11:23:41 +00:00
amercader 5bf0ac9b86 [#219] Fix tests auth 2016-01-14 11:23:17 +00:00
amercader 1665e86065 [#219] Don't use c.form to render the form 2016-01-14 10:16:45 +00:00
amercader 497dfeea02 Add test for missing form 2016-01-14 10:15:46 +00:00
Stefan Oderbolz e0c3316531 Add legacy import for CreateTestData 2016-01-11 22:35:11 +01:00
Stefan Oderbolz c141cf44fa Swap import statements to make sure `run_test` works on CKAN 2.3 2016-01-11 22:13:25 +01:00
David Read 82f48d5afa [#214] Move ckanext/harvest/templates/templates_new to ckanext/harvest/templates 2015-12-11 15:24:45 +00:00
amercader 6b23208b2a Merge branch '212-module-import-error' 2015-12-11 13:38:32 +00:00
David Read d08f72ad13 Fix tests for ckan 2.2 again with amercader's suggestion. 2015-12-11 11:53:51 +00:00
David Read 6ef58addbc Fix tests for ckan 2.2 again with amercader's suggestion. 2015-12-11 11:52:00 +00:00
David Read f0ba0c865c Fix tests for ckan 2.2 2015-12-10 16:36:27 +00:00