Ken Tsang
93a839efce
Use == rather than `is` for string comparison
2019-04-12 11:31:04 +01:00
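Note on the change above: `is` tests object identity, while `==` tests value equality, so two equal strings can still fail an `is` check. A minimal sketch, assuming a hypothetical `is_finished` helper and status value (neither is taken from the repository):

```python
def is_finished(status):
    # Compare by value: equal strings are not guaranteed to be the same object.
    return status == "Finished"


# Build an equal string at runtime so it need not share identity with a literal.
runtime_status = "".join(["Fin", "ished"])
finished = is_finished(runtime_status)
```

Whether `runtime_status is "Finished"` would be true depends on interpreter-level string interning, which is why the value comparison is the safe choice.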
Gavin Cannizzaro
9f8b98e54f
Use Redis password from configuration when present.
2018-07-12 10:14:24 -04:00
Denis Laxalde
7bb9a2b5e4
Catch sqlalchemy's DatabaseError in fetch and gather callback
...
I sometimes see "connection timed out" messages, which are reported as
sqlalchemy.exc.DatabaseError; catching that exception keeps the
harvester from getting stuck in a "limbo" state.
As DatabaseError is a superclass of OperationalError, the latter is
still caught.
2017-11-02 17:20:31 +01:00
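Note on the change above: catching a base exception class also catches its subclasses. The sketch below uses local stand-in classes that only mirror the relationship described in the commit message (they are not the real `sqlalchemy.exc` classes), and `run_callback` is a hypothetical helper:

```python
class DatabaseError(Exception):
    """Stand-in for sqlalchemy.exc.DatabaseError (assumption for illustration)."""


class OperationalError(DatabaseError):
    """Stand-in for sqlalchemy.exc.OperationalError (assumption for illustration)."""


def run_callback(work):
    # Catching the superclass covers both exception types.
    try:
        work()
    except DatabaseError as exc:
        return "recovered from %s" % type(exc).__name__
    return "ok"


def raise_operational():
    raise OperationalError("connection timed out")


def raise_database():
    raise DatabaseError("connection timed out")
```

Both `run_callback(raise_operational)` and `run_callback(raise_database)` end up in the same `except` branch.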
Denis Laxalde
cc44d03a41
Drop log.error() call redundant with prior log.exception() call
...
logging.exception() already logs an ERROR message with exception
information, so there's no need to call both log.exception() and
log.error().
Along the way, make messages uniform in fetch_callback() and
gather_callback().
2017-11-02 17:17:00 +01:00
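Note on the change above: `Logger.exception()` emits a record at ERROR level and appends the current traceback, so a following `log.error()` with the same message only duplicates it. A small sketch using the standard library (the logger name and message are made up for the example):

```python
import io
import logging

# Capture log output in memory so it can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

log = logging.getLogger("harvest_sketch")  # hypothetical logger name
log.addHandler(handler)
log.setLevel(logging.DEBUG)
log.propagate = False

try:
    raise ValueError("boom")
except ValueError:
    # One call: ERROR-level message plus the traceback of the active exception.
    log.exception("Gather stage failed")

output = stream.getvalue()
```

The captured `output` starts with the ERROR line and is followed by the traceback text.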
Florian Brucker
2602de9094
[ #257 ] Purge only our own Redis data.
...
Previously purging the queue on the Redis backend would clear the whole
database, making it hard to share the same database with other parts of
CKAN. With this commit, only the keys that belong to ckanext-harvest and
the current CKAN instance are purged.
2016-07-20 16:24:13 +02:00
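Note on the change above: the idea can be sketched as deleting only the keys that match a prefix instead of flushing the database. The key layout, `purge_own_keys`, and the in-memory `FakeRedis` stand-in are all assumptions for illustration; the stand-in only mimics the `keys(pattern)` and `delete(*names)` calls of a redis-py client.

```python
import fnmatch


class FakeRedis:
    """In-memory stand-in offering the two client methods used below."""

    def __init__(self, data):
        self.data = dict(data)

    def keys(self, pattern="*"):
        return [k for k in self.data if fnmatch.fnmatchcase(k, pattern)]

    def delete(self, *names):
        removed = 0
        for name in names:
            if name in self.data:
                del self.data[name]
                removed += 1
        return removed


def purge_own_keys(client, site_id):
    # Hypothetical key layout: "<extension>:<site id>:<name>".
    prefix = "ckanext-harvest:%s:" % site_id
    keys = client.keys(prefix + "*")
    # delete() needs at least one name, so skip the call when nothing matches.
    return client.delete(*keys) if keys else 0
```

A site id containing glob characters such as `*` or `[` would need escaping before being used in the pattern; the sketch does not handle that.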
David Read
b0780b2062
Fetch stage can also return "unchanged", same as the import stage. Used by DGU. It is useful to skip an object like this, to avoid saving the fetched content in a HarvestObject (saves disk usage).
2015-12-01 17:38:57 +00:00
amercader
f1ba2bcfb3
Namespace Redis keys to avoid conflicts between instances
...
The `ckan.site_id` config option (or `default` if missing) is used to
namespace the Redis keys: routing key and persistence key. Consumers
will only get the relevant keys for their instance.
2015-11-20 14:17:25 +00:00
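Note on the change above: a minimal sketch of building a namespaced key from the `ckan.site_id` option with a `default` fallback. The helper name, the `:` separator, and the example key name are assumptions, not the extension's actual key format.

```python
def namespaced_key(config, name):
    # Fall back to "default" when ckan.site_id is not configured.
    site_id = config.get("ckan.site_id", "default")
    # Hypothetical layout: "<site id>:<name>".
    return "%s:%s" % (site_id, name)


key_a = namespaced_key({"ckan.site_id": "site_a"}, "harvest_object_id")
key_default = namespaced_key({}, "harvest_object_id")
```

Two instances with different site ids produce different keys for the same name, so their consumers do not pick up each other's messages.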
amercader
920df684ae
Merge branch 'db-error'
2015-11-20 12:29:37 +00:00
amercader
ede50aa3fb
Merge branch 'immediate-harvest'
2015-11-20 12:28:35 +00:00
David Read
59be6e2c71
Merge branch 'master' into db-error
...
Conflicts:
ckanext/harvest/queue.py
2015-11-03 00:57:14 +00:00
David Read
8a7bc9e1d8
Merge remote-tracking branch 'origin/master' into immediate-harvest
...
Conflicts:
README.rst
ckanext/harvest/commands/harvester.py
ckanext/harvest/logic/action/create.py
ckanext/harvest/logic/action/update.py
ckanext/harvest/logic/auth/update.py
2015-11-03 00:40:25 +00:00
David Read
e59760fefe
Merge branch 'job-reporting-fixes' of https://github.com/yhteentoimivuuspalvelut/ckanext-harvest into yhteentoimivuuspalvelut-job-reporting-fixes
2015-11-02 21:25:32 +00:00
David Read
f1d2d5fdc4
[ #111 ] Run jobs straight away.
2015-10-28 21:58:36 +00:00
David Read
421e6da660
Add run_test, job_abort, source commands
...
* run_test - for running a whole harvest on the command-line
* job_abort - for aborting a limbo job
* source - for showing a single harvest source
* allowing a source to be specified by name in several commands
2015-10-28 17:51:58 +00:00
David Read
0c0a996b85
Merge branch 'master' into db-error
...
Conflicts:
ckanext/harvest/queue.py
2015-10-23 13:33:44 +01:00
amercader
2f4adfb338
Merge branch 'tests'
2015-10-23 13:18:15 +01:00
amercader
3c6cc55be0
Only flush keys on the current Redis database
2015-10-23 11:52:22 +01:00
amercader
fdbade465f
Merge branch 'master' into purge
2015-10-23 11:33:43 +01:00
David Read
f70c16bce7
Add framework for testing harvesters. Modernize existing tests.
2015-10-21 16:26:57 +00:00
David Read
d1f84295f8
purge_queues command now has warning about impact of Redis flushall, plus add some (log) output when you run a purge.
2015-10-21 16:12:40 +00:00
David Read
1a6dca7c00
[ #148 ] Catch a more specific exception.
2015-10-01 12:30:40 +01:00
David Read
de17e0ae8c
Catch, record and recover from temporary db problems.
2015-07-22 10:25:11 +01:00
David Read
46f7b32b04
Merge branch 'master' of github.com:okfn/ckanext-harvest into migration-states
2015-07-22 10:13:55 +01:00
David Read
2da918c2e4
Fix migration for old harvests so that ones that errored are correctly marked. Added helpful comments in model.
2015-07-22 10:13:02 +01:00
amercader
9f8aae3a18
Append site id to queue name
...
This allows multiple CKAN sites to share the same RabbitMQ exchange
(For the Redis backend this is handled via different Redis databases)
2015-06-01 17:54:22 +01:00
Jari Voutilainen
859133fe36
move detecting unchanged datasets to ckanharvester and queue.py
2015-03-10 14:48:41 +02:00
Jari Voutilainen
97f09913cf
fix job reporting all datasets deleted when actually nothing changed during last two harvests
2014-09-10 09:22:44 +03:00
amercader
55d2b4e304
Fix purge command
2013-10-16 12:59:23 +01:00
amercader
f89f12203c
Merge branch 'fix/rename-ampq-to-amqp' of git://github.com/opendatatrentino/ckanext-harvest into opendatatrentino-fix/rename-ampq-to-amqp
2013-10-04 17:24:53 +01:00
Samuele Santi
611b9aab6d
Fixed typo: ampq -> amqp
2013-09-19 11:43:03 +02:00
amercader
cc3f3d3426
[ #50 ] Fix objects deletion on gather exceptions
2013-07-05 13:29:11 +01:00
amercader
e2696b98bb
[ #50 ] Save all dates as UTC in the database
...
At some point we may want to transform these to local time at the
dictization level. We will need a library like dateutil to handle it
properly though.
2013-07-04 14:59:27 +01:00
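Note on the change above: one way to normalise timestamps to UTC before storing them, using only the standard library. The helper name and the choice to store naive UTC values are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_utc_naive(dt):
    # Naive values are assumed to be UTC already and are returned unchanged.
    if dt.tzinfo is None:
        return dt
    # Convert aware values to UTC, then drop the tzinfo for storage.
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


local = datetime(2013, 7, 4, 14, 59, 27, tzinfo=timezone(timedelta(hours=1)))
stored = to_utc_naive(local)
```

Converting back to local time for display would be a separate step at the presentation layer, as the commit message suggests.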
amercader
9041f3f3ad
Changes in Redis consumer to make tests work
2013-04-22 18:08:19 +01:00
kindly
dcfd201cdd
[ #32 ] redis queue support
2013-04-21 17:04:57 +01:00
kindly
0ce59a29b6
delete instead of update harvest objects when error
2013-04-12 12:32:33 +01:00
kindly
7d7657f94a
mark gather phase as finished if there is an error
2013-04-12 10:35:08 +01:00
kindly
0b5c3c608a
catch and raise gather exception, acking the message
2013-03-25 11:57:57 +00:00
kindly
634a0bbd30
return instead of continue
2013-03-19 01:21:20 +00:00
kindly
3adf38105e
re-add code from old branch separating the fetch and import logic
2013-03-19 01:16:43 +00:00
amercader
d77f16aba9
[ #21 ] Improve gather stage error handling
...
See issue for full details. Basically we don't want to catch any
exception at the queue.py level, as they prevent debugging. Harvesters
should deal with them and return a list of ids or an empty list if no
objects need to be fetched.
Also improved the debug messages.
2013-03-14 17:31:07 +00:00
amercader
5c17a525c1
Refresh session after each harvest stage
...
Otherwise, e.g., the source config got cached and you needed to restart
the consumers to refresh it.
2013-03-01 12:55:59 +00:00
kindly
ebe246fe99
make report emit added so shows up on front end
2013-02-22 17:32:33 +00:00
kindly
acb17ff3b0
capture errors more cleanly
2013-01-10 10:48:48 +00:00
kindly
36389e7ce0
make sure gather phase finishes job if there is a severe error
2012-12-24 12:21:21 +00:00
kindly
6b42d96fe0
add report_status field
2012-12-17 23:50:26 +00:00
amercader
0dde483992
Set job status to Finished when actually finishing it
...
Until now, harvest jobs were set to Finished just after sending all
objects to the fetch stage. Now every time the run command is run, jobs
are set to Running, and all previous Running jobs are checked to see if
all harvest objects have a state of Complete or Error. Only then is the
job flagged as Finished.
2012-12-13 18:19:22 +00:00
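Note on the change above: the completion check can be sketched as "every harvest object is in a terminal state". The function name and the upper-case state strings are assumptions for illustration.

```python
TERMINAL_STATES = ("COMPLETE", "ERROR")


def job_is_finished(object_states):
    # A job is finished only when no object is still waiting or in progress.
    return all(state in TERMINAL_STATES for state in object_states)


still_running = job_is_finished(["COMPLETE", "WAITING", "ERROR"])
done = job_is_finished(["COMPLETE", "ERROR", "COMPLETE"])
```

Note that `all()` over an empty list is `True`, so in this sketch a job with no objects counts as finished; real code may want to treat that case separately.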
amercader
37efb3b978
Set harvest object state depending on the output of import_stage
...
Either to COMPLETE or ERROR, depending on whether it returns True or
False.
2012-12-13 14:30:13 +00:00
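Note on the change above: a minimal sketch of mapping the boolean result of an import stage to an object state. The function name is hypothetical, and only the True/False case described in this commit is covered.

```python
def state_after_import(import_result):
    # True marks the object as done; False marks it as failed.
    return "COMPLETE" if import_result is True else "ERROR"
```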
amercader
03fd1884f4
Implement retry times for harvest objects
2012-11-15 18:11:35 +00:00
kindly
c9c1eb4848
use generator to consume
2012-11-15 14:14:55 +00:00
amercader
33d5e09722
Change fetch_callback to properly acknowledge objects
2012-11-15 11:36:06 +00:00