added CKAN 2.9 commands for harvesterext to README

Castro0o 2020-10-20 18:08:48 +02:00
parent 8cde93c153
commit 4badc3c0ea
1 changed file with 99 additions and 10 deletions

@ -92,7 +92,13 @@ config option (or ``default``) will be used to namespace the relevant things:
Configuration
=============

Run the following command to create the necessary tables in the database (ensuring the pyenv is activated):

ON CKAN == 2.9::

    (pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester initdb

ON CKAN <= 2.8::

    (pyenv) $ paster --plugin=ckanext-harvest harvester initdb --config=/etc/ckan/default/production.ini
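The two invocation styles above differ only in the executable and in where the config option is placed. A minimal sketch of a wrapper that prints the matching form follows. The function name ``harvester_cmd`` is hypothetical and not part of ckanext-harvest, the caller is assumed to pass the CKAN version explicitly, and the config paths are the example paths used in this README. It only prints the command, it does not run it:

```shell
# Hypothetical helper (not shipped with ckanext-harvest): print the harvester
# command line that matches the given CKAN version.
# Usage: harvester_cmd <ckan-version> <subcommand> [args...]
harvester_cmd() {
    version="$1"
    shift
    case "$version" in
        2.9|2.9.*)
            # CKAN 2.9: the `ckan` CLI takes the config option first.
            echo "ckan --config=/etc/ckan/default/ckan.ini harvester $*"
            ;;
        2.[0-8]|2.[0-8].*)
            # CKAN 2.0 to 2.8: paster takes the config option last.
            echo "paster --plugin=ckanext-harvest harvester $* --config=/etc/ckan/default/production.ini"
            ;;
        *)
            # Versions not covered by this README are rejected.
            echo "unsupported CKAN version: $version" >&2
            return 1
            ;;
    esac
}
```

For example, ``harvester_cmd 2.9 sources`` prints the ``ckan`` form of the ``sources`` command, while ``harvester_cmd 2.8 sources`` prints the ``paster`` form.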
@ -315,7 +321,15 @@ The following operations can be run from the command line as described underneat
    harvester reindex
      - reindexes the harvest source datasets

The commands should be run with the pyenv activated and refer to your CKAN configuration file:

ON CKAN == 2.9::

    (pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester --help
    (pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester sources

ON CKAN <= 2.8::

    (pyenv) $ paster --plugin=ckanext-harvest harvester sources --config=/etc/ckan/default/production.ini
@ -697,16 +711,34 @@ normal method in production systems and scales well.
In this case, the harvesting extension uses two different queues: one that
handles the gathering and another one that handles the fetching and importing.
To start the consumers run the following command (make sure you have your
python environment activated):

ON CKAN == 2.9::

    (pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester gather_consumer

ON CKAN <= 2.8::

    (pyenv) $ paster --plugin=ckanext-harvest harvester gather_consumer --config=/etc/ckan/default/production.ini

On another terminal, run the following command:

ON CKAN == 2.9::

    (pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester fetch_consumer

ON CKAN <= 2.8::

    (pyenv) $ paster --plugin=ckanext-harvest harvester fetch_consumer --config=/etc/ckan/default/production.ini

Finally, on a third console, run the following command to start any
pending harvesting jobs:

ON CKAN == 2.9::

    (pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester run

ON CKAN <= 2.8::

    (pyenv) $ paster --plugin=ckanext-harvest harvester run --config=/etc/ckan/default/production.ini
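For a quick development setup, the three terminals described above could be approximated from a single shell. The following is only a sketch for CKAN 2.9. The function name ``start_harvesting`` and the ``CKAN`` and ``CONFIG`` variables are assumptions introduced here, not part of ckanext-harvest, and the pyenv is assumed to be activated so that ``ckan`` is on the ``PATH``:

```shell
# Sketch only: CKAN and CONFIG are assumptions; override them to match
# your installation.
CKAN="${CKAN:-ckan}"
CONFIG="${CONFIG:-/etc/ckan/default/ckan.ini}"

start_harvesting() {
    # Start the gather consumer in the background and remember its PID.
    "$CKAN" --config="$CONFIG" harvester gather_consumer &
    gather_pid=$!
    # Start the fetch consumer in the background and remember its PID.
    "$CKAN" --config="$CONFIG" harvester fetch_consumer &
    fetch_pid=$!
    # Start any pending harvesting jobs; the consumers above process them.
    "$CKAN" --config="$CONFIG" harvester run
}
```

Both consumers keep running in the background after ``start_harvesting`` returns; they can be stopped with ``kill "$gather_pid" "$fetch_pid"``.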
@ -723,7 +755,13 @@ finished, and therefore you cannot run another job. This is due to particular
harvester not handling errors correctly e.g. during development. In this
circumstance, ensure that the gather & fetch consumers are running and have
nothing more to consume, and then run this abort command with the name or id of
the harvest source:

ON CKAN == 2.9::

    (pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester job_abort {source-id/name}

ON CKAN <= 2.8::

    (pyenv) $ paster --plugin=ckanext-harvest harvester job_abort {source-id/name} --config=/etc/ckan/default/production.ini
@ -766,7 +804,44 @@ following steps with the one you are using.
stored in ``/etc/supervisor/conf.d``.

Create a file named ``/etc/supervisor/conf.d/ckan_harvesting.conf``, and
copy the following contents:

ON CKAN == 2.9::

    ; ===============================
    ; ckan harvester
    ; ===============================

    [program:ckan_gather_consumer]
    command=/usr/lib/ckan/default/bin/ckan --config=/etc/ckan/default/ckan.ini harvester gather_consumer
    ; user that owns virtual environment.
    user=ckan
    numprocs=1
    stdout_logfile=/var/log/ckan/std/gather_consumer.log
    stderr_logfile=/var/log/ckan/std/gather_consumer.log
    autostart=true
    autorestart=true
    startsecs=10

    [program:ckan_fetch_consumer]
    command=/usr/lib/ckan/default/bin/ckan --config=/etc/ckan/default/ckan.ini harvester fetch_consumer
    ; user that owns virtual environment.
    user=ckan
    numprocs=1
    stdout_logfile=/var/log/ckan/std/fetch_consumer.log
    stderr_logfile=/var/log/ckan/std/fetch_consumer.log
    autostart=true
    autorestart=true
    startsecs=10

ON CKAN <= 2.8::

    ; ===============================
@ -861,7 +936,14 @@ following steps with the one you are using.
processes to be run with (`ckan` in our example).

Paste this line into your crontab, again replacing the paths to paster and
the ini file with yours:

ON CKAN == 2.9::

    # m h dom mon dow command
    */15 * * * * /usr/lib/ckan/default/bin/ckan -c /etc/ckan/default/ckan.ini harvester run

ON CKAN <= 2.8::

    # m h dom mon dow command
    */15 * * * * /usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester run --config=/etc/ckan/default/production.ini
@ -874,8 +956,15 @@ following steps with the one you are using.
    sudo crontab -e -u ckan

Paste this line into your crontab, again replacing the paths to paster/ckan and
the ini file with yours:

ON CKAN == 2.9::

    # m h dom mon dow command
    0 5 * * * /usr/lib/ckan/default/bin/ckan -c /etc/ckan/default/ckan.ini harvester clean_harvest_log

ON CKAN <= 2.8::

    # m h dom mon dow command
    0 5 * * * /usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester clean_harvest_log --config=/etc/ckan/default/production.ini