Remove CKAN <= 2.8 info from README

pdelboca 2023-02-28 13:01:43 -03:00
parent 9f2f84abd2
commit d1d9a4fef3
1 changed file with 10 additions and 95 deletions

@ -94,14 +94,8 @@ Configuration
Run the following command to create the necessary tables in the database (ensuring the pyenv is activated):
ON CKAN >= 2.9::
(pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester initdb
ON CKAN <= 2.8::
(pyenv) $ paster --plugin=ckanext-harvest harvester initdb --config=/etc/ckan/default/production.ini
Finally, restart CKAN to have the changes take effect::
sudo service apache2 restart
@ -213,7 +207,7 @@ IF you want to set a timeout for harvest jobs, you can add this configuration op
ckan.harvest.timeout = 1440
The timeout value is in minutes, so 1440 represents 24 hours.
Any jobs which are timed out will create an error message for the user to see.
If you don't specify this setting, the default will be False and there will be no timeout on harvest jobs.
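As a rough illustration of the arithmetic only (this is not the extension's own code), the sketch below treats the setting as a number of minutes and checks whether a job has exceeded it. The helper name ``is_timed_out`` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_timed_out(created, timeout_minutes, now=None):
    """Return True if a job created at `created` has run longer than the timeout.

    Hypothetical helper: a falsy timeout (e.g. False, the documented default)
    means jobs never time out.
    """
    if not timeout_minutes:
        return False
    now = now or datetime.now(timezone.utc)
    return now - created > timedelta(minutes=int(timeout_minutes))


# 1440 minutes is exactly 24 hours.
started = datetime(2023, 2, 28, 12, 0, tzinfo=timezone.utc)
print(is_timed_out(started, 1440, now=started + timedelta(hours=25)))  # True
print(is_timed_out(started, False, now=started + timedelta(days=30)))  # False
```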
@ -289,9 +283,9 @@ The following operations can be run from the command line as described underneat
import) without involving the web UI or the queue backends. This is
useful for testing a harvester without having to fire up
gather/fetch_consumer processes, as is done in production.
harvester run-test {source-id/name} force-import=guid1,guid2...
- In order to force an import of particular datasets, useful to
target a dataset for dev purposes or when forcing imports on other environments.
harvester gather-consumer
@ -335,22 +329,17 @@ The following operations can be run from the command line as described underneat
The commands should be run with the pyenv activated and refer to your CKAN configuration file:
ON CKAN >= 2.9::
(pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester --help
(pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester sources
ON CKAN <= 2.8::
(pyenv) $ paster --plugin=ckanext-harvest harvester sources --config=/etc/ckan/default/production.ini
**Note that on CKAN >= 2.9 all commands with an underscore in their name changed.** They now use a hyphen instead of an underscore (e.g. ``gather_consumer`` changed to ``gather-consumer``).
Authorization
=============
Starting from CKAN 2.0, harvest sources behave exactly the same as datasets
Harvest sources behave exactly the same as datasets
(they are actually internally implemented as a dataset type). That means they
can be searched and faceted, and that the same authorization rules can be
applied to them. The default authorization settings are based on organizations.
@ -700,10 +689,10 @@ harvester run-test
You can run a harvester simply using the ``run-test`` command. This is handy
for running a harvest with one command in the console and see all the output
in-line. It runs the gather, fetch and import stages all in the same process.
You must ensure that you have pip installed ``dev-requirements.txt``
in ``/home/ckan/ckan/lib/default/src/ckanext-harvest`` before using the
``run-test`` command.
This is useful for developing a harvester because you can insert break-points
in your harvester, and rerun a harvest without having to restart the
gather_consumer and fetch_consumer processes each time. In addition, because it
@ -727,35 +716,17 @@ handles the gathering and another one that handles the fetching and importing.
To start the consumers run the following command (make sure you have your
python environment activated):
ON CKAN >= 2.9::
(pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester gather-consumer
ON CKAN <= 2.8::
(pyenv) $ paster --plugin=ckanext-harvest harvester gather_consumer --config=/etc/ckan/default/production.ini
On another terminal, run the following command:
ON CKAN >= 2.9::
(pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester fetch-consumer
ON CKAN <= 2.8::
(pyenv) $ paster --plugin=ckanext-harvest harvester fetch_consumer --config=/etc/ckan/default/production.ini
Finally, on a third console, run the following command to start any
pending harvesting jobs:
ON CKAN >= 2.9::
(pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester run
ON CKAN <= 2.8::
(pyenv) $ paster --plugin=ckanext-harvest harvester run --config=/etc/ckan/default/production.ini
The ``run`` command not only starts any pending harvesting jobs, but also
flags those that are finished, allowing new jobs to be created on that particular
source and refreshing the source statistics. That means that you will need to run
@ -771,14 +742,8 @@ circumstance, ensure that the gather & fetch consumers are running and have
nothing more to consume, and then run this abort command with the name or id of
the harvest source:
ON CKAN >= 2.9::
(pyenv) $ ckan --config=/etc/ckan/default/ckan.ini harvester job-abort {source-id/name}
ON CKAN <= 2.8::
(pyenv) $ paster --plugin=ckanext-harvest harvester job_abort {source-id/name} --config=/etc/ckan/default/production.ini
Setting up the harvesters on a production server
================================================
@ -855,42 +820,6 @@ following steps with the one you are using.
startsecs=10
ON CKAN <= 2.8::
; ===============================
; ckan harvester
; ===============================
[program:ckan_gather_consumer]
command=/usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester gather_consumer --config=/etc/ckan/default/production.ini
; user that owns virtual environment.
user=ckan
numprocs=1
stdout_logfile=/var/log/ckan/std/gather_consumer.log
stderr_logfile=/var/log/ckan/std/gather_consumer.log
autostart=true
autorestart=true
startsecs=10
[program:ckan_fetch_consumer]
command=/usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester fetch_consumer --config=/etc/ckan/default/production.ini
; user that owns virtual environment.
user=ckan
numprocs=1
stdout_logfile=/var/log/ckan/std/fetch_consumer.log
stderr_logfile=/var/log/ckan/std/fetch_consumer.log
autostart=true
autorestart=true
startsecs=10
There are a number of things that you will need to replace with your
specific installation settings (the example above shows paths from a
ckan instance installed via Debian packages):
@ -952,16 +881,9 @@ following steps with the one you are using.
Paste this line into your crontab, again replacing the paths to paster and
the ini file with yours:
ON CKAN >= 2.9::
# m h dom mon dow command
*/15 * * * * /usr/lib/ckan/default/bin/ckan -c /etc/ckan/default/ckan.ini harvester run
ON CKAN <= 2.8::
# m h dom mon dow command
*/15 * * * * /usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester run --config=/etc/ckan/default/production.ini
This particular example will check for pending jobs every fifteen minutes.
You can of course modify this periodicity, this `Wikipedia page <http://en.wikipedia.org/wiki/Cron#CRON_expression>`_
has a good overview of the crontab syntax.
@ -973,16 +895,9 @@ following steps with the one you are using.
Paste this line into your crontab, again replacing the paths to paster/ckan and
the ini file with yours:
ON CKAN >= 2.9::
# m h dom mon dow command
0 5 * * * /usr/lib/ckan/default/bin/ckan -c /etc/ckan/default/ckan.ini harvester clean-harvest-log
ON CKAN <= 2.8::
# m h dom mon dow command
0 5 * * * /usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester clean_harvest_log --config=/etc/ckan/default/production.ini
This particular example will perform clean-up each day at 05 AM.
You can tweak the value according to your needs.
@ -992,17 +907,17 @@ Extensible actions
Recipients on harvest jobs notifications
----------------------------------------
:code:`harvest_get_notifications_recipients`: you can *chain* this action from another extension to change
the recipients for harvest jobs notifications.
.. code-block:: python
@toolkit.chained_action
def harvest_get_notifications_recipients(up_func, context, data_dict):
""" Harvester plugin notify by default about harvest jobs only to
admin users of the related organization.
Also allow to add custom recipients with this function.
Return a list of dicts with name and email like
{'name': 'John', 'email': 'john@source.com'} """
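The hunk ends at the docstring. As a minimal sketch of how such a chained action could be completed, assuming the original action returns a list of dicts as the docstring describes: call ``up_func`` to get the default recipients, then append your own. The extra address is a placeholder, and the ``@toolkit.chained_action`` decorator from the snippet above is left out so that the example runs without CKAN installed.

```python
def harvest_get_notifications_recipients(up_func, context, data_dict):
    # Default recipients computed by the action being chained.
    recipients = list(up_func(context, data_dict))
    # Placeholder extra recipient (an assumption for illustration).
    recipients.append({'name': 'Harvest Team', 'email': 'harvest-team@example.com'})
    return recipients


# Stand-in for the original action, used only to exercise the function here.
def fake_up_func(context, data_dict):
    return [{'name': 'John', 'email': 'john@source.com'}]


result = harvest_get_notifications_recipients(fake_up_func, {}, {})
print([r['email'] for r in result])
```

In a real extension the function would keep the decorator and be registered through the plugin's actions, so ``up_func`` would be supplied by CKAN rather than by a stand-in.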
@ -1021,7 +936,7 @@ Tests
You can run the tests like this::
cd ckanext-harvest
nosetests --reset-db --ckan --with-pylons=test-core.ini ckanext/harvest/tests
pytest --ckan-ini=test.ini ckanext/harvest/tests
Here are some common errors and solutions: