[#260] Improve install docs

David Read 2016-07-29 12:05:12 +01:00
parent b737a419dc
commit d0a8cab479
1 changed files with 28 additions and 26 deletions


@ -22,17 +22,19 @@ running a version lower than 2.0.
* `Redis <http://redis.io/>`_ (recommended): To install it, run:: * `Redis <http://redis.io/>`_ (recommended): To install it, run::
sudo apt-get update
sudo apt-get install redis-server sudo apt-get install redis-server
On your CKAN configuration file, add:: On your CKAN configuration file, add in the `[app:main]` section::
ckan.harvest.mq.type = redis ckan.harvest.mq.type = redis
* `RabbitMQ <http://www.rabbitmq.com/>`_: To install it, run:: * `RabbitMQ <http://www.rabbitmq.com/>`_: To install it, run::
sudo apt-get update
sudo apt-get install rabbitmq-server sudo apt-get install rabbitmq-server
On your CKAN configuration file, add:: On your CKAN configuration file, add in the `[app:main]` section::
ckan.harvest.mq.type = amqp ckan.harvest.mq.type = amqp
@ -44,8 +46,9 @@ running a version lower than 2.0.
(pyenv) $ pip install -e git+https://github.com/ckan/ckanext-harvest.git#egg=ckanext-harvest (pyenv) $ pip install -e git+https://github.com/ckan/ckanext-harvest.git#egg=ckanext-harvest
4. Install the python modules required by the extension:: 4. Install the python modules required by the extension (adjusting the path according to where ckanext-harvest was installed in the previous step)::
(pyenv) $ cd /usr/lib/ckan/default/src/ckanext-harvest/
(pyenv) $ pip install -r pip-requirements.txt (pyenv) $ pip install -r pip-requirements.txt
5. Make sure the CKAN configuration ini file contains the harvest main plugin, as 5. Make sure the CKAN configuration ini file contains the harvest main plugin, as
@ -54,14 +57,12 @@ running a version lower than 2.0.
ckan.plugins = harvest ckan_harvester ckan.plugins = harvest ckan_harvester
6. If you haven't done it yet on the previous step, define the backend that you 6. If you haven't done it yet on the previous step, define the backend that you
are using with the ``ckan.harvest.mq.type`` option (it defaults to ``amqp``):: are using with the ``ckan.harvest.mq.type`` option in the `[app:main]` section (it defaults to ``amqp``)::
ckan.harvest.mq.type = redis ckan.harvest.mq.type = redis
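As a rough sketch of where these settings sit, assuming the plugin list from step 5 and the Redis backend with its default hostname, the relevant part of the CKAN ini file might look like this (the comments are illustrative, and any other keys in your file are left out)::

    [app:main]
    # harvest plugins from step 5
    ckan.plugins = harvest ckan_harvester
    # queue backend; amqp is used when this option is not set
    ckan.harvest.mq.type = redis
    # optional, shown here with its default value
    ckan.harvest.mq.hostname = localhost
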
There are a number of configuration options available for the backends. These don't need to There are a number of configuration options available for the backends. These don't need to be modified at all if you are using the default Redis or RabbitMQ install (step 1). However, you may wish to add them with custom values to the `[app:main]` section of the CKAN config file. The list below shows the available options and their default values:
be modified at all if you are using the default Redis or RabbitMQ install (step 1). The list
below shows the available options and their default values:
* Redis: * Redis:
- ``ckan.harvest.mq.hostname`` (localhost) - ``ckan.harvest.mq.hostname`` (localhost)
@ -90,9 +91,9 @@ config option (or ``default``) will be used to namespace the relevant things:
Configuration Configuration
============= =============
Run the following command to create the necessary tables in the database:: Run the following command to create the necessary tables in the database (ensuring the pyenv is activated)::
paster --plugin=ckanext-harvest harvester initdb --config=mysite.ini (pyenv) $ paster --plugin=ckanext-harvest harvester initdb --config=/etc/ckan/default/ckan.ini
Finally, restart CKAN to have the changes take effect: Finally, restart CKAN to have the changes take effect:
@ -100,7 +101,8 @@ Finally, restart CKAN to have the changes take effect:
After installation, the harvest source listing should be available under /harvest, eg: After installation, the harvest source listing should be available under /harvest, eg:
http://localhost:5000/harvest http://localhost/harvest
Database logger configuration(optional) Database logger configuration(optional)
======================================= =======================================
@ -121,7 +123,7 @@ Database logger configuration(optional)
* 6 - plugin * 6 - plugin
* 7 - harvesters * 7 - harvesters
2. Set up the time frame (in days) for the clean-up mechanism with the following config parameter:: 2. Set up the time frame (in days) for the clean-up mechanism with the following config parameter (in the `[app:main]` section)::
ckan.harvest.log_timeframe = 10 ckan.harvest.log_timeframe = 10
@ -168,8 +170,7 @@ e.g. Fetch all logs with log level INFO:
Command line interface Command line interface
====================== ======================
The following operations can be run from the command line using the The following operations can be run from the command line as described underneath::
``paster --plugin=ckanext-harvest harvester`` command::
harvester initdb harvester initdb
- Creates the necessary tables in the database - Creates the necessary tables in the database
@ -255,9 +256,9 @@ The following operations can be run from the command line using the
harvester reindex harvester reindex
- reindexes the harvest source datasets - reindexes the harvest source datasets
The commands should be run with the pyenv activated and refer to your sites configuration file (mysite.ini in this example):: The commands should be run with the pyenv activated and refer to your CKAN configuration file::
paster --plugin=ckanext-harvest harvester sources --config=mysite.ini (pyenv) $ paster --plugin=ckanext-harvest harvester sources --config=/etc/ckan/default/ckan.ini
Authorization Authorization
============= =============
@ -589,16 +590,16 @@ handles the gathering and another one that handles the fetching and importing.
To start the consumers run the following command (make sure you have your To start the consumers run the following command (make sure you have your
python environment activated):: python environment activated)::
paster --plugin=ckanext-harvest harvester gather_consumer --config=mysite.ini (pyenv) $ paster --plugin=ckanext-harvest harvester gather_consumer --config=/etc/ckan/default/ckan.ini
On another terminal, run the following command:: On another terminal, run the following command::
paster --plugin=ckanext-harvest harvester fetch_consumer --config=mysite.ini (pyenv) $ paster --plugin=ckanext-harvest harvester fetch_consumer --config=/etc/ckan/default/ckan.ini
Finally, on a third console, run the following command to start any Finally, on a third console, run the following command to start any
pending harvesting jobs:: pending harvesting jobs::
paster --plugin=ckanext-harvest harvester run --config=mysite.ini (pyenv) $ paster --plugin=ckanext-harvest harvester run --config=/etc/ckan/default/ckan.ini
The ``run`` command not only starts any pending harvesting jobs, but also The ``run`` command not only starts any pending harvesting jobs, but also
flags those that are finished, allowing new jobs to be created on that particular flags those that are finished, allowing new jobs to be created on that particular
@ -615,7 +616,7 @@ circumstance, ensure that the gather & fetch consumers are running and have
nothing more to consume, and then run this abort command with the name or id of nothing more to consume, and then run this abort command with the name or id of
the harvest source:: the harvest source::
paster --plugin=ckanext-harvest harvester job_abort {source-id/name} --config=mysite.ini (pyenv) $ paster --plugin=ckanext-harvest harvester job_abort {source-id/name} --config=/etc/ckan/default/ckan.ini
Setting up the harvesters on a production server Setting up the harvesters on a production server
@ -640,6 +641,7 @@ following steps with the one you are using.
1. Install Supervisor:: 1. Install Supervisor::
sudo apt-get update
sudo apt-get install supervisor sudo apt-get install supervisor
You can check if it is running with this command:: You can check if it is running with this command::
@ -664,7 +666,7 @@ following steps with the one you are using.
[program:ckan_gather_consumer] [program:ckan_gather_consumer]
command=/usr/lib//ckan/default/bin/paster --plugin=ckanext-harvest harvester gather_consumer --config=/etc/ckan/std/std.ini command=/usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester gather_consumer --config=/etc/ckan/default/ckan.ini
; user that owns virtual environment. ; user that owns virtual environment.
user=ckan user=ckan
@ -678,7 +680,7 @@ following steps with the one you are using.
[program:ckan_fetch_consumer] [program:ckan_fetch_consumer]
command=/usr/lib//ckan/default/bin/paster --plugin=ckanext-harvest harvester fetch_consumer --config=/etc/ckan/std/std.ini command=/usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester fetch_consumer --config=/etc/ckan/default/ckan.ini
; user that owns virtual environment. ; user that owns virtual environment.
user=ckan user=ckan
@ -753,7 +755,7 @@ following steps with the one you are using.
the ini file with yours:: the ini file with yours::
# m h dom mon dow command # m h dom mon dow command
*/15 * * * * /usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester run --config=/etc/ckan/std/std.ini */15 * * * * /usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester run --config=/etc/ckan/default/ckan.ini
This particular example will check for pending jobs every fifteen minutes. This particular example will check for pending jobs every fifteen minutes.
You can of course modify this periodicity, this `Wikipedia page <http://en.wikipedia.org/wiki/Cron#CRON_expression>`_ You can of course modify this periodicity, this `Wikipedia page <http://en.wikipedia.org/wiki/Cron#CRON_expression>`_
@ -767,7 +769,7 @@ following steps with the one you are using.
the ini file with yours:: the ini file with yours::
# m h dom mon dow command # m h dom mon dow command
0 5 * * * /usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester clean_harvest_log --config=/etc/ckan/std/std.ini 0 5 * * * /usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester clean_harvest_log --config=/etc/ckan/default/ckan.ini
This particular example will perform clean-up every day at 5 AM. This particular example will perform clean-up every day at 5 AM.
You can tweak the value according to your needs. You can tweak the value according to your needs.
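
For illustration, here are two other schedules in standard cron syntax, assuming the same paster and ini paths as the entries above (adjust both to your install). The first checks for pending jobs every thirty minutes; the second runs the log clean-up at 02:30 every Sunday::

    # m h dom mon dow command
    */30 * * * * /usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester run --config=/etc/ckan/default/ckan.ini
    30 2 * * 0 /usr/lib/ckan/default/bin/paster --plugin=ckanext-harvest harvester clean_harvest_log --config=/etc/ckan/default/ckan.ini
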