CKAN Docker Compose - Open Data & GIS
Overview • Branch roadmap • Environment: docker • Install CKAN • CKAN images • Extending guide • Applying patches • Addons
Overview
Contains Docker images for the different components of CKAN Cloud and a Docker compose environment (based on ckan) for development and testing Open Data portals.
Warning:
This is a custom installation of Docker Compose with specific extensions for spatial data and GeoDCAT-AP/INSPIRE metadata profiles. For official installations, see the CKAN documentation: Installation.
Available components:
- CKAN custom multi-stage build with spatial capabilities from ckan-docker-spatial [1], an image used as a base and built from the official CKAN repo. The following versions of CKAN are available:
CKAN Version | Type | Docker tag | Notes |
---|---|---|---|
2.9.8 | custom image | ghcr.io/mjanez/ckan-spatial:ckan-2.9.8 | Stable version with CKAN 2.9.8 |
master | custom image | ghcr.io/mjanez/ckan-spatial:master | Latest version. |
The non-CKAN images are as follows:
- PostgreSQL: Custom image based on official PostgreSQL image. Database files are stored in a named volume.
- Solr: CKAN's pre-configured Solr image. The index data is stored in a named volume and has a spatial schema. [2]
- Redis: standard Redis image
- Apache HTTP Server: Custom image based on official latest stable httpd image. Configured to serve multiple routes for the ckan-pycsw CSW endpoint (`{CKAN_SITE_URL}/csw`) and CKAN (`{CKAN_SITE_URL}/catalog`).
- ckan-pycsw: Custom image based on pycsw CKAN harvester ISO19139 for INSPIRE Metadata CSW Endpoint.
Optional HTTP Endpoint (`docker-compose.nginx.yml`):
- NGINX: latest stable nginx image that includes SSL and Non-SSL endpoints instead of Apache HTTP Server. No locations, no ckan-pycsw, only CKAN.
Compose files | Repository | Type | Docker tag | Size | Notes |
---|---|---|---|---|---|
docker-compose.yml / docker-compose.nginx.yml | CKAN 2.9.8 | custom image | mjanez/ckan-spatial:ckan-2.9.8 | 800 MB | Custom Dockerfile: ckan/Dockerfile |
docker-compose.yml / docker-compose.nginx.yml | PostgreSQL 15.2 | base image | postgres/postgres:15-alpine | 89.74 MB | Custom Dockerfile: postgresql/Dockerfile |
docker-compose.yml / docker-compose.nginx.yml | Solr 8.11.1 | custom image | ckan/ckan-solr:2.9-solr8-spatial | 331.1 MB | CKAN's pre-configured spatial Solr image. |
docker-compose.yml / docker-compose.nginx.yml | Redis 7.0.10 | base image | redis/redis:7-alpine | 11.82 MB | - |
docker-compose.yml | Apache HTTP Server 2.4 | custom image | httpd/httpd:2.4 | 54.47 MB | Custom Dockerfile: apache/Dockerfile |
docker-compose.yml | pycsw CKAN harvester ISO19139 | custom image | mjanez/ckan-pycsw:latest | 175 MB | Custom Dockerfile: ckan-pycsw/Dockerfile |
docker-compose.nginx.yml | NGINX 1.22.1 | base image | nginx:stable-alpine | 9.74 MB | No routing, only CKAN. Custom Dockerfile: nginx/Dockerfile |
The site is configured using environment variables that you can set in the `.env` file for an Apache HTTP Server and ckan-pycsw deployment (default `.env.example`), or replace it with the `.env.nginx.example` for an NGINX and CKAN-only deployment using the Docker Compose file: `docker-compose.nginx.yml`.
ckan-docker roadmap
Information about extensions installed in the `main` image. More info is described in Extending the base images.
Note
Switch branches to see the `roadmap` for other projects: ckan-docker/branches
Element | Description | Version | Status | DEV [3] | PRO [4] | Remarks |
---|---|---|---|---|---|---|
Core | CKAN | 2.9.8 | Completed | ✔️ | ✔️ | Stable installation for version 2.9.8 (Production & Dev images) via Docker Compose, based on official images. Initial configuration, basic customisation and operation guide. |
Core + | Datastore | 2.9.8 | Completed | ✔️ | ✔️ | Stable installation (Production & Dev images) via Docker Compose. |
Core + | DataPusher | 0.0.19 | Deprecated | ❌ | ❌ | Updated to xloader, an express loader that quickly loads data into the DataStore. |
Extension | ckanext-xloader | 0.12.2 | Completed | ✔️ | ✔️ | Stable installation, a replacement for DataPusher because it offers ten times the speed and more robustness |
Extension | ckanext-harvest | 1.5.1 | Completed | ✔️ | ✔️ | Stable installation, necessary for the implementation of the Collector (ogc_ckan) |
Extension | ckanext-geoview | 0.0.20 | Completed | ✔️ | ✔️ | Stable installation. |
Extension | ckanext-spatial | 2.0.0 | Completed | ✔️ | ✔️ | Stable installation, necessary for the implementation of the Collector (ogc_ckan) |
Extension | ckanext-dcat | 1.2.0 | Completed | ✔️ | ✔️ | Stable installation, include DCAT-AP 2.1 profile compatible with GeoDCAT-AP. |
Extension | ckanext-scheming | 3.0.0 | WIP | ✔️ | ✔️ | Stable installation. Customised ckanext schema [5] based on the Spanish Metadata Core with the aim of completing the minimum metadata elements included in the current datasets in accordance with GeoDCAT-AP and INSPIRE. |
Extension | ckanext-resourcedictionary | main | Completed | ✔️ | ✔️ | Stable installation. This extension extends the default CKAN Data Dictionary functionality by adding the possibility to create a data dictionary before the actual data is uploaded to the datastore. |
Extension | ckanext-pages | 0.5.1 | Completed | ✔️ | ✔️ | Stable installation. This extension gives you an easy way to add simple pages to CKAN. |
Extension | ckanext-pdfview | 0.0.8 | Completed | ✔️ | ✔️ | Stable installation. This extension provides a view plugin for PDF files using an html object tag. |
Software | ckan-pycsw | latest | Completed | ✔️ | ✔️ | Stable installation. PyCSW Endpoint of Open Data Portal with docker compose config. Harvest the CKAN catalogue in a CSW endpoint based on existing spatial datasets in the open data portal. |
Environment: docker
docker compose vs docker-compose
All Docker Compose commands in this README will use the V2 version of Compose, i.e. `docker compose`. The older version (V1) used the `docker-compose` command. Please see Docker Compose for more information.
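If a helper script needs to work on hosts with either version, the difference can be detected at runtime. The following is only a sketch: the function name `compose_command` and its injectable `which`/`run` parameters are assumptions made for illustration, not part of this repository.

```python
import shutil
import subprocess


def compose_command(which=shutil.which, run=subprocess.run):
    """Return the Compose invocation available on this host, or None."""
    if which("docker"):
        # `docker compose version` exits with 0 only when the V2 plugin is present.
        result = run(["docker", "compose", "version"], capture_output=True)
        if result.returncode == 0:
            return ["docker", "compose"]
    if which("docker-compose"):
        # Fall back to the standalone V1 binary.
        return ["docker-compose"]
    return None


if __name__ == "__main__":
    print(compose_command())
```

The returned list can be used as a prefix for other calls, for example `subprocess.run(compose_command() + ["ps"])`.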
Upgrade docker-engine
To upgrade Docker Engine, first run `sudo apt-get update`, then follow the installation instructions, choosing the new version you want to install.
To verify a successful Docker installation, run `docker run hello-world` and `docker version`. These commands should output versions for client and server.
Docker. Basic commands
Linux post-install steps
These optional post-installation procedures show you how to configure your Linux host machine to work better with Docker, for example by managing Docker as a non-root user.
Configure Docker to start on boot
sudo systemctl enable docker
# To disable this behavior, use disable instead.
sudo systemctl disable docker
Clear all Docker unused objects (images, containers, networks, local volumes)
docker system prune # Clear all unused objects (add --volumes to include volumes)
docker image prune # Clear dangling images (add -a for all unused images)
docker container prune # Clear unused containers
docker volume prune # Clear unused volumes
docker network prune # Clear unused networks
Docker Compose. Basic commands
More info about Docker Compose commands at docker compose reference.
# Basic. All containers or specific container: <container>
## Starts existing containers for a service.
docker compose start <container>
## Restarts existing containers/container for a service.
docker compose restart <container>
## Stops running containers without removing them.
docker compose stop <container>
## Pauses running containers of a service.
docker compose pause <container>
## Unpauses paused containers of a service.
docker compose unpause <container>
# Display the logs of a container. Use --since to retrieve only the last n seconds (or another time span).
docker logs [--since 60s] <container> -f
## Lists containers.
docker compose ps
## Removes stopped service containers.
docker compose rm <container>
# Build.
## Builds, (re)creates, starts, and attaches to containers for a service.
docker compose [-f <docker compose-file>] up
## Build & up all the containers.
docker compose [-f <docker compose-file>] up -d --build
## To avoid using a cache of the previous build while creating a new image.
docker compose [-f <docker compose-file>] build --no-cache
## Build a project with a specific Docker Compose prefix.
docker compose [-f <docker compose-file>] -p <my_project> up -d --build
# Down
# Stops containers and removes the containers and networks created by up (add -v to also remove volumes, --rmi all to also remove images).
docker compose [-p <my_project>] down
Install (build and run) CKAN plus dependencies
Base mode
Use this if you are a maintainer and will not be making code changes to CKAN or to CKAN extensions.
- Clone project
cd /path/to/my/project
git clone https://github.com/mjanez/ckan-docker.git
- Copy the `.env.example` template (or use another from `/samples/`) and modify the resulting `.env` to suit your needs.
cp .env.example .env
  - Apache HTTP Server & CKAN/ckan-pycsw endpoints: Modify the variables about the site URL or locations (`CKAN_SITE_URL`, `CKAN_URL`, `PYCSW_URL`, `CKANEXT__DCAT__BASE_URI`, `APACHE_SERVER_NAME`, `APACHE_CKAN_LOCATION`, `APACHE_PYCSW_LOCATION`, etc.).
  - NGINX only CKAN: Replace the `.env` with the `/samples/.env.nginx.example` and modify the variables as needed.
Note:
When accessing CKAN directly (via a browser), i.e. not going through Apache/NGINX, make sure "ckan" is set up as an alias to localhost in the local hosts file. Otherwise, change the `.env` entry for `CKAN_SITE_URL`.
Warning:
Using the default values in the `.env` file will get you a working CKAN instance. A sysadmin user is created by default with the values defined in `CKAN_SYSADMIN_NAME` and `CKAN_SYSADMIN_PASSWORD` (`ckan_admin` and `test1234` by default). All envvars with `API_TOKEN` are automatically regenerated when CKAN is loaded, so no editing is required. These defaults should obviously be changed before running this setup as a public CKAN instance.
- Build the images:
docker compose build
Note
You can use a deploy in 5 minutes if you just want to test the package.
- Start the containers:
docker compose up
This will start up the containers in the current window. By default the containers will log directly to this window, with each container using a different colour. You can also use the `-d` "detach mode" option, i.e. `docker compose up -d`, if you wish to use the current window for something else.
Note
- Or `docker compose up --build` to build & up the containers.
- Or `docker compose -f docker-compose.nginx.yml up -d --build` to use the NGINX version.
At the end of the container start sequence there should be 6 containers running (or 5 if you use the NGINX Docker Compose file).
After this step, CKAN should be running at `{APACHE_SERVER_NAME}{APACHE_CKAN_LOCATION}` and ckan-pycsw at `{APACHE_SERVER_NAME}{APACHE_PYCSW_LOCATION}`, i.e. http://localhost/catalog and http://localhost/csw.
CONTAINER ID | IMAGE | COMMAND | CREATED | STATUS | PORTS | NAMES |
---|---|---|---|---|---|---|
0217537f717e | ckan-docker-apache | /docker-entrypoint.… | 6 minutes ago | Up 4 minutes | 80/tcp,0.0.0.0:80->80/tcp | apache |
7b06ab2e060a | ckan-docker-ckan | /srv/app/start_ckan… | 6 minutes ago | Up 5 minutes (healthy) | 0.0.0.0:5000->5000/tcp | ckan |
1b8d9789c29a | redis:7-alpine | docker-entrypoint.s… | 6 minutes ago | Up 4 minutes (healthy) | 6379/tcp | redis |
7f162741254d | ckan/ckan-solr:2.9-solr8-spatial | docker-entrypoint.s… | 6 minutes ago | Up 4 minutes (healthy) | 8983/tcp | solr |
2cdd25cea0de | ckan-docker-db | docker-entrypoint.s… | 6 minutes ago | Up 4 minutes (healthy) | 5432/tcp | db |
9cdj25dae6gr | ckan-docker-pycsw | docker-entrypoint.s… | 6 minutes ago | Up 4 minutes (healthy) | 8000/tcp | pycsw |
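To double-check which addresses to open after start-up, the two URLs can be derived from the `.env` values. This is a minimal sketch: the helper names `parse_env` and `expected_urls` are hypothetical, the `http` scheme is an assumption, and the parser only handles plain `KEY=VALUE` lines.

```python
def parse_env(text):
    """Parse KEY=VALUE lines from .env-style text, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def expected_urls(env, scheme="http"):
    """Join the server name with each Apache location (scheme is an assumption)."""
    host = env["APACHE_SERVER_NAME"]
    return {
        "ckan": f"{scheme}://{host}{env['APACHE_CKAN_LOCATION']}",
        "pycsw": f"{scheme}://{host}{env['APACHE_PYCSW_LOCATION']}",
    }


# Sample values matching the defaults mentioned above.
SAMPLE = """
# Apache HTTP Server
APACHE_SERVER_NAME=localhost
APACHE_CKAN_LOCATION=/catalog
APACHE_PYCSW_LOCATION=/csw
"""

if __name__ == "__main__":
    for name, url in expected_urls(parse_env(SAMPLE)).items():
        print(f"{name}: {url}")
```

To run it against a real deployment, pass the contents of your own `.env` file to `parse_env` instead of `SAMPLE`.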
Configure a docker compose service to start on boot
Note
Tested on Debian.
To have Docker Compose run automatically when you reboot a machine, you can follow the steps below:
- Create a systemd service file for Docker Compose. You can create a file named `ckan-docker-compose.service` in the `/etc/systemd/system/` folder with the following content:
[Unit]
Description=CKAN Docker Compose Application Service
Requires=docker.service
After=docker.service
[Service]
User=docker
Group=docker
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/path/to/project/ckan-docker/
ExecStart=/bin/docker compose up -d
ExecStop=/bin/docker compose down
TimeoutStartSec=0
[Install]
WantedBy=multi-user.target
- Replace `/path/to/project/ckan-docker/` with the path where your project's `docker-compose.yml` file is located, and check the path to the docker binary used for start and stop: `/bin/docker`. Also change the `User`/`Group` that runs the service.
- Load the systemd service file with the following command:
sudo systemctl daemon-reload
- Enable the service to start automatically when the machine boots up:
sudo systemctl enable ckan-docker-compose
- You can now start the service with the following command:
sudo systemctl start ckan-docker-compose
- If you want to stop or check the status of the service, use the following commands:
# Stop the service
sudo systemctl stop ckan-docker-compose
# Check the status
sudo systemctl status ckan-docker-compose
Quick mode
If you just want to test the package and see the general functionality of the platform, you can use the `ckan-spatial` image from the Github container registry:
cp .env.example .env
# Edit the envvars in the .env as you like and start the containers.
docker compose -f docker-compose.ghcr.yml up -d --build
It will download the pre-built image and deploy all the containers. Remember to use your own domain by changing `localhost` in the `.env` file.
Development mode
Use this mode if you are making code changes to CKAN and either creating new extensions or making code changes to existing extensions. This mode also uses the `.env` file for config options.
To develop local extensions use the `docker-compose.dev.yml` file:
To build the images:
docker compose -f docker-compose.dev.yml build
To start the containers:
docker compose -f docker-compose.dev.yml up
See CKAN Images for more details of what happens when using development mode.
Create an extension
You can use the ckan extension instructions to create a CKAN extension, only executing the command inside the CKAN container and setting the mounted `src/` folder as output:
docker compose -f docker-compose.dev.yml exec ckan-dev /bin/sh -c "ckan generate extension --output-dir /srv/app/src_extensions"
The new extension files and directories are created in the `/srv/app/src_extensions/` folder in the running container. They will also exist in the local `src/` directory, as the local `src/` directory is mounted as `/srv/app/src_extensions/` on the ckan container. You might need to change the owner of this folder to have the appropriate permissions.
CKAN images
The Docker image config files used to build your CKAN project are located in the `ckan/` folder. There are two Docker files:
- `Dockerfile`: this is based on `ckan/ckan-base-spatial:<version>`, a base image located in the Github Package Registry, that has CKAN installed along with all its dependencies, properly configured and running on uWSGI (production setup).
- `Dockerfile.dev`: this is based on `ckan/ckan-base-spatial:<version>-dev`, also located in the Github Package Registry, and extends `ckan/ckan-base-spatial:<version>` to include:
  - Any extension cloned on the `src` folder will be installed in the CKAN container when booting up Docker Compose (`docker compose up`). This includes installing any requirements listed in a `requirements.txt` (or `pip-requirements.txt`) file and running `python setup.py develop`.
  - CKAN is started running this: `/usr/bin/ckan -c /srv/app/ckan.ini run -H 0.0.0.0`.
  - Make sure to add the local plugins to the `CKAN__PLUGINS` env var in the `.env` file.
CKAN images enhancement
Extending the base images
You can modify the docker files to build your own customized image tailored to your project, installing any extensions and extra requirements needed. For example, here is where you would update to use a different CKAN base image, i.e. `ckan/ckan-base-spatial:<new version>`.
To perform extra initialization steps you can add scripts to your custom images and copy them to the `/docker-entrypoint.d` folder (the folder should be created for you when you build the image). Any `*.sh` and `*.py` file in that folder will be executed just after the main initialization script (`prerun.py`) is executed and just before the web server and supervisor processes are started.
For instance, consider the following custom image:
ckan
├── docker-entrypoint.d
│   └── setup_validation.sh
├── Dockerfile
└── Dockerfile.dev
We want to install an extension like ckanext-validation that needs to create database tables on startup time. We create a `setup_validation.sh` script in a `docker-entrypoint.d` folder with the necessary commands:
#!/bin/bash
# Create DB tables if not there
ckan -c /srv/app/ckan.ini validation init-db
And then in our `Dockerfile.dev` file we install the extension and copy the initialization scripts:
FROM ckan/ckan-base-spatial:2.9.8
RUN pip install -e git+https://github.com/frictionlessdata/ckanext-validation.git#egg=ckanext-validation && \
pip install -r https://raw.githubusercontent.com/frictionlessdata/ckanext-validation/master/requirements.txt
COPY docker-entrypoint.d/* /docker-entrypoint.d/
NB: There are a number of extension examples commented out in the Dockerfile.dev file
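Since `*.py` files in `/docker-entrypoint.d` are executed as well, an initialization step can also be written in Python. The sketch below is purely illustrative: the file name `check_plugins.py` and the `REQUIRED` list are hypothetical, and it only inspects the `CKAN__PLUGINS` env var and prints a warning.

```python
#!/usr/bin/env python3
"""Hypothetical docker-entrypoint.d/check_plugins.py: warn about missing plugins."""
import os
import sys

# Assumption for illustration: plugins this project expects to be enabled.
REQUIRED = ["envvars", "datastore"]


def missing_plugins(plugins_value, required=REQUIRED):
    """Return the required plugins absent from a space-separated plugin list."""
    enabled = set(plugins_value.split())
    return [name for name in required if name not in enabled]


if __name__ == "__main__":
    missing = missing_plugins(os.environ.get("CKAN__PLUGINS", ""))
    if missing:
        # Only warn; how a failing script affects start-up is not covered here.
        print(f"WARNING: missing plugins: {' '.join(missing)}", file=sys.stderr)
```

Such a script would be copied into the image by the same `COPY docker-entrypoint.d/* /docker-entrypoint.d/` line shown above.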
Applying patches
When building your project specific CKAN images (the ones defined in the `ckan/` folder), you can apply patches to CKAN core or any of the built extensions. To do so create a folder inside `ckan/patches` with the name of the package to patch (i.e. `ckan` or `ckanext-??`). Inside you can place patch files that will be applied when building the images. The patches will be applied in alphabetical order, so you can prefix them sequentially if necessary.
For instance, check the following example image folder:
ckan
├── patches
│   ├── ckan
│   │   ├── 01_datasets_per_page.patch
│   │   ├── 02_groups_per_page.patch
│   │   └── 03_or_filters.patch
│   └── ckanext-harvest
│       └── 01_resubmit_objects.patch
├── setup
├── Dockerfile
└── Dockerfile.dev
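To preview the order in which the patches of a folder like this would be picked up, the alphabetical rule can be reproduced locally. This is a sketch only: `patch_order` is a hypothetical helper, it assumes patch files end in `.patch`, and it does not apply anything (the image build does that).

```python
from pathlib import Path
import tempfile


def patch_order(patches_dir):
    """Map each package folder to its *.patch files in alphabetical order."""
    order = {}
    for package_dir in sorted(Path(patches_dir).iterdir()):
        if package_dir.is_dir():
            order[package_dir.name] = sorted(p.name for p in package_dir.glob("*.patch"))
    return order


if __name__ == "__main__":
    # Recreate part of the example layout in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        for rel in (
            "ckan/02_groups_per_page.patch",
            "ckan/01_datasets_per_page.patch",
            "ckanext-harvest/01_resubmit_objects.patch",
        ):
            path = Path(tmp, rel)
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()
        for package, patches in patch_order(tmp).items():
            print(package, patches)
```

Numeric prefixes such as `01_`, `02_` keep the sort order predictable, which is why the example folder uses them.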
Note:
Git diff is a command to output the changes between two sources inside the Git repository. The data sources can be two different branches, commits, files, etc.
- Show changes between working directory and staging area:
git diff > [file.patch]
- Shows any changes between the staging area and the repository:
git diff --staged [file]
ckan-docker addons
VSCode dev containers
The Visual Studio Code Dev Containers extension is a powerful tool that enables developers to use a container as a complete development environment. With this extension, developers can open any folder inside a container and take advantage of the full range of features provided by Visual Studio Code. To do this, developers create a `devcontainer.json` file in their project that specifies how to access or create a development container with a predefined tool and runtime stack. This allows developers to work in an isolated environment, ensuring that the development environment is consistent across team members and that project dependencies are easy to manage.
- Install VSCode.
- Install the Remote Development extension for VSCode.
- In your project directory, create a file named `devcontainer.json`. This file will contain the configuration for your `dev container`.
- In the `devcontainer.json` file, specify the Docker image that you want to use for your `dev container`.
- Specify any additional configuration settings for your `dev container`, such as environment variables, ports to expose, and startup commands.
- Open your project in a `dev container` by using the Remote Development extension in VSCode. You can do this by clicking the `Open Folder in Container` button in the command palette or by opening the folder using the `Remote-Containers: Open Folder in Container` command. You can also attach to an active container with `Attach to Running Container`.
- VSCode will start a new container based on the configuration settings in your `devcontainer.json` file. Once the container is started, you can work on your project just like you would on your local machine.
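As an illustration of the configuration steps above, the following sketch prints a small `devcontainer.json`. The values are assumptions chosen for the example (the container name, the site URL and the post-create command are hypothetical); the image tag and port 5000 are taken from this README.

```python
import json


def devcontainer_config():
    """Build a minimal dev container configuration as a plain dict."""
    return {
        "name": "ckan-dev",  # hypothetical display name
        "image": "ghcr.io/mjanez/ckan-spatial:ckan-2.9.8",
        # Environment variables set inside the container (example value).
        "containerEnv": {"CKAN_SITE_URL": "http://localhost:5000"},
        # Ports made available on the local machine.
        "forwardPorts": [5000],
        # Command run once after the container is created (placeholder).
        "postCreateCommand": "echo 'dev container ready'",
    }


if __name__ == "__main__":
    print(json.dumps(devcontainer_config(), indent=2))
```

Redirect the output to your `devcontainer.json` file and adjust the values to your project before opening the folder in a container.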
pdb
Add these lines to the `ckan-dev` service in the `docker-compose.dev.yml` file
Debug with pdb (example) - Interact with docker attach $(docker container ls -qf name=ckan)
command: python -m pdb /usr/lib/ckan/venv/bin/ckan --config /srv/app/ckan.ini run --host 0.0.0.0 --passthrough-errors
Datastore
The Datastore database and user are created as part of the entrypoint scripts for the db container.
Apache HTTP Server
The default Docker Compose configuration (`docker-compose.yml`) uses an httpd image as the front-end. It has two routes for the ckan (default location: `/catalog`) and ckan-pycsw (default location: `/csw`) services.
Both web locations can be modified in the `.env` file:
...
# Apache HTTP Server
APACHE_VERSION=2.4
APACHE_PORT=80
APACHE_LOG_DIR=/var/log/apache
APACHE_SERVER_NAME=mjanez-cautious-lamp-4pjq9vpg967hq447-80.preview.app.github.dev
# Check CKAN__ROOT_PATH and CKANEXT__DCAT__BASE_URI. If you don't need to use domain locations, it is better to use the nginx configuration. Leave blank or use the root `/`.
APACHE_CKAN_LOCATION=/catalog
APACHE_PYCSW_LOCATION=/csw
...
NGINX
Warning
The nginx docker compose file only deploys the CKAN service, not ckan-pycsw.
The nginx Docker Compose configuration (`docker-compose.nginx.yml`) uses an NGINX image as the front-end (i.e. reverse proxy). It includes HTTPS running on port number 8443 and an HTTP port (81). A "self-signed" SSL certificate is generated beforehand and the server certificate and key files are included. The NGINX `server_name` directive and the `CN` field in the SSL certificate have both been set to 'localhost'. This should obviously not be used for production.
Create the SSL cert and key files as follows:
openssl req -new -newkey rsa:4096 -days 365 -nodes -x509 -subj "/C=DE/ST=Berlin/L=Berlin/O=None/CN=localhost" -keyout ckan-local.key -out ckan-local.crt
The `ckan-local.*` files will then need to be moved into the `nginx/setup/` directory.
envvars
The ckanext-envvars extension is used in the CKAN Docker base repo to build the base images. This extension checks for environmental variables conforming to an expected format and updates the corresponding CKAN config settings with its value.
For the extension to correctly identify which env var keys map to the format used for the config object, env var keys should be formatted in the following way:
- All uppercase
- Replace periods ('.') with two underscores ('__')
- Keys must begin with 'CKAN' or 'CKANEXT'
For example:
CKAN__PLUGINS="envvars image_view text_view recline_view datastore datapusher"
CKAN__DATAPUSHER__CALLBACK_URL_BASE=http://ckan:5000
These parameters can be added to the `.env` file.
For more information please see ckanext-envvars.
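The naming rules above can be expressed as a pair of small conversions. This is a sketch of the convention as described in this section, not the extension's own code; the function names `to_envvar` and `to_config_key` are hypothetical.

```python
def to_envvar(config_key):
    """Convert a CKAN config key such as 'ckan.plugins' to its env var name."""
    # Rule 1 and 2: uppercase everything, replace '.' with '__'.
    name = config_key.upper().replace(".", "__")
    # Rule 3: keys must begin with 'CKAN' or 'CKANEXT'.
    if not name.startswith(("CKAN", "CKANEXT")):
        raise ValueError(f"{config_key!r} does not start with 'ckan' or 'ckanext'")
    return name


def to_config_key(envvar):
    """Reverse the mapping: 'CKAN__PLUGINS' becomes 'ckan.plugins'."""
    return envvar.lower().replace("__", ".")


if __name__ == "__main__":
    for key in ("ckan.plugins", "ckan.datapusher.callback_url_base"):
        print(key, "->", to_envvar(key))
```

Single underscores inside a key segment (as in `callback_url_base`) are left untouched; only the periods between segments become double underscores.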
ckan-pycsw
ckan-pycsw is a docker compose environment (based on pycsw) for development and testing with CKAN Open Data portals. [5]
Available components:
- pycsw: The pycsw app. An OARec and OGC CSW server implementation written in Python.
- ckan2pycsw: Software to achieve interoperability with the open data portals based on CKAN. To do this, ckan2pycsw reads data from an instance using the CKAN API, generates ISO-19115/ISO-19139 metadata using pygeometa, or a custom schema that is based on a customized CKAN schema, and populates a pycsw instance that exposes the metadata using CSW and OAI-PMH.
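To give an idea of the first step, reading data through the CKAN API, here is a minimal sketch that pages through datasets with the `package_search` action. It is not ckan2pycsw's actual code: the helper names, the page size and the injectable `fetch` parameter are assumptions for illustration, and the metadata generation and pycsw loading steps are left out.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def search_url(ckan_url, start=0, rows=100):
    """Build a package_search URL for one page of datasets."""
    query = urlencode({"start": start, "rows": rows})
    return f"{ckan_url.rstrip('/')}/api/3/action/package_search?{query}"


def iter_datasets(ckan_url, rows=100, fetch=None):
    """Yield every dataset dict, requesting one page at a time."""
    # Default fetcher performs a real HTTP GET and decodes the JSON response.
    fetch = fetch or (lambda url: json.load(urlopen(url)))
    start = 0
    while True:
        result = fetch(search_url(ckan_url, start, rows))["result"]
        yield from result["results"]
        start += rows
        # Stop once all reported datasets were seen or a page came back empty.
        if start >= result["count"] or not result["results"]:
            break

# Example (requires a running CKAN; the URL is an assumed local deployment):
#   for dataset in iter_datasets("http://localhost/catalog"):
#       print(dataset["name"])
```

Passing a custom `fetch` callable makes it possible to try the paging logic without a running CKAN instance.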
[1] Official CKAN repo: https://github.com/ckan/ckan-docker-base
[2] Contains fields needed for the ckanext-spatial geo search
[3] Development environment.
[4] Production environment.
[5] ckan_geodcatap, more info: https://github.com/mjanez/ckanext-scheming/pull/1