CKAN extension to integrate Google Analytics data into CKAN. Gives download stats on package pages, list of most popular packages, etc.

CKAN Google Analytics Extension

Status: Production

CKAN Version: 1.5.*

Overview

A CKAN extension for doing things with Google Analytics:

  • It puts the relevant tracking code in your templates for you (including tracking code for external resource download links)
  • It provides a page showing top packages and resources
  • It inserts download stats onto individual package pages

Installation

  1. Install the extension as usual, e.g. (from an activated virtualenv):

    $ pip install -e  git+https://github.com/okfn/ckanext-googleanalytics.git#egg=ckanext-googleanalytics
  2. Edit your development.ini (or similar) to provide these necessary parameters:

    googleanalytics.id = UA-1010101-1
    googleanalytics.username = googleaccount@gmail.com
    googleanalytics.password = googlepassword

    Note that your password will probably be readable by other people, so you may want to set up a new Gmail account specifically for accessing your Google Analytics profile.

  3. Run the following command from src/ckanext-googleanalytics to set up the required database tables (of course, altering the --config option to point to your site config file):

    paster initdb --config=../ckan/development.ini
  4. Edit your configuration ini file again to activate the extension:

    ckan.plugins = googleanalytics

    (If other plugins are already activated, add this one to the list. Plugins are separated by spaces.)

    Finally, there are some optional configuration settings (shown here with their default settings):

    googleanalytics.show_downloads = true
    googleanalytics.resource_prefix = /downloads/
    googleanalytics.domain = auto

    If show_downloads is set to true, a download count for resources will be displayed on individual package pages.

    resource_prefix is an arbitrary identifier so that we can query for downloads in Google Analytics. It can theoretically be any string, but should ideally resemble a URL path segment, to make filtering for all resources easier in the Google Analytics web interface.

    domain allows you to specify a domain against which Analytics will track users. You will usually want to leave this as auto; if you are tracking users from multiple subdomains, you might want to specify something like .mydomain.com. See Google's documentation for more info.

  5. Restart CKAN (e.g. by restarting Apache)

  6. Wait a while for some stats to be recorded in Google

  7. Import Google stats by running the following command from src/ckanext-googleanalytics:

    paster loadanalytics --config=../ckan/development.ini

    (As before, point --config at your own site config file.)

  8. Look at some stats within CKAN

    Once your GA account has gathered some data, you can see some basic information about the most popular packages at: http://mydomain.com/analytics/dataset/top

    By default, the only data injected into the public-facing website is on the package page, where the number of downloads is displayed next to each resource.

  9. Consider running the import command regularly as a cron job, or remember to run it by hand; otherwise your statistics won't get updated.
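
The configuration and import steps above can be sanity-checked with a small standalone script. What follows is only a sketch under stated assumptions, not part of the extension: it assumes Python 3 run outside CKAN, assumes the settings live in an [app:main] section of the ini file, and uses hypothetical helper names (check_googleanalytics_settings, loadanalytics_command). It confirms that the required googleanalytics.* settings are present, that the plugin is listed in ckan.plugins, fills in the optional defaults, and builds the import command shown in step 7. It never contacts Google.

```python
import configparser

# Settings that step 2 describes as necessary.
REQUIRED = (
    "googleanalytics.id",
    "googleanalytics.username",
    "googleanalytics.password",
)

# Optional settings with the defaults listed in step 4.
DEFAULTS = {
    "googleanalytics.show_downloads": "true",
    "googleanalytics.resource_prefix": "/downloads/",
    "googleanalytics.domain": "auto",
}

# Example config text; the [app:main] section name and the extra
# "stats" plugin are assumptions made for illustration only.
SAMPLE_INI = """\
[app:main]
ckan.plugins = stats googleanalytics
googleanalytics.id = UA-1010101-1
googleanalytics.username = googleaccount@gmail.com
googleanalytics.password = googlepassword
"""


def check_googleanalytics_settings(ini_text, section="app:main"):
    """Return the effective googleanalytics.* settings, or raise ValueError."""
    # interpolation=None keeps values such as %(here)s from raising errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise ValueError("no [%s] section found" % section)
    options = dict(parser.items(section))

    missing = [key for key in REQUIRED if not options.get(key)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))

    # Plugins are separated by spaces, as described in step 4.
    if "googleanalytics" not in options.get("ckan.plugins", "").split():
        raise ValueError("googleanalytics is not listed in ckan.plugins")

    # Start from the documented defaults, then apply explicit overrides.
    settings = dict(DEFAULTS)
    for key, value in options.items():
        if key.startswith("googleanalytics."):
            settings[key] = value
    return settings


def loadanalytics_command(config_path):
    """Build the import command from step 7 as an argument list."""
    return ["paster", "loadanalytics", "--config=" + config_path]


settings = check_googleanalytics_settings(SAMPLE_INI)
print(settings["googleanalytics.resource_prefix"])
print(" ".join(loadanalytics_command("../ckan/development.ini")))
```

A cron-driven wrapper could pass the list returned by loadanalytics_command to subprocess.run with cwd set to src/ckanext-googleanalytics and check=True, so that a failed import shows up in the cron output instead of passing silently.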

Testing

There are some very high-level functional tests that you can run using:

(pyenv)~/pyenv/src/ckan$ nosetests --ckan ../ckanext-googleanalytics/tests/

(Note that this is run from the CKAN software root, not the extension root.)

Future

This is a bare-bones, first release of the software. There are several directions it could take in the future.

Because we use Google Analytics for recording statistics, we can hook into any of its features. For example, as a measure of popularity, we could record bounce rate, or new visits only; we could also display which datasets are popular where, or highlight packages that have been linked to from other locations.

We could also embed extra metadata in tracking links, to enable reports on particular types of data (e.g. most popular data format by country of origin, or most downloaded resource by license).