==============
Spatial Search
==============

The spatial extension allows you to index datasets with spatial information so
that they can be filtered with a spatial query. Filtering is available both
through the web interface (see the `Spatial Search Widget`_) and through the
`action API`_, e.g.::

    POST http://localhost:5000/api/action/package_search
    { "q": "Pollution",
      "facet": "true",
      "facet.field": "country",
      "extras": {
          "ext_bbox": "-7.535093,49.208494,3.890688,57.372349" }
    }

.. versionchanged:: 2.0.1
   Starting from this version the spatial filter is also supported on GET
   requests:

   http://localhost:5000/api/action/package_search?q=Pollution&ext_bbox=-7.535093,49.208494,3.890688,57.372349

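The two request styles above can be sketched in Python. This is only an
illustration: the helper names are hypothetical, the site URL is the
``localhost`` one used in the examples, and the ``minx,miny,maxx,maxy``
ordering of ``ext_bbox`` is assumed from the example values. No request is
sent, the code only builds the URL and the JSON body::

    import json
    from urllib.parse import urlencode


    def bbox_to_param(minx, miny, maxx, maxy):
        """Serialize a bounding box as a comma-separated ``ext_bbox`` value."""
        return ",".join(str(v) for v in (minx, miny, maxx, maxy))


    def build_get_url(site_url, query, bbox):
        """Build the GET form of the search (hypothetical helper)."""
        params = urlencode({"q": query, "ext_bbox": bbox_to_param(*bbox)})
        return "{}/api/action/package_search?{}".format(
            site_url.rstrip("/"), params)


    def build_post_body(query, bbox):
        """Build the JSON body of the POST form (hypothetical helper)."""
        return json.dumps(
            {"q": query, "extras": {"ext_bbox": bbox_to_param(*bbox)}})


    bbox = (-7.535093, 49.208494, 3.890688, 57.372349)
    url = build_get_url("http://localhost:5000", "Pollution", bbox)
    body = build_post_body("Pollution", bbox)

Note that ``urlencode`` percent-encodes the commas (``%2C``), which is
equivalent to the unencoded form shown in the GET example.
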
Setup
-----

To enable the spatial query you need to add the ``spatial_query`` plugin to
your ini file. This plugin requires the ``spatial_metadata`` plugin, e.g.::

    ckan.plugins = [other plugins] spatial_metadata spatial_query

To define which backend to use for the spatial search, set the following
configuration option (see `Choosing a backend for the spatial search`_)::

    ckanext.spatial.search_backend = solr

Geo-Indexing your datasets
--------------------------

Regardless of the backend that you are using, in order to make a dataset
queryable by location, a special extra must be defined, with its key named
``spatial``. The value must be a valid GeoJSON_ geometry, for example::

    {
      "type":"Polygon",
      "coordinates":[[[2.05827, 49.8625],[2.05827, 55.7447], [-6.41736, 55.7447], [-6.41736, 49.8625], [2.05827, 49.8625]]]
    }

or::

    {
      "type": "Point",
      "coordinates": [-3.145,53.078]
    }

Every time a dataset is created, updated or deleted, the extension will
synchronize the information stored in the extra with the geometry table.

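As a rough sketch, the ``spatial`` extra for a bounding box could be built in
Python as follows. The helper names and the dataset name are hypothetical, and
the list-of-key/value layout of ``extras`` is an assumption about how the
dataset dict is passed to CKAN. Only the ``spatial`` key and the GeoJSON value
come from the text above::

    import json


    def bbox_to_polygon(minx, miny, maxx, maxy):
        """Return a GeoJSON Polygon with a single closed ring for the box."""
        return {
            "type": "Polygon",
            "coordinates": [[
                [minx, miny],
                [minx, maxy],
                [maxx, maxy],
                [maxx, miny],
                [minx, miny],  # a ring repeats its first position at the end
            ]],
        }


    def spatial_extra(geometry):
        """Build the extra; its value is the geometry serialized as a string."""
        return {"key": "spatial", "value": json.dumps(geometry)}


    dataset = {
        "name": "example-dataset",  # hypothetical dataset name
        "extras": [
            spatial_extra(bbox_to_polygon(-6.41736, 49.8625, 2.05827, 55.7447)),
        ],
    }

The resulting polygon covers the same area as the first GeoJSON example
above, with the ring starting at a different corner.
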
Choosing a backend for the spatial search
+++++++++++++++++++++++++++++++++++++++++

Several backends are supported for the spatial search. It is important to
understand their differences and the setup each one requires before choosing
which one to use.

The following table summarizes the different spatial search backends:

+------------------------+---------------+-------------------------------------+-----------------------------------------------------------+-------------------------------------------+
| Backend                | Solr Versions | Supported geometries                | Sorting and relevance                                     | Performance with large number of datasets |
+========================+===============+=====================================+===========================================================+===========================================+
| ``solr``               | 3.1 to 4.x    | Bounding Box                        | Yes, spatial sorting combined with other query parameters | Good                                      |
+------------------------+---------------+-------------------------------------+-----------------------------------------------------------+-------------------------------------------+
| ``solr-spatial-field`` | 4.x           | Bounding Box, Point and Polygon [1] | Not implemented                                           | Good                                      |
+------------------------+---------------+-------------------------------------+-----------------------------------------------------------+-------------------------------------------+
| ``postgis``            | 1.3 to 4.x    | Bounding Box                        | Partial, only spatial sorting supported [2]               | Poor                                      |
+------------------------+---------------+-------------------------------------+-----------------------------------------------------------+-------------------------------------------+

[1] Requires JTS

[2] Needs ``ckanext.spatial.use_postgis_sorting`` set to True

We recommend using the ``solr`` backend whenever possible. Here are more
details about the available options:

* ``solr`` (Recommended)
    This option uses normal Solr fields to index the relevant bits of
    information about the geometry and uses a ranking function to sort
    results by relevance, keeping any other non-spatial filtering. It only
    supports bounding boxes, both for the geometries to be indexed and for the
    input query shape. It requires the `EDisMax`_ query parser, so it will only
    work on Solr 3.1 or later (we recommend using Solr 4.x).

    You will need to add the following fields to your Solr schema file to
    enable it::

        <fields>
            <!-- ... -->
            <field name="bbox_area" type="float" indexed="true" stored="true" />
            <field name="maxx" type="float" indexed="true" stored="true" />
            <field name="maxy" type="float" indexed="true" stored="true" />
            <field name="minx" type="float" indexed="true" stored="true" />
            <field name="miny" type="float" indexed="true" stored="true" />
        </fields>

* ``solr-spatial-field``
    This option uses the `spatial field`_ introduced in Solr 4, which allows
    indexing points, rectangles and more complex geometries (complex geometries
    require `JTS`_, check the documentation).
    Sorting has not yet been implemented; users who need it will have to
    modify the query using the ``before_search`` extension point.

    You will need to add the following field type and field to your Solr
    schema file to enable it (check the `Solr documentation`__ for more
    information on the different parameters; note that you don't need
    ``spatialContextFactory`` if you are not using JTS)::

        <types>
            <!-- ... -->
            <fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
                spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
                distErrPct="0.025"
                maxDistErr="0.000009"
                units="degrees" />
        </types>
        <fields>
            <!-- ... -->
            <field name="spatial_geom" type="location_rpt" indexed="true" stored="true" multiValued="true" />
        </fields>

* ``postgis``
    This is the original implementation of the spatial search. It
    does not require any change in the Solr schema and can run on Solr 1.x,
    but it is not as efficient as the previous ones. The bounding
    box based query is performed in PostGIS first, and the ids of the matched
    datasets are added as a filter to the Solr request. This, apart from being
    much less efficient, can lead to issues on Solr due to the size of the
    requests (see `Solr configuration issues on legacy PostGIS backend`_).
    There is support for spatial ranking on this backend (by setting
    ``ckanext.spatial.use_postgis_sorting`` to True in the ini file), but
    it cannot be combined with any other filtering.

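To make the ``solr`` backend fields more concrete, the sketch below derives
values for ``minx``, ``miny``, ``maxx``, ``maxy`` and ``bbox_area`` from a
GeoJSON geometry. It is an illustration under assumptions, not the extension's
actual indexing code: the function names are hypothetical, and computing
``bbox_area`` as width times height in coordinate units is a guess at what
the field holds::

    def iter_positions(coords):
        """Yield every [x, y] position from nested GeoJSON coordinates."""
        if coords and isinstance(coords[0], (int, float)):
            yield coords
        else:
            for item in coords:
                yield from iter_positions(item)


    def bbox_fields(geometry):
        """Return the bounding box fields for a GeoJSON geometry (sketch)."""
        positions = list(iter_positions(geometry["coordinates"]))
        xs = [p[0] for p in positions]
        ys = [p[1] for p in positions]
        minx, maxx, miny, maxy = min(xs), max(xs), min(ys), max(ys)
        return {
            "minx": minx,
            "miny": miny,
            "maxx": maxx,
            "maxy": maxy,
            # Assumption: area of the box in squared coordinate units.
            "bbox_area": (maxx - minx) * (maxy - miny),
        }


    polygon = {
        "type": "Polygon",
        "coordinates": [[[2.05827, 49.8625], [2.05827, 55.7447],
                         [-6.41736, 55.7447], [-6.41736, 49.8625],
                         [2.05827, 49.8625]]],
    }
    point = {"type": "Point", "coordinates": [-3.145, 53.078]}

    polygon_fields = bbox_fields(polygon)
    point_fields = bbox_fields(point)

A point collapses to a box with zero area, which is consistent with this
backend only supporting bounding boxes.
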
Spatial Search Widget
---------------------

.. image:: _static/spatial-search-widget.png

The extension provides a snippet to add a map widget to the search form, which
allows filtering results by an area of interest.

To add the map widget to the sidebar of the search page, add this to the
dataset search page template
(``myproj/ckanext/myproj/templates/package/search.html``)::

    {% block secondary_content %}

      {% snippet "spatial/snippets/spatial_query.html" %}

    {% endblock %}

By default the map widget will show the whole world. If you want to set up a
different default extent, you can pass an extra ``default_extent`` to the
snippet, either with a pair of coordinates like this::

    {% snippet "spatial/snippets/spatial_query.html", default_extent="[[15.62,
      -139.21], [64.92, -61.87]]" %}

or with a GeoJSON object describing a bounding box (note the escaped quotes)::

    {% snippet "spatial/snippets/spatial_query.html", default_extent="{ \"type\":
      \"Polygon\", \"coordinates\": [[[74.89, 29.39],[74.89, 38.45], [60.50,
      38.45], [60.50, 29.39], [74.89, 29.39]]]}" %}

You need to load the ``spatial_metadata`` and ``spatial_query`` plugins to use
this snippet.

Dataset Extent Map
------------------

.. image:: _static/dataset-extent-map.png

Using the snippets provided, if datasets contain a ``spatial`` extra like the
one described in `Geo-Indexing your datasets`_, a map will be shown on the
dataset details page.

There are snippets already created to load the map on the left sidebar or in
the main body of the dataset details page, but these can be easily modified to
suit your project's needs.

To add a map to the sidebar, add this to the dataset details page template (e.g.
``ckanext-myproj/ckanext/myproj/templates/package/read.html``)::

    {% block secondary_content %}
      {{ super() }}

      {% set dataset_extent = h.get_pkg_dict_extra(c.pkg_dict, 'spatial', '') %}
      {% if dataset_extent %}
        {% snippet "spatial/snippets/dataset_map_sidebar.html", extent=dataset_extent %}
      {% endif %}

    {% endblock %}

To add the map to the main body, add this::

    {% block primary_content_inner %}

      {{ super() }}

      {% set dataset_extent = h.get_pkg_dict_extra(c.pkg_dict, 'spatial', '') %}
      {% if dataset_extent %}
        {% snippet "spatial/snippets/dataset_map.html", extent=dataset_extent %}
      {% endif %}

    {% endblock %}

You need to load the ``spatial_metadata`` plugin to use these snippets.

Legacy Search
-------------

Solr configuration issues on legacy PostGIS backend
+++++++++++++++++++++++++++++++++++++++++++++++++++

.. warning::

    If you find any of the issues described in this section it is strongly
    recommended that you consider switching to one of the Solr based backends
    which are much more efficient. These notes are just kept for informative
    purposes.

If you use the spatial query functionality with this backend, there is an
additional Solr/Lucene setting that limits the number of datasets that can be
searched with a spatial value.

The setting is ``maxBooleanClauses`` in ``solrconfig.xml``, and its value is the
number of datasets spatially searchable. The default is ``1024`` and this could
be increased to, say, ``16384``. For a single-core Solr setup this file will
probably be at ``/etc/solr/conf/solrconfig.xml``. For a multiple-core setup,
there will be several ``solrconfig.xml`` files a couple of levels below
``/etc/solr``. In that case, *all* of the cores' ``solrconfig.xml`` files should
have this setting at the new value.

Example::

    <maxBooleanClauses>16384</maxBooleanClauses>

This setting is needed because PostGIS spatial query results are fed into Solr
using a Boolean expression, and the parser for that has a limit. So if your
spatial area contains more datasets than the limit (1024 by default) then
you will get this error::

    Dataset search error: ('SOLR returned an error running query...

and in the Solr logs you will see::

    too many boolean clauses ... Caused by:
    org.apache.lucene.search.BooleanQuery$TooManyClauses: maxClauseCount is set to
    1024

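The failure mode described above can be illustrated with a small Python
sketch. It assumes that each matched dataset id becomes one clause of the
Boolean expression. The function names and the ``id:(... OR ...)`` filter
syntax are hypothetical and only meant to show why the number of matches
matters::

    DEFAULT_MAX_BOOLEAN_CLAUSES = 1024  # Solr default quoted in the text above


    def build_id_filter(dataset_ids):
        """Join the matched ids into one Boolean filter (assumed syntax)."""
        return "id:({})".format(" OR ".join(dataset_ids))


    def exceeds_clause_limit(dataset_ids,
                             max_clauses=DEFAULT_MAX_BOOLEAN_CLAUSES):
        """Return True if the filter would need more clauses than allowed."""
        return len(dataset_ids) > max_clauses


    # Pretend the PostGIS query matched 1500 datasets.
    ids = ["dataset-{}".format(i) for i in range(1500)]
    fq = build_id_filter(ids)

    too_many = exceeds_clause_limit(ids)
    fits_after_raise = not exceeds_clause_limit(ids, max_clauses=16384)

With 1500 matches the default limit is exceeded, while the raised value of
``16384`` from the example would accommodate them.
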
Legacy API
++++++++++

The extension adds the following call to the CKAN search API, which returns
datasets with an extent that intersects with the bounding box provided::

    /api/2/search/dataset/geo?bbox={minx,miny,maxx,maxy}[&crs={srid}]

If the bounding box coordinates are not in the same projection as the one
defined in the database, a CRS must be provided, in one of the following forms:

- ``urn:ogc:def:crs:EPSG::4326``
- ``EPSG:4326``
- ``4326``

.. _action API: http://docs.ckan.org/en/latest/apiv3.html
.. _edismax: http://wiki.apache.org/solr/ExtendedDisMax
.. _JTS: http://www.vividsolutions.com/jts/JTSHome.htm
.. _spatial field: http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4
__ `spatial field`_
.. _GeoJSON: http://geojson.org