754 Commits

Author SHA1 Message Date
David Read 46f7b32b04 Merge branch 'master' of github.com:okfn/ckanext-harvest into migration-states 2015-07-22 10:13:55 +01:00
David Read 2da918c2e4 Fix migration for old harvests so that ones that errored are correctly marked. Added helpful comments in model. 2015-07-22 10:13:02 +01:00
Stefan Oderbolz ab76830e85 [#145] Throw + catch a custom exception if there are no jobs to run
If there are no harvesting jobs to run, there was always an ugly
exception message when using the paster command. This replaces the ugly
output with a proper message and uses a custom exception to allow others
to deal with this error differently.
2015-07-20 18:41:50 +02:00
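The pattern described in this commit can be sketched as follows. This is a minimal illustration, not the extension's actual code: the exception name `NoJobsToRunError`, the functions `run_jobs` and `command`, and the message text are all hypothetical.

```python
class NoJobsToRunError(Exception):
    """Hypothetical custom exception: there are no harvest jobs to run."""


def run_jobs(pending_jobs):
    # Raise a specific exception so callers can tell "nothing to do"
    # apart from a real failure.
    if not pending_jobs:
        raise NoJobsToRunError('There are no new harvesting jobs')
    return ['ran %s' % job for job in pending_jobs]


def command(pending_jobs):
    # A command-line wrapper catches only the custom exception and
    # reports a readable message instead of a traceback.
    try:
        return run_jobs(pending_jobs)
    except NoJobsToRunError as e:
        return str(e)
```

Other callers of `run_jobs` remain free to catch the same exception and handle the empty case differently.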
Stefan Oderbolz 83dd0b4b68 [#138] Add data attributes to support timezone conversion 2015-07-09 22:35:54 +02:00
Stefan Oderbolz 4dc2f7367d [#139] Delete package relationships when clearing a harvest source 2015-06-26 17:20:23 +02:00
amercader 88d9ba0397 [#136] Fix broken RabbitMQ queue names
The harvester command was still using the old ones.
Use specific ones for testing.
2015-06-11 13:56:22 +01:00
amercader 673dfc9882 [#127] Use site user on the CKAN harvester
Add missing call
2015-06-11 10:38:33 +01:00
amercader d3a3f09ad1 [#127] Use site user on the CKAN harvester
To avoid having to create a 'harvest' sysadmin explicitly. It will still
be used if present, but if not the site user will be used. You can also
define the user to use via a config option.
2015-06-11 10:19:07 +01:00
amercader b17c3269b5 Merge branch 'clear-command' of https://github.com/metaodi/ckanext-harvest into metaodi-clear-command 2015-06-10 15:32:37 +01:00
Stefan Oderbolz 64ff0f3a3a Use single quotes to be consistent 2015-06-10 16:22:04 +02:00
Stefan Oderbolz 2a2d85f60c Wording changes for clearsource and rmsource 2015-06-10 16:19:23 +02:00
joetsoi 92b93c53fc add some translation strings 2015-06-10 12:14:20 +01:00
Stefan Oderbolz 8ebb843052 Add documentation for clearsource command 2015-06-10 11:29:24 +02:00
Stefan Oderbolz 61bc150ae6 Expose clear harvester source as a paster command 2015-06-10 11:19:10 +02:00
amercader 9f8aae3a18 Append site id to queue name
This allows multiple CKAN sites to share the same RabbitMQ exchange
(For the Redis backend this is handled via different Redis databases)
2015-06-01 17:54:22 +01:00
amercader 3e21ea4f82 Fix tests, set up Travis
TODO: sort out the tests properly, avoiding imports from the legacy ones
2015-04-07 13:31:45 +01:00
amercader f72d6da521 Change toolkit import
Apparently on package installs this is not well supported

from ckan.plugins.toolkit import check_ckan_version

But this works:

from ckan.plugins import toolkit

toolkit.check_ckan_version(...
2015-03-19 12:48:46 +00:00
amercader 7a20e93716 Raise on startup import errors so we don't mask problems
Otherwise if there was eg an actual ImportError we just got

2015-03-19 12:30:08,430 DEBUG [ckanext.harvest.plugin] No auth module
for action "update"

on the log
2015-03-19 12:48:15 +00:00
Jari Voutilainen 859133fe36 move detecting unchanged datasets to ckanharvester and queue.py 2015-03-10 14:48:41 +02:00
David Read d6e9b80496 Merge pull request #118 from clementmouchet/114-remove_resource_groups
Removed ResourceGroup from query when using CKAN 2.3 or above
2015-02-24 09:56:44 +00:00
clementmouchet ead9e67a33 updated def harvest_source_clear() to delete resource views, resource revisions & resources in CKAN >= 2.3 2015-02-23 17:02:21 +00:00
David Read b3ed6cae5a Merge pull request #121 from metaodi/120-create-remote-orgs
Fetch remote organization via action api
2015-01-15 10:49:09 +00:00
Stefan Oderbolz c1bcee9684 Use str() to get the error message 2015-01-15 11:36:15 +01:00
Stefan Oderbolz 191c39ce5c Catch the more general URLError instead of HTTPError
HTTPError is a subclass of URLError, so catching URLError is enough. I
think the HTTP error code is not as important in this situation, so
catching the more generic error seems like the best solution.
2015-01-15 10:57:24 +01:00
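The subclass relationship this commit relies on can be checked directly. The sketch below uses the Python 3 module `urllib.error` as an assumption (code of this era would have imported the same classes from `urllib2`); `describe_fetch_error` is a hypothetical helper.

```python
from urllib.error import HTTPError, URLError


def describe_fetch_error(error):
    # A single `except URLError` clause also handles HTTPError,
    # because HTTPError derives from URLError.
    try:
        raise error
    except URLError as e:
        return 'fetch failed: %s' % e.__class__.__name__
```

The trade-off is the one the commit message names: the handler no longer distinguishes HTTP status codes from lower-level connection errors.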
Stefan Oderbolz b978c26e70 Use ContentFetchError instead of generic Exception 2015-01-15 00:49:11 +01:00
Stefan Oderbolz 935b9dda01 Munge group name before fetching remote group
The API call /api/2/rest/package/<id> returns the display name of the
group instead of its ID. To properly match the group, munge the name
before calling /api/2/rest/group
2015-01-15 00:44:53 +01:00
Stefan Oderbolz ef35c21e2a Improve exception handling with custom exception
1. Try whenever possible to catch specific exceptions
2. Raise custom exception where appropriate
3. Fix the exception handling in _get_group and _get_organization
2015-01-15 00:44:45 +01:00
Stefan Oderbolz 0fd38e0e54 Use _get_group as a fallback for remote orgs
First try to get a remote org from the remote Action API, if this fails
try to use the old rest api call, which works on older CKAN versions.

Only if both options fail, it's currently not possible to get the remote
organization.
2015-01-14 00:10:27 +01:00
Stefan Oderbolz f214577872 Fetch remote organization via action api
Organizations used to be returned by /api/2/rest/group, this is what the
old implementation used to fetch the information to create the remote
organization on the local instance of CKAN.

With this commit the Action API is used to fetch the same information.
2015-01-13 14:46:53 +01:00
Stefan Oderbolz ea9debf714 Fix logic of conditional and make it more pythonic 2014-12-18 16:03:33 +01:00
Stefan Oderbolz 08930d01bf Make sure new packages get a unique 'name' 2014-12-16 15:02:36 +01:00
clementmouchet 82c7988bf3 Removed ResourceGroup from query when using CKAN 2.3 or above 2014-12-12 13:10:40 +00:00
amercader a3affc9702 Fix validators on harvest_source_show schema
Remove validators on several keys so they don't get stripped during the
show validation.
2014-10-08 12:02:26 +01:00
amercader 098b54f1e5 Merge branch 'clear-source-delete-related' of https://github.com/waldvogel/ckanext-harvest into waldvogel-clear-source-delete-related 2014-09-29 13:49:19 +01:00
amercader e60e2eee03 Fix output for harvest_source_create/update
They were using an incorrect schema, so not returning a harvest source
like dict.
2014-09-29 12:43:37 +01:00
waldvogel c9b4e10506 delete records from related and related_dataset when clearing source 2014-09-12 10:56:37 +02:00
Jari Voutilainen 1e0376cff6 fix typo 2014-09-10 10:33:13 +03:00
Jari Voutilainen f6c1456abe fix job reporting to have job finished timestamp when there was zero datasets to gather 2014-09-10 09:22:55 +03:00
Jari Voutilainen 97f09913cf fix job reporting all datasets deleted when actually nothing changed during last two harvests 2014-09-10 09:22:44 +03:00
amercader 8cf254f112 Merge branch '99-all-non-ascii-tags' of https://github.com/morty/ckanext-harvest into morty-99-all-non-ascii-tags 2014-08-29 14:40:43 +01:00
amercader 546159744e Merge branch '101-modified-package-name' of https://github.com/morty/ckanext-harvest into morty-101-modified-package-name 2014-08-29 14:38:33 +01:00
amercader 039ac7c0ad Always remove harvest extras on after_show if there
Up until now we were relying on `for_edit` being present in the
context, but this is only added on the controllers. It's better to be
safe and remove them always. If needed (at index time) they will be
added afterwards.
2014-08-14 15:31:39 +01:00
Tom Mortimer-Jones 8a2c072d4e [101] Use name from database when reharvesting package 2014-08-12 11:18:48 +01:00
Tom Mortimer-Jones 65cfade420 [99] Remove empty tags produced by munging all non-ascii tags
I thought this way of filtering was easier to read than filter(None, tags)
2014-08-07 17:05:16 +01:00
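The two filtering styles the message compares are equivalent for lists of strings. A small sketch, where `drop_empty_tags` is a hypothetical name and the input is assumed to be a list of already munged tag strings:

```python
def drop_empty_tags(tags):
    # Keep only tags that are non-empty after munging; this is the
    # explicit form of filter(None, tags).
    return [tag for tag in tags if tag]
```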
amercader 13dbb1eea4 Fix variable not defined 2014-07-30 15:49:02 +01:00
amercader 58a873ac7a [#91] Remove config fields from source dict before indexing
We don't need them and will avoid indexing errors
2014-06-27 16:54:39 +01:00
amercader a59ab4b5ff [#91] Consolidate all harvest source reindex code in a single action
Make it available to users with permissions on the harvest source
2014-06-27 16:48:14 +01:00
amercader 7459358fa1 Support for single import commands
We are now able to run `paster harvester import` for a single harvest
object or for a single dataset, providing ids or name.
2014-05-15 16:30:30 +01:00
amercader 2c6aaf5bb1 Merge branch 'master' into 96-harvest-object-encoding-errors 2014-05-15 15:52:13 +01:00
amercader 43f1d08255 [#97] Persistent endpoint for datasets harvest objects
Contrary to `/harvest/object/xxx`, this endpoint is passed the dataset
id, thus it does not depend on a particular object but on the most recent one.
2014-04-30 17:45:07 +01:00
amercader 1b458b1772 [#96] Handle encoding errors on harvest object endpoint
When parsing the harvest object content to see if it is an XML file,
etree.fromstring would fail if there are unicode errors.
2014-04-28 12:48:09 +01:00
Richard Claydon e3492b57e7 Update plugin.py
Updating plugin.py to check for the existence of the extras key in the data_dict.
2014-02-27 16:05:39 +00:00
amercader d3cf5e58d1 [#86] Fix duplicate extras 2014-02-11 18:16:49 +00:00
amercader fbde0b8dc1 [#87] Remove remote url_type from resources
Otherwise CKAN thinks they are uploads, datastore resources, etc, which
can cause problems eg when displaying the URL of the resource. We
are just linking to the remote resource URL.
2014-02-11 17:27:19 +00:00
amercader 5739e541d7 [#80] Support for Python 2.6 when handling xml exceptions 2014-02-10 18:44:46 +00:00
amercader 2a07a144fc [#84] Fix auth audit exception when creating datasets
This was caused by a combination of the auth audit leaking and the
harvester reusing the context for the package_show and package_create
actions. If the package is not found, package_show does not call
check_access, and the auth audit does not pass. This is stored in the
context (`__auth_audit`) and is raised next time that we call
get_action (when we call package_create with the same context)

It could potentially be fixed on master, but it is probably quite rare.
2014-02-10 18:22:48 +00:00
amercader 5b677b6099 [#83] Fix key error when using default_groups 2014-02-10 13:16:58 +00:00
Rachel Knowler bf11e4d330 Moved clean_tags check into _create_or_update_package method. 2014-02-10 09:29:01 +01:00
Rachel Knowler 2ba9908653 Config option to munge tags changed to be consistent with other config options in this extension, and noted in README. 2014-01-29 10:55:51 +01:00
Rachel Knowler 5e1aef1d08 Removed extra newline. 2014-01-29 10:06:32 +01:00
Rachel Knowler 7d71b0a00b Wrap tag munging code in config option, defaulting to False. 2014-01-29 10:02:16 +01:00
amercader 2b803a3f66 [#77] Use auth_allow_anonymous_access decorator
Starting from 2.2 you need to explicitly flag auth functions that
allow anonymous access with the p.toolkit.auth_allow_anonymous_access
decorator. A local version of the decorator is used to ensure we only
use it on CKAN>=2.2
2014-01-20 13:47:37 +00:00
amercader 4cc56f51ab [#76] Use harvest_source_show on reindex command 2014-01-14 17:04:34 +00:00
amercader 95d0ef0f01 [#76] Add extra fields to the source schema
Add 'private' and its core validators, and 'metadata_modified' and
'metadata_created'.

Also ignore '__extras'
2014-01-14 17:01:25 +00:00
amercader 467fb7bb8f Fix resource updating for harvested datasets
Starting from 2.2, resource_update calls package_show before updating
the resource via a package_update call. The dict passed had the harvest
extras (eg harvest_object_id) added which made the update call fail due
to duplicated extra keys. To fix it we now remove any harvest extras
on after_show if there is a 'for_edit' property on the context.
2014-01-13 10:30:52 +00:00
amercader 278a8e1ada Merge branch 'master' of github.com:okfn/ckanext-harvest 2014-01-10 13:49:38 +00:00
amercader 1e94a11255 [#70] Fix Add harvest source button not showing
Due to changes in the templates starting on 2.1 the add source button
was not showing. The whole search template has been simplified,
separating in a separate file the 2.0 only code.

Tested in 2.0, 2.1 and 2.2
2014-01-10 13:48:02 +00:00
Mikko Koho 51e842ee6e Add quotes to harvest_source_id in Solr query when clearing harvest sources 2013-11-22 11:01:26 +02:00
Mikko Koho c338452872 Put harvest_source_id in quotes to prevent Solr errors 2013-11-21 14:14:56 +02:00
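The quoting fix in these two commits can be illustrated with a small helper. This is only a sketch: `source_query` is a hypothetical function, and the assumption is that the id is a UUID-like string whose hyphens could otherwise be misread by the Solr query parser.

```python
def source_query(harvest_source_id):
    # Wrapping the value in double quotes makes Solr treat it as a
    # single phrase rather than parsing the characters inside it.
    return 'harvest_source_id:"{0}"'.format(harvest_source_id)
```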
amercader 928ea061aa Improve organizations dropdown on source form 2013-10-24 12:33:44 +01:00
Stefan Oderbolz c52085006a [#61] Truly ignore harvest sources
The current implementation returns False when a harvest source is being harvested. This leads to an error on the harvesting job, which in turn tends to confuse users that have no idea of this special implementation. This fix ensures that harvest sources are still ignored, but silently.
2013-10-23 07:40:55 +02:00
amercader c18d9dc3af [#71] CKAN harvester: Add datasets to source organization
If the harvest source belongs to an organization, new datasets should be added
to it. This is already the case in the spatial harvesters.

The remote orgs logic has been kept, with the difference that if for
some reason the remote org can not be assigned, the local one is used.

If the source does not have an organization, none is added.
2013-10-22 16:24:43 +01:00
amercader 380c14c22c Fix CLI sources list output 2013-10-16 13:03:32 +01:00
amercader 55d2b4e304 Fix purge command 2013-10-16 12:59:23 +01:00
amercader bd62b62764 Merge branch 'metaodi-add-harvesting-of-organizations' 2013-10-15 17:50:04 +01:00
amercader 49999893e7 Merge branch 'add-harvesting-of-organizations' of git://github.com/metaodi/ckanext-harvest into metaodi-add-harvesting-of-organizations 2013-10-15 17:49:28 +01:00
Stefan Oderbolz 8b5d70c6fe Only try to create/match an organization if there is a remote_org 2013-10-11 18:08:32 +02:00
amercader 0f5624822c Use remote name if present when creating datasets on CKAN harvester 2013-10-11 16:50:25 +01:00
amercader 340e9eed63 Merge branch 'add-harvesting-of-organizations' of git://github.com/metaodi/ckanext-harvest into metaodi-add-harvesting-of-organizations 2013-10-11 16:14:18 +01:00
amercader e9dde3f48a Only show the 'Add Harvest Source' button if user is authorized 2013-10-11 11:55:14 +01:00
Stefan Oderbolz dd1acd0c6b Use remote_orgs for organizations 2013-10-07 11:22:19 +02:00
amercader f89f12203c Merge branch 'fix/rename-ampq-to-amqp' of git://github.com/opendatatrentino/ckanext-harvest into opendatatrentino-fix/rename-ampq-to-amqp 2013-10-04 17:24:53 +01:00
amercader c5f4d6889b Merge branch 'harvest-object-create' 2013-10-04 17:22:58 +01:00
Stefan Oderbolz d50eb6fca8 Harvesting of remote organisations similar to remote groups 2013-10-04 16:37:52 +02:00
joetsoi da2fd45e80 [#65] make harvest_job_exists validator return model object
return the model in the validator instead of checking that it exists in
the validator, returning the id and then fetching it again in the action
function
2013-10-03 15:51:37 +01:00
Samuele Santi 611b9aab6d Fixed typo: ampq -> amqp 2013-09-19 11:43:03 +02:00
joetsoi 9b3199b41b [#65] remove unused code 2013-09-17 17:02:38 +01:00
joetsoi 5da153c6b6 [#65] harvest_object_create action
update to use schema and validators. Also accept more parameters to
data_dict.
2013-09-17 16:49:19 +01:00
John Martin 71aedf3fd4 Makes ede45bd work in both CKAN 2.0 and 2.1 2013-09-05 15:36:00 +01:00
John Martin ede45bd1be Fixes #66 by correcting the block name 2013-09-05 15:31:44 +01:00
joetsoi 1b663bbff4 add harvest_object_create action 2013-09-04 14:17:01 +01:00
amercader 52956feab9 Merge branch '62-default-package-name' 2013-08-19 18:23:37 +01:00
amercader f51b8e905a [#58] Check properly for version numbers (patch numbers) 2013-08-19 18:13:01 +01:00
Vitor Baptista f028375ad3 [#62] Use current name when updating package, if the user hasn't sent a new one
It's hard for someone outside CKAN to make sure they're sending it in the format
we expect. And they'll also have to keep track of our name format, to keep in
sync whenever we change.

To fix this, we simply do what we already do when creating packages: use a
default name. In this case, the current one.
2013-08-18 12:08:30 -03:00
John Martin 86dcd933ea Merged master 2013-08-15 18:47:16 +01:00
John Martin 712e150b52 [#58] Fix to make merge nice 2013-08-15 18:43:46 +01:00
John Martin 575df637b4 [#58] Fixes to make harvest templates to work with both CKAN 2.0 and 2.1 2013-08-15 16:45:02 +01:00
amercader 05e6362c38 Merge branch 'fix-jinja-status-exception' of git://github.com/metaodi/ckanext-harvest into metaodi-fix-jinja-status-exception 2013-08-15 14:39:20 +01:00
amercader 01ca5c0dfd [#61] Ignore harvest sources on the CKAN harvester 2013-08-15 14:38:33 +01:00
amercader b25fffda93 [#36] Fix bug on API version checking 2013-08-15 14:37:55 +01:00
amercader 39ad78d90a [#59] Ignore auth in the CKAN harvester 2013-08-15 14:37:12 +01:00
Stefan Oderbolz f26baf6c09 Hide both the label and the number of datasets when 'status' is not available 2013-08-15 13:25:16 +02:00
amercader 1c36b33aaf [#59] Ignore auth when using site_user 2013-08-14 12:28:27 +01:00
amercader ffea49ca62 [#56] Update parameters on source create command
Add missing title and owner_org fields, remove deprecated user_id and
publisher_id
2013-08-14 11:54:51 +01:00
amercader 3494727d3f [#56] Increase max params number 2013-08-14 11:43:32 +01:00
amercader 8e33262026 [#56] Fix syntax error and wrong type 2013-08-14 11:31:23 +01:00
Stefan Oderbolz 4dfd091aec Make the /harvest page more robust if source.status is not set
This prevents exceptions from appearing in the log from Jinja:
  [error] [client 1.2.3.4] Error - <class 'jinja2.exceptions.UndefinedError'>: 'dict object' has no attribute 'status'
2013-08-14 11:52:11 +02:00
Stefan Oderbolz 7ae9d6e208 Made print method more robust against KeyErrors
This is especially needed if you create a new harvest source which does not have all the optional arguments. Before, this led to a KeyError after the creation of the source. Now this simply outputs 'None'.
2013-08-05 23:50:30 +02:00
Stefan Oderbolz 1249564be5 Add additional name argument when creating new harvest source 2013-08-05 23:46:21 +02:00
Stefan Oderbolz ade5f83e38 Change key of data_dict from 'type' to 'source_type' 2013-08-05 23:07:25 +02:00
amercader cb745c3c3e Avoid importing unnecessary functions from the harvest logic 2013-08-05 18:39:44 +01:00
Vitor Baptista 70e53a7833 Fix bug where source was being treated as an object, when it's a dict 2013-07-29 07:06:58 -03:00
amercader cc3f3d3426 [#50] Fix objects deletion on gather exceptions 2013-07-05 13:29:11 +01:00
amercader e2696b98bb [#50] Save all dates as UTC in the database
At some point we may want to transform these to local time at the
dictization level. We will need a library like dateutil to handle it
properly though.
2013-07-04 14:59:27 +01:00
kindly c2283e3fdb only migrate harvest sources which are active 2013-06-28 02:32:45 +01:00
kindly a42991b8c9 fix so that non sysadmins can edit harvest sources of organizations they are admins or editors of.
2013-06-27 12:16:11 +01:00
kindly 6540726c47 use correct limit for paging harvest listing 2013-06-26 11:14:38 +01:00
amercader 584c340583 Merge branch '42-remove-non-string-extras' 2013-06-03 10:33:59 +01:00
Sean Hammond 01df3a1db4 [#42] Dump non-string extras with json
Convert any non-string extra values to strings using json.dumps(),
instead of just deleting them.
2013-05-31 20:35:06 +02:00
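The conversion described here can be sketched as below. The function name `stringify_extras` is hypothetical, and the sketch assumes extras arrive as a plain dict of key to value; it is not the harvester's actual code.

```python
import json


def stringify_extras(extras):
    # Leave string values untouched and serialise everything else
    # (lists, dicts, numbers) to a JSON string instead of dropping it.
    result = {}
    for key, value in extras.items():
        if isinstance(value, str):
            result[key] = value
        else:
            result[key] = json.dumps(value)
    return result
```

Compared with deleting non-string extras, this keeps the information available, since the original value can be recovered with `json.loads`.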
amercader 3a31db59b6 [#36] Move validation code to validate_config
This ensures it is checked whenever the source is edited or created.
2013-05-31 17:23:40 +01:00
amercader a6a0196a4e Merge branch 'api-version-fix' of git://github.com/fraunhoferfokus/ckanext-harvest into fraunhoferfokus-api-version-fix 2013-05-31 17:15:43 +01:00
Sean Hammond 85a013f2c9 [#42] Remove non-string extras from packages
Remove extras whose values are not strings (e.g. dicts, lists..) from
packages before attempting to create or update the packages on the
target site.

In CKAN 1 it was possible for the values of extras to be other types,
but in CKAN 2 they must be strings, so when harvesting from a CKAN 1 site
into a CKAN 2 site SQLAlchemy would crash when trying to create packages
with non-string extras.

The fix in this commit is to simply remove any non-string extras from
the harvested package. (Alternatively, we could try to convert them to a
string using JSON.)

Fixes #42.
2013-05-31 15:43:42 +02:00
amercader 361abcfc07 [#17] Fix bug with remote groups handling
If neither 'only_local' or 'create' are used the remote groups property
needs to be removed, otherwise it causes an exception when the group is
not found.
2013-05-30 18:06:15 +01:00
Konrad Reiche 87cae31c75 Fix api_version check in the group importer code
I have forgotten to update one check for the api_version 1 in the code
responsible for the remote group import feature. This commit fixes that.

Signed-off-by: Konrad Reiche <konrad.reiche@fokus.fraunhofer.de>
2013-05-27 13:36:56 +02:00
Konrad Reiche c858b9fe9f Add exception handling for the API version parsing
I have added try-except clauses in order to prevent the process from
crashing if a non-parsable integer is used for the api_version option.

Signed-off-by: Konrad Reiche <konrad.reiche@fokus.fraunhofer.de>
2013-05-27 13:12:05 +02:00
Konrad Reiche 05094090af Change type of the API version to integer
The CKAN logic uses integers when dealing with the API version, e.g.
making checks which API version is in use. Currently, the harvester
uses strings to identify the API version. Instead of dealing with
type conversion the harvester could use integers directly.

This commit fixes okfn/ckanext-harvest#36. When the API version is
parsed from the configuration it is passed through the int() function.
This way the harvesting will still work even if a harvest source was
configured with a string API version which makes this commit backward
compatible.

Signed-off-by: Konrad Reiche <konrad.reiche@fokus.fraunhofer.de>
2013-05-27 12:51:48 +02:00
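A minimal sketch of parsing the API version through `int()`, including the guard against non-numeric values that a follow-up commit adds. The function name `parse_api_version`, the default of 2, and the error message are assumptions made for illustration.

```python
def parse_api_version(config, default=2):
    # Accept both integers and numeric strings so that sources
    # configured with a string api_version keep working.
    try:
        return int(config.get('api_version', default))
    except (TypeError, ValueError):
        raise ValueError('api_version must be an integer')
```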
amercader ff7287d4b4 [#30] Remove lxml dependency 2013-05-24 18:12:02 +01:00
amercader 3d2867ca04 [#17] Remove ckanclient dependency as it is not used 2013-05-24 17:55:37 +01:00
amercader f1d11c1307 [#17] Import remote groups in CKAN harvester
This is a cleaner commit of the great work done by @platzhirsch
implementing remote groups import on the CKAN harvester.
2013-05-24 16:55:05 +01:00
amercader 1792180e4f Better harvest source dataset migration
Current implementation only checked for the first source to exist and
didn't allow to rerun the migration for other sources if there was an
error. With the new one, all non existing sources are migrated each
time.
2013-05-24 14:49:55 +01:00
amercader 1d54edfdaa Fix bug in source datasets migration
Wrong dataset type was causing the default package schema to be used,
thus failing when providing an id.
2013-05-24 14:25:05 +01:00
amercader 751409ab7d [#34] Integrate clear command with delete source
When deleting a source, if clear_source equals true in the context,
harvest_source_clear will be called. Default is false. The UI shows a
select with the two options.
2013-05-20 14:30:22 +01:00
amercader 6d5d0fbaae Add hover helper text to refresh and clear buttons 2013-05-20 12:09:14 +01:00
Tom Rees edfc49719b Use page_heading helper consistently with the main CKAN templates. 2013-05-17 16:12:57 +01:00
amercader d0bc52f2d8 [#34] Fix typo in warning message 2013-05-16 18:07:32 +01:00
amercader b9e2613458 [#34] Allow all authorized users for a source to clear it 2013-05-16 17:57:59 +01:00
amercader 71349e658b [#34] Expose harvest source clear button 2013-05-16 17:51:48 +01:00
amercader 7b652542e7 [#34] Fix harvest_source_clear action
Some typos in the SQL statements, and also the source needs to be
reindexed to update the status with the counts.
2013-05-16 17:33:39 +01:00
amercader 1efd7ab4cd Ignore remote orgs in CKAN harvester
If #17 progresses we can do something similar for them, although it may
be more complicated because of authorization issues.
2013-05-16 17:30:54 +01:00
kindly 1714e55110 simplify harvest_clear queries so they do not lock on big db 2013-04-30 13:59:23 +01:00
kindly a2b8ab1994 make harvest source clear not create table 2013-04-30 12:40:46 +01:00
amercader 9041f3f3ad Changes in Redis consumer to make tests work 2013-04-22 18:08:19 +01:00
amercader 70dfee1a36 Update queue tests 2013-04-22 17:56:11 +01:00
kindly dcfd201cdd [#32] redis queue support 2013-04-21 17:04:57 +01:00
kindly 0ce59a29b6 delete instead of update harvest objects when error 2013-04-12 12:32:33 +01:00
kindly 7d7657f94a make gather phase as finished if there is an error 2013-04-12 10:35:08 +01:00
kindly bd761498f0 make sure config dict is not jsonified if it contains an error 2013-04-08 18:52:36 +01:00
amercader eaebeb4e6e Merge branch 'release-v2.0' of github.com:okfn/ckanext-harvest into release-v2.0 2013-04-08 13:25:33 +01:00
amercader 5414b6c08d Merge branch '29-new-idataset-form' into release-v2.0 2013-04-08 13:23:41 +01:00
joetsoi 66ff773f99 remove previous commit import 2013-03-29 12:47:14 +00:00
joetsoi 3ac065f0f0 fix package_schema import 2013-03-29 01:17:24 +00:00
joetsoi cb8b808274 sanity check that harvest source id matches harvest dataset id
remove author_email, license_id, maintainer_email, maintainer and
author from package_dict, these were not actually necessary
2013-03-29 00:59:20 +00:00
amercader 99bd17401c Handle wrong JSON in harvest_source_extra_validator 2013-03-28 16:19:16 +00:00
kindly a9b8be8f01 harvest source index clear 2013-03-28 15:36:44 +00:00
amercader 95ebb5bbf3 [#29] Remove check_data_dict 2013-03-28 15:01:21 +00:00
amercader fbc8ecde97 [#29] Fix some imports on actions and plugin 2013-03-28 15:00:44 +00:00
kindly c754479014 #29 make new idatasets form work with harvest source form 2013-03-25 17:38:07 +00:00
joetsoi 548d3c1c2a fix validation issue on db upgrade 2013-03-25 12:02:07 +00:00
kindly b5a697ec87 Merge branch 'release-v2.0' of https://github.com/okfn/ckanext-harvest into release-v2.0 2013-03-25 11:58:31 +00:00
kindly 0b5c3c608a catch and raise gather exception, acking the message 2013-03-25 11:57:57 +00:00
amercader 438ba672e2 Merge branch 'release-v2.0' of github.com:okfn/ckanext-harvest into release-v2.0 2013-03-25 11:44:37 +00:00
kindly 845c9927a8 add harvest source clear 2013-03-25 11:39:00 +00:00
joetsoi d518b6709a [#27] fix package_list_for_source for unowned data sources 2013-03-21 15:59:22 +00:00
amercader 7bff041568 [#25] Further tweaks on helpers texts 2013-03-21 13:47:23 +00:00
John Martin 4d0dd9a4d3 [#25] Small copy tweak to confirmation dialog 2013-03-21 12:14:33 +00:00
John Martin 78bde974b9 [#25] Adds confirmation dialog to reharvest button 2013-03-21 10:56:39 +00:00
John Martin 3197162fe6 [#25] Changed 'Refresh' to 'Reharvest' on button 2013-03-21 10:36:12 +00:00
kindly a7583a7b8b Merge branch 'release-v2.0' of https://github.com/okfn/ckanext-harvest into release-v2.0 2013-03-21 02:32:11 +00:00
kindly b676fb02e1 only get out harvest items in interface and when indexing 2013-03-21 02:31:34 +00:00
amercader 15c44d9aa8 Merge branch '23-harvest-form-cleanup' into release-v2.0 2013-03-20 17:03:41 +00:00
John Martin 4ba298fe58 [#23] Make labels a little wider on harvest new form 2013-03-20 14:07:03 +00:00
amercader 02e90767f4 Fix source listing in organization page
It needed update after #515 in ckan core
2013-03-20 13:01:23 +00:00
John Martin 86355fb9db [#23] Form cleanup after core bootstrap upgrade 2013-03-20 10:44:24 +00:00
kindly 634a0bbd30 return instead of continue 2013-03-19 01:21:20 +00:00
kindly 3adf38105e readd code from old branch separating the fetch and import logic 2013-03-19 01:16:43 +00:00
amercader c2a6bd14eb Add auth function for harvest_source_show_status 2013-03-18 16:48:27 +00:00
amercader c76b7d95f3 Only count public datasets on the source status
This is more in line with what is done on the orgs/groups pages
2013-03-18 16:41:01 +00:00
amercader cb80ac784e Add logic to show private datasets to authorized users 2013-03-18 16:29:29 +00:00
amercader 341331ac53 Merge branch 'release-v2.0' of github.com:okfn/ckanext-harvest into release-v2.0 2013-03-14 17:33:21 +00:00
amercader d77f16aba9 [#21] Improve gather stage error handling
See issue for full details. Basically we don't want to catch any
exception at the queue.py level, as they prevent debugging. Harvesters
should deal with them and return a list of ids or an empty list if no
objects need to be fetched.
Also improved the debug messages.
2013-03-14 17:31:07 +00:00
John Martin b30cc54427 Fix for add harvest source button within org 2013-03-14 14:45:54 +00:00
amercader 91f18bffab Fix pagination on org sources listing 2013-03-14 11:44:38 +00:00
amercader 8cac0977aa Fix import on org sources listing 2013-03-14 11:44:22 +00:00
amercader cd6c1b56a8 [#18] Get package dict on after_delete to check type
No need for #615 in core then
2013-03-13 17:31:39 +00:00
amercader 1b11b00946 [#18] Fix wrong logic for setting the source active field 2013-03-13 13:19:43 +00:00
kindly cb5e06119e Merge branch 'release-v2.0' of https://github.com/okfn/ckanext-harvest into release-v2.0 2013-03-12 23:31:58 +00:00
kindly 06355ee6c4 Make IFacets work for harvest source related searches 2013-03-12 23:31:06 +00:00
amercader fab5b81c2c Pass context to functions handling harvest sources 2013-03-12 17:30:31 +00:00
amercader 5e50a5c9ad [#8] Update how state is handled for source objects 2013-03-12 15:35:49 +00:00
amercader 2ee3f33f51 [#18] Allow reactivation of sources
Due to #607 in CKAN core, once a source was deleted you could not
reactivate it again. As a workaround, if the source is deleted the
Delete button is not shown and the state select is, so you can set it to
'active'.
Also fixed wrong redirect after deletion.
2013-03-12 14:06:54 +00:00
amercader 23d1d5742c [#18] Update delete harvest source functionality
The harvest_source_delete logic function proxies to package delete,
which will delete the harvest source dataset. The harvest plugin then
hooks to the after_delete extension point in order to inactivate the
actual HarvestSource object and abort any pending jobs.
Also added the Delete button to the harvest source form.
2013-03-12 13:14:07 +00:00
amercader c957fdf17c Merge branch '14-template-tweaks' into release-v2.0 2013-03-08 14:49:43 +00:00
amercader ecceff48ed [#14] Use source.organization again after fix in 949bb6f 2013-03-08 14:48:49 +00:00
amercader 949bb6fe6a [#16] Add organization to source dict 2013-03-08 14:47:11 +00:00
John Martin f25ef19985 [#14] Fix for org breadcrumbs on sources 2013-03-08 12:48:11 +00:00
John Martin 2a53e4a2e4 [#14] Couple of minor template tweaks 2013-03-08 12:38:41 +00:00
joetsoi 7257258ca4 mark new harvest objects as current
When a new harvest_object for a new package was being created, it
was immediately being marked as false, as all objects were marked
as false, including the new object just created and newly marked
as current=true.

Fix so that old HarvestObjects are only marked as current=False
when updating an existing package.
2013-03-07 20:27:27 +00:00
John Martin 14e51ec587 Fix for removed snippet from ckan core 2013-03-07 11:52:59 +00:00
amercader 2ee27164c3 [#13] Remove or deprecate unused code
Mostly in controllers, dictization and plugin, either related to the old
templates pre-dataset type or old authorization.
2013-03-06 16:54:33 +00:00
amercader 6c02c87f8d [#13] Set routes to /harvest
Mostly painless as we (most of the time) were using DATASET_TYPE_NAME.
All old routes now point to the correct place in the new interface.
2013-03-06 16:33:46 +00:00
amercader eda280f266 Merge branch '12-org-source-listing' into 2.0-dataset-sources 2013-03-06 15:45:45 +00:00
amercader 889325dd9c [#12] Clean up and rename organization controller 2013-03-06 15:43:10 +00:00
amercader e9adaa7f91 [#12] Change URL for org sources list
Use "/organization/harvest_source/{id}", which will turn into
"/organization/harvest/{id}" soon
2013-03-06 15:38:38 +00:00
amercader 74633d0803 Fix error count in job stats
We want to take into account objects with errors that were created or
updated anyway (e.g. bbox errors), so we basically query for the number of
objects that have object errors.

Also add the number of gather errors to this count.
2013-03-06 13:44:04 +00:00
amercader ef2defbcf9 [#7] Refactor job report page to include all errors 2013-03-06 13:43:40 +00:00
amercader bec31a611e Fix empty job finished date 2013-03-06 13:42:35 +00:00
amercader 04710fd1c6 Revert removal of filter in job list action in 7544d5c 2013-03-06 12:19:20 +00:00
John Martin c2b552b980 [#12] Better faceting for specifically harvest sources 2013-03-06 11:38:24 +00:00
John Martin 246898049e [#12] When harvest source listing is within org, links go to edit pages 2013-03-06 11:36:24 +00:00
John Martin 9d149e4e5d [#12] Makes a harvest source admin page within org look a little nicer 2013-03-06 11:23:36 +00:00
kindly ca2df234d2 [#12] begin work on org harvest source controller 2013-03-06 04:11:31 +00:00
kindly 23aa45cc71 Merge branch '2.0-dataset-sources' into source_extra_config_validation 2013-03-06 01:10:48 +00:00
amercader d9a71f7c59 [#7] Fix wrong finish date on job listing 2013-03-05 18:56:30 +00:00
John Martin e566c96d62 [#7] Adds new harvest source button 2013-03-05 16:06:04 +00:00
John Martin 7544d5c5ef [#7] Removed faceted navigation for unneeded toggles in job reports 2013-03-05 15:23:42 +00:00
joetsoi e64c8ead0f fix print gather_errors 2013-03-05 12:49:20 +00:00
amercader 574c69fa9c Merge branch '2.0-dataset-sources' into 7-harvest-source-templates 2013-03-01 17:55:16 +00:00
amercader 182fbf054a Add XML declaration to contents if not present 2013-03-01 17:25:35 +00:00
amercader 5c17a525c1 Refresh session after each harvest stage
Otherwise, e.g., the source config got cached and you needed to restart
the consumers to refresh it.
2013-03-01 12:55:59 +00:00
amercader bd128ab58b Refresh session after each harvest stage
Otherwise, e.g., the source config got cached and you needed to restart
the consumers to refresh it.
2013-03-01 12:52:58 +00:00
amercader 3b6468b181 Merge branch '2.0-dataset-sources' of github.com:okfn/ckanext-harvest into 2.0-dataset-sources 2013-03-01 12:51:17 +00:00
joetsoi 9432368bea fix gather_stage if there is a previous job
change check on gather stage to check for changed packages since
last job instead of current harvest job's gather_start

fix attribute look up bug

fix print_job to print 0 gather_errors instead of key error
2013-02-28 19:06:21 +00:00
joetsoi ffce2c7915 Merge branch '2.0-dataset-sources' of github.com:okfn/ckanext-harvest into 2.0-dataset-sources 2013-02-28 18:11:12 +00:00
amercader 217d58d3a4 Merge branch 'source_extra_config_validation' of github.com:okfn/ckanext-harvest into source_extra_config_validation 2013-02-28 16:03:27 +00:00
amercader f28dc97f79 Fix bug in harvest job reports 2013-02-28 15:47:56 +00:00
amercader dab98112dc Fix bug in harvest job reports 2013-02-28 15:47:35 +00:00
kindly 871576f89c Merge remote-tracking branch 'remotes/origin/source_extra_config_validation' into source_extra_config_validation 2013-02-28 13:48:58 +00:00
kindly 9cef777e7b make sure config is also on top level 2013-02-28 13:46:16 +00:00
amercader e82410724a Merge branch '7-harvest-source-templates' into source_extra_config_validation 2013-02-28 12:18:09 +00:00
amercader f7cba69fe6 Merge branch '2.0-dataset-sources' into 7-harvest-source-templates 2013-02-28 12:17:47 +00:00
amercader a86d91c3f0 [#11] Make get actions side_effect_free 2013-02-28 12:17:15 +00:00
amercader fe6952ed00 Merge branch '7-harvest-source-templates' into source_extra_config_validation 2013-02-27 15:45:33 +00:00
joetsoi ba486a9482 add indexing of datasets whilst harvesting 2013-02-27 11:34:09 +00:00
John Martin d1b2b158b2 [#7] Harvest listing page and HTML/CSS cleanup
* I'm happy with /harvest_source/ now
* Also I've removed a load of unneeded CSS
* Also templates are now using core styles instead of custom ones
2013-02-27 11:14:04 +00:00
kindly e0a3eb7899 add javascript for source type 2013-02-25 18:12:47 +00:00
kindly 5b50126670 source extras field type 2013-02-25 18:07:34 +00:00
amercader efe977512b Include gather errors on job summaries and reports 2013-02-25 17:17:08 +00:00
amercader d1b71308af [#7] Minor tweaks in job pages 2013-02-25 16:15:37 +00:00
amercader c7bb897cdd [#7] Inactivate Refresh button if a new job already exists 2013-02-25 15:33:29 +00:00
amercader 57b3739dd4 [#7] Return most recent job on source status, not just finished 2013-02-25 15:32:39 +00:00
amercader 60f9360e84 [#7] Don't show job snippet in dashboard if no jobs 2013-02-25 13:11:08 +00:00
amercader 93e15dc529 [#7] Restrict access to source admin page 2013-02-25 13:10:30 +00:00
amercader 457b8d5988 [#7] 404 on last job if no jobs yet 2013-02-25 12:49:14 +00:00
amercader 34ae6be689 [#7] Fix dataset count on source page 2013-02-25 12:19:09 +00:00
amercader b3819e8df4 [#7] Use dict instead of domain object in templates 2013-02-25 12:18:30 +00:00
amercader 49a1c467cf Merge branch '7-harvest-source-templates' of github.com:okfn/ckanext-harvest into 7-harvest-source-templates 2013-02-25 12:04:34 +00:00
amercader e1d73c82f0 [#7] Make new routes more custom
In case we change the root name
2013-02-25 12:03:34 +00:00
kindly ebe246fe99 make report emit added so shows up on front end 2013-02-22 17:32:33 +00:00
amercader 57d6b3de74 [#7] Fix auth check on new source form
Auth check failed because source was undefined
2013-02-22 17:32:05 +00:00
kindly 52c0a5cbd6 Merge branch '2.0-dataset-sources' into 7-harvest-source-templates 2013-02-22 17:26:34 +00:00