Commit Graph

53 Commits

Author SHA1 Message Date
Stefan Oderbolz c52085006a [#61] Truly ignore harvest sources
The current implementation returns False when a harvest source is being harvested. This leads to an error on the harvesting job, which in turn tends to confuse users who are not aware of this special case. This fix ensures that harvest sources are still ignored, but silently.
2013-10-23 07:40:55 +02:00
amercader c18d9dc3af [#71] CKAN harvester: Add datasets to source organization
If the harvest source belongs to an organization, new datasets should be added
to it. This is already the case in the spatial harvesters.

The remote orgs logic has been kept, with the difference that if for
some reason the remote org cannot be assigned, the local one is used.

If the source does not have an organization, none is added.
2013-10-22 16:24:43 +01:00
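One possible reading of the fallback described in this commit, as a minimal sketch. The function name `choose_owner_org` and the `can_assign_remote` callback are hypothetical and are not taken from the harvester code; the sketch only assumes that the remote organization is preferred when it can be assigned and that the source's own organization (possibly none) is used otherwise.

```python
def choose_owner_org(source_org, remote_org, can_assign_remote):
    """Pick the organization for a newly harvested dataset.

    source_org: id of the harvest source's organization, or None (assumed shape).
    remote_org: id of the organization on the remote site, or None.
    can_assign_remote: hypothetical callable returning True if the remote
        organization can be matched or created locally.
    """
    if remote_org is not None and can_assign_remote(remote_org):
        return remote_org
    # Fall back to the source's organization; this is None when the
    # source does not belong to one, so no organization is added.
    return source_org


known_orgs = {"org-a"}
picked = choose_owner_org("local-org", "org-a", lambda o: o in known_orgs)
fallback = choose_owner_org("local-org", "org-b", lambda o: o in known_orgs)
nothing = choose_owner_org(None, None, lambda o: o in known_orgs)
```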
amercader bd62b62764 Merge branch 'metaodi-add-harvesting-of-organizations' 2013-10-15 17:50:04 +01:00
Stefan Oderbolz 8b5d70c6fe Only try to create/match an organization if there is a remote_org 2013-10-11 18:08:32 +02:00
amercader 0f5624822c Use remote name if present when creating datasets on CKAN harvester 2013-10-11 16:50:25 +01:00
Stefan Oderbolz dd1acd0c6b Use remote_orgs for organizations 2013-10-07 11:22:19 +02:00
Stefan Oderbolz d50eb6fca8 Harvesting of remote organisations similar to remote groups 2013-10-04 16:37:52 +02:00
Vitor Baptista f028375ad3 [#62] Use current name when updating package, if the user hasn't sent a new one
It's hard for someone outside CKAN to make sure they're sending it in the format
we expect. And they'll also have to keep track of our name format, to keep in
sync whenever we change.

To fix this, we simply do what we already do when creating packages: use a
default name. In this case, the current one.
2013-08-18 12:08:30 -03:00
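The default-name behaviour described in this commit could look roughly like the sketch below. The helper `apply_default_name` and the plain-dict package shape are assumptions made for illustration, not the actual code path.

```python
def apply_default_name(update_dict, current_package):
    """Return a copy of update_dict that always carries a name.

    If the caller did not send a name, reuse the package's current one
    instead of rejecting the update.
    """
    data = dict(update_dict)  # do not mutate the caller's dict
    if not data.get("name"):
        data["name"] = current_package["name"]
    return data


current = {"name": "air-quality-2013", "title": "Air quality"}
kept = apply_default_name({"title": "Air quality (updated)"}, current)
renamed = apply_default_name({"name": "air-quality"}, current)
```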
amercader 01ca5c0dfd [#61] Ignore harvest sources on the CKAN harvester 2013-08-15 14:38:33 +01:00
amercader b25fffda93 [#36] Fix bug on API version checking 2013-08-15 14:37:55 +01:00
amercader 39ad78d90a [#59] Ignore auth in the CKAN harvester 2013-08-15 14:37:12 +01:00
amercader 584c340583 Merge branch '42-remove-non-string-extras' 2013-06-03 10:33:59 +01:00
Sean Hammond 01df3a1db4 [#42] Dump non-string extras with json
Convert any non-string extra values to strings using json.dumps(),
instead of just deleting them.
2013-05-31 20:35:06 +02:00
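A minimal sketch of the conversion this commit describes, assuming extras are held in a plain dict. The function name `stringify_extras` is hypothetical, and the sketch uses Python 3's `str`, whereas the code of that period would have been Python 2.

```python
import json


def stringify_extras(extras):
    """Return a copy of extras where every non-string value is JSON-encoded."""
    result = {}
    for key, value in extras.items():
        if isinstance(value, str):
            result[key] = value
        else:
            # Keep the information instead of dropping the extra.
            result[key] = json.dumps(value)
    return result


extras = {"licence": "cc-by", "bbox": [1, 2, 3, 4], "meta": {"a": 1}}
cleaned = stringify_extras(extras)
```

Every value in `cleaned` is a string, so a consumer that needs the original structure can recover it with `json.loads`.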
amercader 3a31db59b6 [#36] Move validation code to validate_config
This ensures it is checked whenever the source is edited or created.
2013-05-31 17:23:40 +01:00
amercader a6a0196a4e Merge branch 'api-version-fix' of git://github.com/fraunhoferfokus/ckanext-harvest into fraunhoferfokus-api-version-fix 2013-05-31 17:15:43 +01:00
Sean Hammond 85a013f2c9 [#42] Remove non-string extras from packages
Remove extras whose values are not strings (e.g. dicts, lists..) from
packages before attempting to create or update the packages on the
target site.

In CKAN 1 it was possible for the values of extras to be other types,
but in CKAN 2 they must be strings, so when harvesting from a CKAN 1 site
into a CKAN 2 site SQLAlchemy would crash when trying to create packages
with non-string extras.

The fix in this commit is to simply remove any non-string extras from
the harvested package. (Alternatively, we could try to convert them to a
string using JSON.)

Fixes #42.
2013-05-31 15:43:42 +02:00
amercader 361abcfc07 [#17] Fix bug with remote groups handling
If neither 'only_local' nor 'create' is used, the remote groups property
needs to be removed; otherwise it causes an exception when the group is
not found.
2013-05-30 18:06:15 +01:00
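The fix can be pictured with the sketch below. It assumes the remote groups live under a `groups` key of the package dict and that the option value is passed in as `remote_groups_mode`; both names are illustrative guesses rather than confirmed details of the harvester.

```python
def prepare_groups(package, remote_groups_mode):
    """Drop remote groups unless a remote groups mode is configured.

    remote_groups_mode: 'only_local', 'create', or None (assumed values).
    """
    data = dict(package)
    if remote_groups_mode not in ("only_local", "create"):
        # Without a mode, unknown remote groups would cause an
        # exception when the group is not found, so remove them.
        data.pop("groups", None)
    return data


pkg = {"name": "example", "groups": ["transport"]}
without_mode = prepare_groups(pkg, None)
with_mode = prepare_groups(pkg, "create")
```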
Konrad Reiche 87cae31c75 Fix api_version check in the group importer code
I have forgotten to update one check for the api_version 1 in the code
responsible for the remote group import feature. This commit fixes that.

Signed-off-by: Konrad Reiche <konrad.reiche@fokus.fraunhofer.de>
2013-05-27 13:36:56 +02:00
Konrad Reiche c858b9fe9f Add exception handling for the API version parsing
I have added try-except clauses in order to prevent the process from
crashing if a non-parsable integer is used for the api_version option.

Signed-off-by: Konrad Reiche <konrad.reiche@fokus.fraunhofer.de>
2013-05-27 13:12:05 +02:00
Konrad Reiche 05094090af Change type of the API version to integer
The CKAN logic uses integers when dealing with the API version, e.g.
when checking which API version is in use. Currently, the harvester
uses strings to identify the API version. Instead of dealing with
type conversion, the harvester could use integers directly.

This commit fixes okfn/ckanext-harvest#36. When the API version is
parsed from the configuration, it is passed through the int() function.
This way the harvesting will still work even if a harvest source was
configured with a string API version, which makes this commit backward
compatible.

Signed-off-by: Konrad Reiche <konrad.reiche@fokus.fraunhofer.de>
2013-05-27 12:51:48 +02:00
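The int() conversion and the exception handling described in these commits can be sketched together as follows. The helper name `parse_api_version`, the default of 2, and the choice to re-raise a `ValueError` with a clearer message are assumptions for illustration; the actual harvester may report the problem differently.

```python
def parse_api_version(config, default=2):
    """Read api_version from a source config dict and return it as an int.

    Accepts both integers and numeric strings, so sources configured
    with a string API version keep working.
    """
    raw = config.get("api_version", default)
    try:
        return int(raw)
    except (TypeError, ValueError):
        # Fail with a readable message instead of an unhandled crash.
        raise ValueError("api_version must be an integer, got %r" % (raw,))


from_string = parse_api_version({"api_version": "1"})
from_int = parse_api_version({"api_version": 2})
from_default = parse_api_version({})
```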
amercader 3d2867ca04 [#17] Remove ckanclient dependency as it is not used 2013-05-24 17:55:37 +01:00
amercader f1d11c1307 [#17] Import remote groups in CKAN harvester
This is a cleaner commit of the great work done by @platzhirsch
implementing remote groups import on the CKAN harvester.
2013-05-24 16:55:05 +01:00
amercader 1efd7ab4cd Ignore remote orgs in CKAN harvester
If #17 progresses we can do something similar for them, although it may
be more complicated because of authorization issues.
2013-05-16 17:30:54 +01:00
kindly c754479014 #29 make new datasets form work with harvest source form 2013-03-25 17:38:07 +00:00
joetsoi 7257258ca4 mark new harvest objects as current
When a harvest_object for a new package was created, it was
immediately marked as current=False, because all objects were marked
as current=False, including the object that had just been created and
marked as current=True.

Fix so that old HarvestObjects are only marked as current=False
when updating an existing package.
2013-03-07 20:27:27 +00:00
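A simplified model of the corrected behaviour, using a stand-in dataclass rather than the real HarvestObject model. The field names and the `mark_current` helper are assumptions chosen to illustrate the ordering of the two steps.

```python
from dataclasses import dataclass


@dataclass
class HarvestObject:
    # Stand-in for the real model; only the fields needed here.
    guid: str
    package_id: str
    current: bool = False


def mark_current(new_obj, existing_objects, is_update):
    """Flag new_obj as current, retiring older objects only on update."""
    if is_update:
        for obj in existing_objects:
            if obj is not new_obj and obj.package_id == new_obj.package_id:
                obj.current = False
    # Set last, so the new object is never reset by the loop above.
    new_obj.current = True


old = HarvestObject("g1", "pkg-1", current=True)
new = HarvestObject("g2", "pkg-1")
mark_current(new, [old, new], is_update=True)

fresh = HarvestObject("g3", "pkg-2")
mark_current(fresh, [old, new, fresh], is_update=False)
```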
joetsoi 9432368bea fix gather_stage if there is a previous job
change check on gather stage to check for changed packages since
last job instead of current harvest job's gather_start

fix attribute look up bug

fix print_job to print 0 gather_errors instead of key error
2013-02-28 19:06:21 +00:00
joetsoi ba486a9482 add indexing of datasets whilst harvesting 2013-02-27 11:34:09 +00:00
joetsoi f97e3b4c6c add return True to import stage of ckanharvester
Was causing queue.py to report that the import had errored.
2013-02-22 10:13:36 +00:00
amercader 177349fd76 Update HarvesterBase
This is a convenience class that other harvesters can extend. Updates
include a cleanup of old functions and porting of enhancements from the
spatial harvesters.
2013-02-12 16:10:13 +00:00
amercader 871eae94b6 [ckan harvester] Fix bug on force all check 2012-03-15 11:31:12 +00:00
amercader f210455aef [ckan harvester] Replace title on default extras 2012-03-13 12:38:14 +00:00
amercader e0bef2ef9c [base] Minor fix for harvesters without config 2012-03-12 14:46:28 +00:00
amercader 50537a6738 Merge branch 'master' into enh-1726-harvesting-model-update 2012-02-15 12:01:15 +00:00
amercader 9ed152cbea [ckan harvester] Add support for forcing gathering of all remote packages 2012-02-03 17:54:34 +00:00
amercader 479750da09 [#1726][base harvester] Set current field when importing 2012-02-02 13:18:43 +00:00
amercader eb646b3385 [ckan harvester] Add support for defining default extras 2012-01-10 17:07:19 +00:00
amercader ae51093213 [ckan harvester] Ignore __junk field, was causing imports to fail 2012-01-10 14:46:12 +00:00
Adrià Mercader da469ab08e [base harvester] Custom tag munge function. TODO: check with flexible tags 2011-11-23 11:05:52 +00:00
Adrià Mercader cfaba6e1e8 [ckan harvester] Add support for sending an API key 2011-11-21 17:29:10 +00:00
Adrià Mercader 0ab5c53b47 [ckan harvester] Fix typo 2011-11-18 17:53:01 +00:00
Adrià Mercader 994590531e [ckan harvester] Support for creating read-only packages 2011-11-18 14:30:10 +00:00
Adrià Mercader c939d90dbb [ckan harvester] Support for defining a custom user to do the harvesting 2011-11-18 14:12:30 +00:00
Adrià Mercader 2018d9e513 [ckan harvester] Support for default tags and groups 2011-11-18 13:20:41 +00:00
Adrià Mercader c04d80e27e Use get_action function instead of directly calling the action functions 2011-10-26 17:26:18 +01:00
Adrià Mercader 7927329536 Make harvesters work with latest ckan release 2011-07-29 11:31:03 +01:00
Adrià Mercader cabbb4922d Use API version defined in config if present 2011-07-18 17:35:32 +01:00
Adrià Mercader c867660e7d Add docs to base harvester functions 2011-07-18 17:35:03 +01:00
Adrià Mercader 54de6759fe Fix bug with empty config 2011-06-28 15:04:40 +01:00
Adrià Mercader c80e68a12f Ensure the correct configuration is used on each stage 2011-06-14 15:59:13 +01:00
Adrià Mercader 3125bb1514 Add a check to ensure sources with no packages are reharvested 2011-06-14 12:59:48 +01:00