Commit Graph

810 Commits

Author SHA1 Message Date
John Martin 86dcd933ea Merged master 2013-08-15 18:47:16 +01:00
John Martin 712e150b52 [#58] Fix to make merge nice 2013-08-15 18:43:46 +01:00
John Martin 575df637b4 [#58] Fixes to make harvest templates to work with both CKAN 2.0 and 2.1 2013-08-15 16:45:02 +01:00
amercader 05e6362c38 Merge branch 'fix-jinja-status-exception' of git://github.com/metaodi/ckanext-harvest into metaodi-fix-jinja-status-exception 2013-08-15 14:39:20 +01:00
amercader 01ca5c0dfd [#61] Ignore harvest sources on the CKAN harvester 2013-08-15 14:38:33 +01:00
amercader b25fffda93 [#36] Fix bug on API version checking 2013-08-15 14:37:55 +01:00
amercader 39ad78d90a [#59] Ignore auth in the CKAN harvester 2013-08-15 14:37:12 +01:00
Stefan Oderbolz f26baf6c09 Hide both the label and the number of datasets when 'status' is not available 2013-08-15 13:25:16 +02:00
amercader 1c36b33aaf [#59] Ignore auth when using site_user 2013-08-14 12:28:27 +01:00
amercader ffea49ca62 [#56] Update parameters on source create command
Add missing title and owner_org fields, remove deprecated user_id and
publisher_id
2013-08-14 11:54:51 +01:00
amercader 3494727d3f [#56] Increase max params number 2013-08-14 11:43:32 +01:00
amercader 8e33262026 [#56] Fix syntax error and wrong type 2013-08-14 11:31:23 +01:00
Stefan Oderbolz 4dfd091aec Make the /harvest page more robust if source.status is not set
This prevents exceptions from appearing in the log from Jinja:
  [error] [client 1.2.3.4] Error - <class 'jinja2.exceptions.UndefinedError'>: 'dict object' has no attribute 'status'
2013-08-14 11:52:11 +02:00
Stefan Oderbolz 7ae9d6e208 Made print method more robust against KeyErrors
This is especially needed if you create a new harvest source which does not have all the optional arguments. Previously this led to a KeyError after the creation of the source. Now it simply outputs 'None'.
2013-08-05 23:50:30 +02:00
Stefan Oderbolz 1249564be5 Add additional name argument when creating new harvest source 2013-08-05 23:46:21 +02:00
Stefan Oderbolz ade5f83e38 Change key of data_dict from 'type' to 'source_type' 2013-08-05 23:07:25 +02:00
amercader cb745c3c3e Avoid importing unnecessary functions from the harvest logic 2013-08-05 18:39:44 +01:00
Vitor Baptista 70e53a7833 Fix bug where source was being treated as an object, when it's a dict 2013-07-29 07:06:58 -03:00
amercader cc3f3d3426 [#50] Fix objects deletion on gather exceptions 2013-07-05 13:29:11 +01:00
amercader e2696b98bb [#50] Save all dates as UTC in the database
At some point we may want to transform these to local time at the
dictization level. We will need a library like dateutil to handle it
properly though.
2013-07-04 14:59:27 +01:00
kindly c2283e3fdb only migrate harvest sources which are active 2013-06-28 02:32:45 +01:00
kindly a42991b8c9 fix so that non sysadmins can edit harvest sources of organizations they
are admins or editors of.
2013-06-27 12:16:11 +01:00
kindly 6540726c47 use correct limit for paging harvest listing 2013-06-26 11:14:38 +01:00
amercader 584c340583 Merge branch '42-remove-non-string-extras' 2013-06-03 10:33:59 +01:00
Sean Hammond 01df3a1db4 [#42] Dump non-string extras with json
Convert any non-string extra values to strings using json.dumps(),
instead of just deleting them.
2013-05-31 20:35:06 +02:00
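Note on 01df3a1db4: a minimal sketch of the idea in the message above, not the extension's actual code. It assumes the extras are available as a plain dict of key/value pairs and uses Python 3's str; the helper name stringify_extras is hypothetical.

```python
import json


def stringify_extras(extras):
    """Return a copy of `extras` where every non-string value is JSON-encoded.

    `extras` is assumed to be a plain dict; the real package dict may
    store extras in a different shape.
    """
    result = {}
    for key, value in extras.items():
        if isinstance(value, str):
            # Strings are kept untouched.
            result[key] = value
        else:
            # Dicts, lists, numbers, etc. are serialized instead of dropped.
            result[key] = json.dumps(value)
    return result


package_extras = {"license": "cc-by", "bbox": [1, 2, 3, 4], "meta": {"a": 1}}
converted = stringify_extras(package_extras)
```

Compared with removing such extras (85a013f2c9), this keeps the information, and a consumer can recover the original value with json.loads().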
amercader 3a31db59b6 [#36] Move validation code to validate_config
This ensures it is checked whenever the source is edited or created.
2013-05-31 17:23:40 +01:00
amercader a6a0196a4e Merge branch 'api-version-fix' of git://github.com/fraunhoferfokus/ckanext-harvest into fraunhoferfokus-api-version-fix 2013-05-31 17:15:43 +01:00
Sean Hammond 85a013f2c9 [#42] Remove non-string extras from packages
Remove extras whose values are not strings (e.g. dicts, lists..) from
packages before attempting to create or update the packages on the
target site.

In CKAN 1 it was possible for the values of extras to be other types,
but in CKAN 2 they must be strings, so when harvesting from a CKAN 1 site
into a CKAN 2 site SQLAlchemy would crash when trying to create packages
with non-string extras.

The fix in this commit is to simply remove any non-string extras from
the harvested package. (Alternatively, we could try to convert them to a
string using JSON.)

Fixes #42.
2013-05-31 15:43:42 +02:00
amercader 361abcfc07 [#17] Fix bug with remote groups handling
If neither 'only_local' nor 'create' is used, the remote groups property
needs to be removed, otherwise it causes an exception when the group is
not found.
2013-05-30 18:06:15 +01:00
Konrad Reiche 87cae31c75 Fix api_version check in the group importer code
I have forgotten to update one check for the api_version 1 in the code
responsible for the remote group import feature. This commit fixes that.

Signed-off-by: Konrad Reiche <konrad.reiche@fokus.fraunhofer.de>
2013-05-27 13:36:56 +02:00
Konrad Reiche c858b9fe9f Add exception handling for the API version parsing
I have added try-except clauses in order to prevent the process from
crashing if a non-parsable integer is used for the api_version option.

Signed-off-by: Konrad Reiche <konrad.reiche@fokus.fraunhofer.de>
2013-05-27 13:12:05 +02:00
Konrad Reiche 05094090af Change type of the API version to integer
The CKAN logic uses integers when dealing with the API version, e.g.
making checks which API version is in use. Currently, the harvester
uses strings to identify the API version. Instead of dealing with
type conversion the harvester could use integers directly.

This commit fixes okfn/ckanext-harvest#36. When the API version is
parsed from the configuration it is passed through the int() function.
This way the harvesting will still work even if a harvest source was
configured with a string API version which makes this commit backward
compatible.

Signed-off-by: Konrad Reiche <konrad.reiche@fokus.fraunhofer.de>
2013-05-27 12:51:48 +02:00
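Note on 05094090af: a rough sketch of passing the configured API version through int(), as the message above describes. The function name parse_api_version, the default of 2 and the error text are assumptions for illustration, not taken from the harvester code.

```python
def parse_api_version(config, default=2):
    """Read 'api_version' from a source config dict and return it as an int.

    Accepts both integer and string values, so sources configured with a
    string such as "1" keep working. The default value is an assumption.
    """
    raw = config.get("api_version", default)
    try:
        return int(raw)
    except (TypeError, ValueError):
        # A non-parsable value is reported instead of crashing later on.
        raise ValueError("api_version must be an integer, got %r" % (raw,))


version_from_string = parse_api_version({"api_version": "1"})
version_from_int = parse_api_version({"api_version": 2})
```

With the value normalized once at parse time, later checks can compare against integers directly.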
amercader ff7287d4b4 [#30] Remove lxml dependency 2013-05-24 18:12:02 +01:00
amercader 3d2867ca04 [#17] Remove ckanclient dependency as it is not used 2013-05-24 17:55:37 +01:00
amercader f1d11c1307 [#17] Import remote groups in CKAN harvester
This is a cleaner commit of the great work done by @platzhirsch
implementing remote groups import on the CKAN harvester.
2013-05-24 16:55:05 +01:00
amercader 1792180e4f Better harvest source dataset migration
Current implementation only checked for the first source to exist and
didn't allow to rerun the migration for other sources if there was an
error. With the new one, all non existing sources are migrated each
time.
2013-05-24 14:49:55 +01:00
amercader 1d54edfdaa Fix bug in source datasets migration
Wrong dataset type was causing the default package schema to be used,
thus failing when providing an id.
2013-05-24 14:25:05 +01:00
amercader 751409ab7d [#34] Integrate clear command with delete source
When deleting a source, if clear_source equals true in the context,
harvest_source_clear will be called. Default is false. The UI shows a
select with the two options.
2013-05-20 14:30:22 +01:00
amercader 6d5d0fbaae Add hover helper text to refresh and clear buttons 2013-05-20 12:09:14 +01:00
Tom Rees edfc49719b Use page_heading helper consistently with the main CKAN templates. 2013-05-17 16:12:57 +01:00
amercader d0bc52f2d8 [#34] Fix typo in warning message 2013-05-16 18:07:32 +01:00
amercader b9e2613458 [#34] Allow all authorized users for a source to clear it 2013-05-16 17:57:59 +01:00
amercader 71349e658b [#34] Expose harvest source clear button 2013-05-16 17:51:48 +01:00
amercader 7b652542e7 [#34] Fix harvest_source_clear action
Some typos in the SQL statements, and also the source needs to be
reindexed to update the status with the counts.
2013-05-16 17:33:39 +01:00
amercader 1efd7ab4cd Ignore remote orgs in CKAN harvester
If #17 progresses we can do something similar for them, although it may
be more complicated because of authorization issues.
2013-05-16 17:30:54 +01:00
kindly 1714e55110 simplify harvest_clear queries so they do not lock on big db 2013-04-30 13:59:23 +01:00
kindly a2b8ab1994 make harvest source clear not create table 2013-04-30 12:40:46 +01:00
amercader 9041f3f3ad Changes in Redis consumer to make tests work 2013-04-22 18:08:19 +01:00
amercader 70dfee1a36 Update queue tests 2013-04-22 17:56:11 +01:00
kindly dcfd201cdd [#32] redis queue support 2013-04-21 17:04:57 +01:00
kindly 0ce59a29b6 delete instead of update harvest objects when error 2013-04-12 12:32:33 +01:00
kindly 7d7657f94a make gather phase as finished if there is an error 2013-04-12 10:35:08 +01:00
kindly bd761498f0 make sure config dict is not jsonified if it contains an error 2013-04-08 18:52:36 +01:00
amercader eaebeb4e6e Merge branch 'release-v2.0' of github.com:okfn/ckanext-harvest into release-v2.0 2013-04-08 13:25:33 +01:00
amercader 5414b6c08d Merge branch '29-new-idataset-form' into release-v2.0 2013-04-08 13:23:41 +01:00
joetsoi 66ff773f99 remove previous commit import 2013-03-29 12:47:14 +00:00
joetsoi 3ac065f0f0 fix package_schema import 2013-03-29 01:17:24 +00:00
joetsoi cb8b808274 sanity check that harvest source id matches harvest dataset id
remove author_email, license_id, maintainer_email, maintainer and
 author from package_dict, these were not actually necessary
2013-03-29 00:59:20 +00:00
amercader 99bd17401c Handle wrong JSON in harvest_source_extra_validator 2013-03-28 16:19:16 +00:00
kindly a9b8be8f01 harvest source index clear 2013-03-28 15:36:44 +00:00
amercader 95ebb5bbf3 [#29] Remove check_data_dict 2013-03-28 15:01:21 +00:00
amercader fbc8ecde97 [#29] Fix some imports on actions and plugin 2013-03-28 15:00:44 +00:00
kindly c754479014 #29 make new idatasets form work with harvest source form 2013-03-25 17:38:07 +00:00
joetsoi 548d3c1c2a fix validation issue on db upgrade 2013-03-25 12:02:07 +00:00
kindly b5a697ec87 Merge branch 'release-v2.0' of https://github.com/okfn/ckanext-harvest into release-v2.0 2013-03-25 11:58:31 +00:00
kindly 0b5c3c608a catch and raise gather exception, acking the message 2013-03-25 11:57:57 +00:00
amercader 438ba672e2 Merge branch 'release-v2.0' of github.com:okfn/ckanext-harvest into release-v2.0 2013-03-25 11:44:37 +00:00
kindly 845c9927a8 add harvest source clear 2013-03-25 11:39:00 +00:00
joetsoi d518b6709a [#27] fix package_list_for_source for unowned data sources 2013-03-21 15:59:22 +00:00
amercader 7bff041568 [#25] Further tweaks on helpers texts 2013-03-21 13:47:23 +00:00
John Martin 4d0dd9a4d3 [#25] Small copy tweak to confirmation dialog 2013-03-21 12:14:33 +00:00
John Martin 78bde974b9 [#25] Adds confirmation dialog to reharvest button 2013-03-21 10:56:39 +00:00
John Martin 3197162fe6 [#25] Changed 'Refresh' to 'Reharvest' on button 2013-03-21 10:36:12 +00:00
kindly a7583a7b8b Merge branch 'release-v2.0' of https://github.com/okfn/ckanext-harvest into release-v2.0 2013-03-21 02:32:11 +00:00
kindly b676fb02e1 only get out harvest items in interface and when indexing 2013-03-21 02:31:34 +00:00
amercader 15c44d9aa8 Merge branch '23-harvest-form-cleanup' into release-v2.0 2013-03-20 17:03:41 +00:00
John Martin 4ba298fe58 [#23] Make labels a little wider on harvest new form 2013-03-20 14:07:03 +00:00
amercader 02e90767f4 Fix source listing in organization page
It needed update after #515 in ckan core
2013-03-20 13:01:23 +00:00
John Martin 86355fb9db [#23] Form cleanup after core bootstrap upgrade 2013-03-20 10:44:24 +00:00
kindly 634a0bbd30 return instead of continue 2013-03-19 01:21:20 +00:00
kindly 3adf38105e readd code from old branch separating the fetch and import logic 2013-03-19 01:16:43 +00:00
amercader c2a6bd14eb Add auth function for harvest_source_show_status 2013-03-18 16:48:27 +00:00
amercader c76b7d95f3 Only count public datasets on the source status
This is more in line with what is done on the orgs/groups pages
2013-03-18 16:41:01 +00:00
amercader cb80ac784e Add logic to show private datasets to authorized users 2013-03-18 16:29:29 +00:00
amercader 341331ac53 Merge branch 'release-v2.0' of github.com:okfn/ckanext-harvest into release-v2.0 2013-03-14 17:33:21 +00:00
amercader d77f16aba9 [#21] Improve gather stage error handling
See issue for full details. Basically we don't want to catch any
exception at the queue.py level, as they prevent debugging. Harvesters
should deal with them and return a list of ids or an empty list if no
objects need to be fetched.
Also improved the debug messages.
2013-03-14 17:31:07 +00:00
John Martin b30cc54427 Fix for add harvest source button within org 2013-03-14 14:45:54 +00:00
amercader 91f18bffab Fix pagination on org sources listing 2013-03-14 11:44:38 +00:00
amercader 8cac0977aa Fix import on org sources listing 2013-03-14 11:44:22 +00:00
amercader cd6c1b56a8 [#18] Get package dict on after_delete to check type
No need for #615 in core then
2013-03-13 17:31:39 +00:00
amercader 1b11b00946 [#18] Fix wrong logic for setting the source active field 2013-03-13 13:19:43 +00:00
kindly cb5e06119e Merge branch 'release-v2.0' of https://github.com/okfn/ckanext-harvest into release-v2.0 2013-03-12 23:31:58 +00:00
kindly 06355ee6c4 Make IFacets work for harvest source related searches 2013-03-12 23:31:06 +00:00
amercader fab5b81c2c Pass context to functions handling harvest sources 2013-03-12 17:30:31 +00:00
amercader 5e50a5c9ad [#8] Update how state is handled for source objects 2013-03-12 15:35:49 +00:00
amercader 2ee3f33f51 [#18] Allow reactivation of sources
Due to #607 in CKAN core, once a source was deleted you could not
reactivate it again. As a workaround, if the source is deleted the
Delete button is not shown and the state select is, so you can set it to
'active'.
Also fixed wrong redirect after deletion.
2013-03-12 14:06:54 +00:00
amercader 23d1d5742c [#18] Update delete harvest source functionality
The harvest_source_delete logic function proxies to package delete,
which will delete the harvest source dataset. The harvest plugin then
hooks to the after_delete extension point in order to inactivate the
actual HarvestSource object and abort any pending jobs.
Also added the Delete button to the harvest source form.
2013-03-12 13:14:07 +00:00
amercader c957fdf17c Merge branch '14-template-tweaks' into release-v2.0 2013-03-08 14:49:43 +00:00
amercader ecceff48ed [#14] Use source.organization again after fix in 949bb6f 2013-03-08 14:48:49 +00:00
amercader 949bb6fe6a [#16] Add organization to source dict 2013-03-08 14:47:11 +00:00
John Martin f25ef19985 [#14] Fix for org breadcrumbs on sources 2013-03-08 12:48:11 +00:00
John Martin 2a53e4a2e4 [#14] Couple of minor template tweaks 2013-03-08 12:38:41 +00:00
joetsoi 7257258ca4 mark new harvest objects as current
When a new harvest_object for a new package was being created, it
was immediately being marked as false, as all objects were marked
as false, including the new object just created and newly marked
as current=true.

Fix so that old HarvestObjects are only marked as current=False
when updating an existing package.
2013-03-07 20:27:27 +00:00
John Martin 14e51ec587 Fix for removed snippet from ckan core 2013-03-07 11:52:59 +00:00
amercader 2ee27164c3 [#13] Remove or deprecate unused code
Mostly in controllers, dictization and plugin, either related to the old
templates pre-dataset type or old authorization.
2013-03-06 16:54:33 +00:00
amercader 6c02c87f8d [#13] Set routes to /harvest
Mostly painless as we (most of the time) were using DATASET_TYPE_NAME.
All old routes now point to the correct place in the new interface.
2013-03-06 16:33:46 +00:00
amercader eda280f266 Merge branch '12-org-source-listing' into 2.0-dataset-sources 2013-03-06 15:45:45 +00:00
amercader 889325dd9c [#12] Clean up and rename organization controller 2013-03-06 15:43:10 +00:00
amercader e9adaa7f91 [#12] Change URL for org sources list
Use "/organization/harvest_source/{id}", which will turn into
"/organization/harvest/{id}" soon
2013-03-06 15:38:38 +00:00
amercader 74633d0803 Fix error count in job stats
We want to take into account objects with errors that were created or
updated anyway (eg bbox errors), so we basically query for the number of
objects that have object errors.

Also add the number of gather errors to this count.
2013-03-06 13:44:04 +00:00
amercader ef2defbcf9 [#7] Refactor job report page to include all errors 2013-03-06 13:43:40 +00:00
amercader bec31a611e Fix empty job finished date 2013-03-06 13:42:35 +00:00
amercader 04710fd1c6 Revert removal of filter in job list action in 7544d5c 2013-03-06 12:19:20 +00:00
John Martin c2b552b980 [#12] Better faceting for specifically harvest sources 2013-03-06 11:38:24 +00:00
John Martin 246898049e [#12] When harvest source listing is within org links go to edit pages 2013-03-06 11:36:24 +00:00
John Martin 9d149e4e5d [#12] Makes a harvest source admin page within org look a little nicer 2013-03-06 11:23:36 +00:00
kindly ca2df234d2 [#12] begin work on org harvest source controller 2013-03-06 04:11:31 +00:00
kindly 23aa45cc71 Merge branch '2.0-dataset-sources' into source_extra_config_validation 2013-03-06 01:10:48 +00:00
amercader d9a71f7c59 [#7] Fix wrong finish date on job listing 2013-03-05 18:56:30 +00:00
John Martin e566c96d62 [#7] Adds new harvest source button 2013-03-05 16:06:04 +00:00
John Martin 7544d5c5ef [#7] Removed faceted navigation for unneeded toggles in job reports 2013-03-05 15:23:42 +00:00
joetsoi e64c8ead0f fix print gather_errors 2013-03-05 12:49:20 +00:00
amercader 574c69fa9c Merge branch '2.0-dataset-sources' into 7-harvest-source-templates 2013-03-01 17:55:16 +00:00
amercader 182fbf054a Add XML declaration to contents if not present 2013-03-01 17:25:35 +00:00
amercader 5c17a525c1 Refresh session after each harvest stage
Otherwise e.g. the source config got cached and you needed to restart
the consumers to refresh it.
2013-03-01 12:55:59 +00:00
amercader bd128ab58b Refresh session after each harvest stage
Otherwise e.g. the source config got cached and you needed to restart
the consumers to refresh it.
2013-03-01 12:52:58 +00:00
amercader 3b6468b181 Merge branch '2.0-dataset-sources' of github.com:okfn/ckanext-harvest into 2.0-dataset-sources 2013-03-01 12:51:17 +00:00
joetsoi 9432368bea fix gather_stage if there is a previous job
change check on gather stage to check for changed packages since
last job instead of current harvest job's gather_start

fix attribute look up bug

fix print_job to print 0 gather_errors instead of key error
2013-02-28 19:06:21 +00:00
joetsoi ffce2c7915 Merge branch '2.0-dataset-sources' of github.com:okfn/ckanext-harvest into 2.0-dataset-sources 2013-02-28 18:11:12 +00:00
amercader 217d58d3a4 Merge branch 'source_extra_config_validation' of github.com:okfn/ckanext-harvest into source_extra_config_validation 2013-02-28 16:03:27 +00:00
amercader f28dc97f79 Fix bug in harvest job reports 2013-02-28 15:47:56 +00:00
amercader dab98112dc Fix bug in harvest job reports 2013-02-28 15:47:35 +00:00
kindly 871576f89c Merge remote-tracking branch 'remotes/origin/source_extra_config_validation' into source_extra_config_validation 2013-02-28 13:48:58 +00:00
kindly 9cef777e7b make sure config is also on top level 2013-02-28 13:46:16 +00:00
amercader e82410724a Merge branch '7-harvest-source-templates' into source_extra_config_validation 2013-02-28 12:18:09 +00:00
amercader f7cba69fe6 Merge branch '2.0-dataset-sources' into 7-harvest-source-templates 2013-02-28 12:17:47 +00:00
amercader a86d91c3f0 [#11] Make get actions side_effect_free 2013-02-28 12:17:15 +00:00
amercader fe6952ed00 Merge branch '7-harvest-source-templates' into source_extra_config_validation 2013-02-27 15:45:33 +00:00
joetsoi ba486a9482 add indexing of datasets whilst harvesting 2013-02-27 11:34:09 +00:00
John Martin d1b2b158b2 [#7] Harvest listing page and HTML/CSS cleanup
* I'm happy with /harvest_source/ now
* Also I've removed a load of unneeded CSS
* Also templates are now using core styles instead of custom ones
2013-02-27 11:14:04 +00:00
kindly e0a3eb7899 add javascript for source type 2013-02-25 18:12:47 +00:00
kindly 5b50126670 source extras field type 2013-02-25 18:07:34 +00:00
amercader efe977512b Include gather errors on job summaries and reports 2013-02-25 17:17:08 +00:00
amercader d1b71308af [#7] Minor tweaks in job pages 2013-02-25 16:15:37 +00:00
amercader c7bb897cdd [#7] Inactivate Refresh button if a new job already exists 2013-02-25 15:33:29 +00:00
amercader 57b3739dd4 [#7] Return most recent job on source status, not just finished 2013-02-25 15:32:39 +00:00
amercader 60f9360e84 [#7] Don't show job snippet in dashboard if no jobs 2013-02-25 13:11:08 +00:00
amercader 93e15dc529 [#7] Restrict access to source admin page 2013-02-25 13:10:30 +00:00
amercader 457b8d5988 [#7] 404 on last job if no jobs yet 2013-02-25 12:49:14 +00:00
amercader 34ae6be689 [#7] Fix dataset count on source page 2013-02-25 12:19:09 +00:00
amercader b3819e8df4 [#7] Use dict instead of domain object in templates 2013-02-25 12:18:30 +00:00
amercader 49a1c467cf Merge branch '7-harvest-source-templates' of github.com:okfn/ckanext-harvest into 7-harvest-source-templates 2013-02-25 12:04:34 +00:00
amercader e1d73c82f0 [#7] Make new routes more custom
In case we change the root name
2013-02-25 12:03:34 +00:00
kindly ebe246fe99 make report emit added so shows up on front end 2013-02-22 17:32:33 +00:00
amercader 57d6b3de74 [#7] Fix auth check on new source form
Auth check failed because source was undefined
2013-02-22 17:32:05 +00:00
kindly 52c0a5cbd6 Merge branch '2.0-dataset-sources' into 7-harvest-source-templates 2013-02-22 17:26:34 +00:00
joetsoi f97e3b4c6c add return True to import stage of ckanharvester
Was causing queue.py to report that the import had errored.
2013-02-22 10:13:36 +00:00
amercader 83f8cf69a6 Remove unnecessary extra quotes (see #381 on CKAN core) 2013-02-19 11:51:22 +00:00
John Martin 28e589ee92 [#7] Updates to the edit/new harvest source form 2013-02-12 16:29:07 +00:00
amercader 177349fd76 Update HarvesterBase
This is a convenience class that other harvesters can extend. Updates
include a cleanup of old functions and porting of enhancements from the
spatial harvesters.
2013-02-12 16:10:13 +00:00
John Martin 891f247181 [#7] Small template tweaks to job pages 2013-02-12 15:49:06 +00:00
amercader eaa8988440 [#4] Changes in schema to accommodate organizations
Basically handle the 'owner_org' field in form_to_db and db_to_form.
Added 'owner_org', 'frequency' (has default) and 'config' to surplus
keys in check_data_dict.
Also remove schema tweaks to let package_show call the appropriate schema
function.
2013-02-11 16:34:52 +00:00
John Martin bdc8206e8b [#7] Harvest job pages UX are complete 2013-02-08 17:19:04 +00:00
John Martin 7209723856 [#7] Admin templates now are in the correct places 2013-02-08 13:52:48 +00:00
John Martin 0aa1c1fcbc [#7] Re-jigged harvest source read pages 2013-02-08 12:15:14 +00:00
amercader 3c50a40a76 [#5] Fix auth for harvest_job_list (should forward to harvest_source_update) 2013-02-05 16:41:29 +00:00
amercader 413ef8786c [#5] Fix counts on jobs listing 2013-02-05 16:40:22 +00:00
amercader 5956e5a9d5 Merge branch '4-new-auth-for-2.0' into 5-improve-job-errors-reporting 2013-02-05 12:36:26 +00:00
amercader ca7819b885 Merge branch 'release-v2.0' into 2.0-dataset-sources 2013-02-05 12:35:14 +00:00
amercader cca554c5ec Fix typo and add missing column on v3 migration script 2013-02-05 12:33:56 +00:00
amercader e1ce0b7267 [#5] Allow not returning error summary on job dictize 2013-02-04 18:28:45 +00:00
amercader 8576ad6784 [#5] Add job listing page 2013-02-04 18:20:58 +00:00
amercader 22389fc52a [#5] Update report templates
The job details page has been updated to show the full error report, and
the whole report page has been dropped. All job details are loaded via a
snippet, which is also loaded on the harvest source page.

The frontend is still completely provisional.
2013-02-01 18:32:41 +00:00
amercader 42bace3628 [#5] Add new finished field for harvest job
When the run command flags a job as finished, it will query the most
recent harvest object for this job and use its import_finished value as
the job finishing time.
2013-01-28 17:19:28 +00:00
amercader 920f07cdf7 [#5] Cleanup the job controller actions 2013-01-28 16:32:53 +00:00
amercader c8e7086567 [#5] Change default auth for showing and listing jobs
Forward auth checks to harvest_source_update instead of
harvest_source_show, as job reports should only be visible to users that
can manage sources.
2013-01-28 16:31:11 +00:00
amercader ab78bf21b9 [#5] Fix typo in delete auth function 2013-01-28 16:15:38 +00:00
amercader 8431182f01 Document method and cleanup the interface file 2013-01-24 18:39:19 +00:00
amercader 676c7d34b6 [#5] Add method for returning the original URL for a document
Harvesters implementing IHarvester can define a `get_original_url`
method that should return a URL pointing to the original location of a
document in the remote server. If present, this URL will be used on the
job reports.

Examples:
* For a CKAN record: http://{ckan-instance}/api/rest/{guid}
* For a WAF record: http://{waf-root}/{file-name}
* For a CSW record: http://{csw-server}/?Request=GetElementById&Id={guid}&...
2013-01-24 18:35:43 +00:00
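Note on 676c7d34b6: an illustrative sketch of a `get_original_url` method for the WAF case listed above. The class below is a stand-alone stand-in and does not implement the real IHarvester interface; the harvest_object_id parameter, the constructor arguments and the id-to-file-name mapping are assumptions made for the example.

```python
class ExampleWafHarvester:
    """Stand-in for a harvester class; not the real IHarvester plugin."""

    def __init__(self, waf_root, file_names):
        # Root URL of the remote WAF folder, without a trailing slash.
        self.waf_root = waf_root.rstrip("/")
        # Hypothetical lookup: harvest object id -> remote file name.
        self.file_names = file_names

    def get_original_url(self, harvest_object_id):
        """Return the URL of the original document, or None if unknown."""
        file_name = self.file_names.get(harvest_object_id)
        if file_name is None:
            return None
        # Follows the pattern http://{waf-root}/{file-name} from the message.
        return "%s/%s" % (self.waf_root, file_name)


harvester = ExampleWafHarvester(
    "http://example.com/waf/", {"abc": "record.xml"}
)
original_url = harvester.get_original_url("abc")
```

Returning None for an unknown object is a choice made here so that a caller building a job report can simply skip the link.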
amercader d4b6dcb7f6 [#5] Add helper function for generating a link to a harvest object 2013-01-24 18:21:05 +00:00
amercader daa9a385ff Update job keys changed on 9ba6e8f 2013-01-24 17:36:58 +00:00
amercader 30d58b2b7b [#5] Preliminary job report logic function and page (WIP) 2013-01-23 18:04:19 +00:00
amercader 234f9f4cc0 [#5] Add job summary page
Shows dataset and error counts, job details and a summary of the more
frequent errors.
2013-01-23 17:33:44 +00:00
amercader b2b89dfd61 Add command for reindex all harvest sources 2013-01-22 16:43:36 +00:00
amercader 0d79252a09 Add command for reindex all harvest sources 2013-01-22 16:43:25 +00:00
amercader 6c861afe39 Update template with new harvest source status 2013-01-22 16:37:31 +00:00
amercader 9ba6e8f3b3 [#5] Add error summary to harvest_job_dictize
It will return the counts for the 20 most common errors for that
particular job. These will be available when calling harvest_job_show.

Also refactor the harvest source status object to just call
harvest_job_dictize on the 'last_job' key, as it has all the
interesting fields anyway.
2013-01-22 13:13:24 +00:00
amercader 30c9eedf5f Improve harvest source status creation
Use report_status field to improve speed, remove unnecessary fields.
2013-01-17 15:43:45 +00:00
amercader bfce5185f0 [#4] Add db_to_form_schema_options to harvest plugin to avoid validation on show 2013-01-16 17:45:33 +00:00
amercader 2ab10afcf9 [#4] Fix typo in auth functions 2013-01-16 12:56:58 +00:00
amercader 2f4cd3a4b0 [#4] Fix logic functions importer 2013-01-15 19:29:17 +00:00
amercader 2bb669af21 [#4] Add owner_org field to schema and form
This should store the owner organization id.

Also added the errors box on the form.
2013-01-10 12:23:01 +00:00
kindly acb17ff3b0 capture errors more cleanly 2013-01-10 10:48:48 +00:00
amercader e49dd94b34 [#4] Remove authorization functions for the publisher profile
The different profiles will be now configured via the harvest source
datasets on CKAN core, so they are no longer needed.
2013-01-09 17:35:47 +00:00
amercader 288e1429a6 [#4] Remove the loading of different authorization profiles
The different profiles will be now configured via the harvest source
datasets on CKAN core, so it is no longer needed.

Also simplify IActions and IAuthFunction hook calls.
2013-01-09 17:32:05 +00:00
amercader 058dcad435 [#4] Minor change on the state field to fix a bug on harvest_source_show 2013-01-09 17:31:30 +00:00
amercader a866445023 [#4] Refactor authorization functions
The authorization functions have been refactored to take into account
both the new organization based authorization on CKAN core and the
harvest source datasets.

Basically at the source level, authorization checks are forwarded to the
relevant package auth function (package_create, package_update, etc.)
which will check for organizations membership, sysadmin, etc.

Also we only use functions available on the plugins toolkit whenever
possible.
2013-01-09 17:26:48 +00:00
amercader 1342463f8a Merge branch '2.0-dataset-sources' into 4-new-auth-for-2.0
Conflicts:
	ckanext/harvest/logic/action/get.py
2013-01-09 11:09:34 +00:00
amercader 6b23082010 Move logic from setup_template_variables to helper functions 2013-01-09 11:07:44 +00:00
kindly 7b6beb1470 fix wrong authorization logic 2012-12-24 22:34:37 +00:00
kindly 01dfda59b6 Merge branch 'release-v2.0' into 4-new-auth-for-2.0 2012-12-24 12:46:56 +00:00
kindly 36389e7ce0 make sure gather phase finishes job if there is a severe error 2012-12-24 12:21:21 +00:00
amercader 43950aa4ff Merge branch 'release-v2.0' into 4-new-auth-for-2.0
Conflicts:
	ckanext/harvest/logic/action/get.py
	ckanext/harvest/tests/test_queue.py
2012-12-20 16:38:57 +00:00
amercader fdac761fba Merge branch 'release-v2.0' into 2.0-dataset-sources
Conflicts:
	ckanext/harvest/logic/action/get.py
	ckanext/harvest/tests/test_queue.py
2012-12-20 16:16:30 +00:00
amercader 19cd80b264 [#4] Fixes on the auth layer against the new core auth
Thanks @locusf for the original patch
2012-12-20 16:09:26 +00:00
amercader 510e2d3725 Fix pager links in harvest source page 2012-12-19 17:27:05 +00:00
kindly b940baacc0 make statistics use new report_field 2012-12-18 02:39:14 +00:00
kindly 6b42d96fe0 add report_status field 2012-12-17 23:50:26 +00:00
kindly 596b9bb475 fix auth to use new sysadmin flag 2012-12-17 23:46:43 +00:00
amercader 478326922b Fix tests
* Adapt test_queue to harvest source datasets
* Don't use the same mock harvester on different datasets as it messes
  the tests up
* Skip auth tests for the time being
2012-12-14 14:52:19 +00:00
amercader 6df525a377 Reindex the harvest source dataset after finishing jobs
This ensures that the status details shown on the harvest sources search
page is up to date (as it is loaded from the indexed data_dict)
2012-12-14 14:27:55 +00:00
amercader c1b0415cb6 Merge branch 'release-v2.0' into 2.0-dataset-sources
Conflicts:
	ckanext/harvest/model/__init__.py
2012-12-13 18:33:59 +00:00
amercader d57e73458a Make harvest object - package FK deferrable
Allows eg to add the harvest object id to the package dict before
indexing.
2012-12-13 18:21:40 +00:00
amercader b424ba1cea Add flag to avoid returning all objects when getting a job 2012-12-13 18:20:49 +00:00
amercader 0dde483992 Set job status to Finished when actually finishing it
Until now, harvest jobs were set to Finished just after sending all
objects to the fetch stage. Now every time the run command is run, jobs
are set to Running, and all previous Running jobs are checked to see if
all harvest objects have a state of Complete or Error. Only then the job
is flagged as Finished.
2012-12-13 18:19:22 +00:00
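Note on 0dde483992: a simplified sketch of the check described above, in which a Running job is only flagged as Finished once every one of its harvest objects is in a final state. The state strings, the function name job_is_finished and the use of plain lists instead of database queries are assumptions for illustration.

```python
# Final states named in the commit message; the exact spelling used in the
# database is an assumption.
FINAL_STATES = ("COMPLETE", "ERROR")


def job_is_finished(object_states):
    """Return True if every harvest object of a job is in a final state.

    `object_states` is a list of state strings, one per harvest object.
    Note that an empty list also counts as finished in this sketch.
    """
    return all(state in FINAL_STATES for state in object_states)


done = job_is_finished(["COMPLETE", "ERROR", "COMPLETE"])
# "FETCH" stands for any hypothetical non-final state.
still_running = job_is_finished(["COMPLETE", "FETCH"])
```

In this model the run command would evaluate the check for each previously Running job and update the job status only when it returns True.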
amercader 81c3881a1a Add active field to source dict 2012-12-13 18:00:07 +00:00
amercader 37efb3b978 Set harvest object state depending on the output of import_stage
Either to COMPLETE or ERROR, depending on whether it returns True or
False.
2012-12-13 14:30:13 +00:00
amercader 4da64a84ae Add more elements to the harvest sources page (still provisional) 2012-12-12 18:49:38 +00:00
amercader e0f3d47cb9 Add extra information to the harvest source page
The status object gives extra information about the source and there is
a helper function to build the dataset list for this particular source.
TODO: Pager still needs fixing.
2012-12-12 11:54:50 +00:00
amercader b567e562f4 Add after_show extension point
We hook into the package_show extension point in order to:

1. For harvest_source type datasets, add extra information about the
source, jobs, etc (calling harvest_source_show_status)
2. For normal datasets, check if they were harvested, and if so, add a
reference to the harvest object and harvest source.
2012-12-12 11:49:55 +00:00
amercader 2557636994 Update endpoints to receive the context object 2012-12-12 11:47:57 +00:00
amercader 8e1621731b Move harvest source status function as a logic function
The status dict is added automatically to harvest source packages.
Note that the actual queries still need to be updated as they probably
won't scale.
2012-12-12 11:45:13 +00:00
amercader b0407bb2ac Update harvest_source_show logic function 2012-12-11 12:49:05 +00:00
amercader fcbe6aa6de Script for creating harvest source datasets on old versions
The way we check whether datasets need to be created might need to be
improved.
2012-12-05 18:54:28 +00:00
amercader 22ec9cb5af Fix old controller import 2012-12-05 18:53:35 +00:00
amercader 697933f8d0 Add custom harvest source read page (provisional) 2012-12-05 15:47:02 +00:00
amercader 2dba7fbf78 Add custom harvest sources search page 2012-12-05 14:51:20 +00:00
amercader a605564a41 Fix links to harvest sources page 2012-12-05 13:01:56 +00:00
amercader d77bf255b4 Finish up create and edit forms, including breadcrumbs, links, etc 2012-11-30 18:53:13 +00:00
amercader 9d83322591 Fix config validator and add tests 2012-11-30 17:02:06 +00:00
amercader 803b228d1c Update harvest source create and update logic functions
`harvest_source_create` and `harvest_source_update` now call
`package_create` and `package_update` respectively, making sure to
define a 'harvest_source' type. The returned dict uses the db_to_form
schema.
2012-11-30 14:11:24 +00:00
amercader 0e0aed0503 Clean up schemas
Better naming, remove old ones, ignore __extras field
2012-11-30 13:20:37 +00:00
amercader 875a773f1c Check if type property is actually there 2012-11-30 11:10:21 +00:00
amercader 7db09fceb0 Various fixes for the harvest source dataset type forms
Add a db to form schema to show the fields stored in extras. Validate
the source url on the Package object.
2012-11-29 16:57:20 +00:00
amercader ab7a379058 Behind the scenes creation and updating of HarvestSource objects
Taking advantage of the new after_create/after_update extensions points,
the extension checks if the dataset type is harvest source and creates
or updates the corresponding HarvestSource object. When creating a new
one, it will use the same id as the dataset.
2012-11-29 16:48:44 +00:00
amercader 9d36fd6841 First stub of the new dataset type forms
Adds a 'harvest_source' dataset type that mimics the original harvest
source form.
It works against the 3022 branch on CKAN core.
2012-11-29 12:31:48 +00:00
amercader 866fd69730 Do not remove XML declaration and add utf-8 charset to headers 2012-11-20 15:43:39 +00:00
amercader c52ed3b163 Add line field to object error table 2012-11-20 11:29:58 +00:00
amercader 03fd1884f4 Implement retry times for harvest objects 2012-11-15 18:11:35 +00:00
kindly 202c9d9fcc use correct queue for gather stage 2012-11-15 14:21:09 +00:00
kindly c9c1eb4848 use generator to consume 2012-11-15 14:14:55 +00:00
amercader 33d5e09722 Change fetch_callback to proper acknowledge objects 2012-11-15 11:36:06 +00:00
amercader 13357893ad Fix typo 2012-11-13 14:41:38 +00:00
amercader 54ff0526bb Return original document if present when requesting an object 2012-11-13 12:06:36 +00:00
amercader 820443d58f Add cascade option to harvest object extras and errors 2012-11-09 14:52:34 +00:00
kindly 5063626554 make sure state is changed to error on fetch error 2012-11-07 09:53:16 +00:00
kindly 28e5e9137a add purge queues command 2012-11-07 09:51:25 +00:00
kindly 6db65b5826 made manual default not null 2012-11-05 13:17:32 +00:00
amercader fdf01c09f2 Fix wrong check for harvest sources 2012-11-01 14:12:45 +00:00
amercader d598c0707b Ignore frequency field on the frontend for the time being 2012-11-01 14:12:01 +00:00
amercader d7f8c9165c Merge branch 'model_upgrade' into release-v2.0 2012-10-30 18:07:24 +00:00
amercader d502b925a6 Remove old deprecated tests and some whitespace 2012-10-30 18:07:05 +00:00
amercader a136cbf202 Fix typos in migration script 2012-10-30 17:52:10 +00:00
amercader 61b99e8eff Merge branch 'pika' into release-v2.0 2012-10-30 17:31:30 +00:00
amercader 82a498d9fc Rename function to be implementation independent 2012-10-30 17:13:39 +00:00
kindly 2529a17304 add jobs at certain frequencies 2012-10-29 17:15:02 +00:00
kindly 9fc0ae9937 add next run field 2012-10-26 10:50:35 +01:00
kindly bc079c6644 model upgrade with tests and migration 2012-10-25 19:01:54 +01:00
kindly 1153c1c5c9 add full queue test and new test harvester 2012-10-24 11:58:00 +01:00
kindly da125cdcc2 pika now used as queue library 2012-10-24 00:34:32 +01:00
amercader 8233b2ec23 Strip spaces from url when creating or updating a source 2012-08-17 12:25:06 +01:00
amercader c1f83e0d3e Strip spaces from url when creating or updating a source 2012-08-17 12:24:41 +01:00
tobes a17e8208de Very small text fix 2012-08-16 09:30:04 +01:00
tobes 7e940b497d Text message minor fix 2012-08-16 09:27:36 +01:00
tobes b6a32fd23b Add descriptions for sources 2012-08-16 09:16:34 +01:00
tobes c984727de5 Minor template tidy 2012-08-16 08:56:25 +01:00
tobes 5b7a9c0855 Flash messages to notices plus translatable 2012-08-16 08:49:35 +01:00
amercader 19ea538097 [#2852,#2853] Reword errors 2012-08-15 18:28:08 +01:00
amercader 7609a93422 Minor css tweaks on the forms 2012-08-15 18:26:36 +01:00
tobes c6c4f6d098 Remove about text placeholder 2012-08-15 10:40:52 +01:00
tobes 3b8075b670 Only specify autoform items once 2012-08-14 18:01:29 +01:00
tobes e1c74bdbe6 Fixes for autoform extra_text 2012-08-14 17:56:49 +01:00
tobes 8f6bab104e Dirty form changes pending cleanup 2012-08-14 17:33:32 +01:00
amercader 1979517706 Widen url field 2012-08-14 12:02:02 +01:00
amercader a76140650d Merge branch 'release-v2.0' of github.com:okfn/ckanext-harvest into release-v2.0 2012-08-14 11:40:35 +01:00
amercader 4b68e4c31b Fix details page template and style 2012-08-14 11:23:56 +01:00
amercader eb12152089 Fix index page template and style 2012-08-14 11:04:17 +01:00
David Raznick 4b4e5dba62 fix broken show form 2012-08-14 00:44:00 +02:00
tobes 7efca28c22 Template updates 2012-08-10 13:05:54 +01:00
tobes 5557da653f First draft of new source page 2012-08-10 10:06:37 +01:00
tobes 3feca92d55 First draft of index page 2012-08-10 10:00:02 +01:00
tobes d8a98fd64a Move to new plugins model 2012-08-10 09:59:18 +01:00
amercader a8aebac965 Fix the harvest object show call 2012-08-09 13:38:17 +01:00
amercader bb5ba43ebb Allow showing harvest objects by default (on the default auth profile) 2012-08-09 13:37:28 +01:00
amercader 4c562e5f5f Do not store the object when importing 2012-08-09 11:17:41 +01:00
amercader e4b3cb440c Do not use repo.are_tables_created
When checking whether the core tables have already been created, it is
best to use package_table.exists(), as are_tables_created reflects the
tables, causing conflicts with other extensions.

This allows ckanext-harvest and ckanext-spatial to be used together on
ckan 1.8 onwards.
2012-08-09 11:06:05 +01:00
amercader 4d2fdeac57 Allow defining segments of harvest objects to import
Useful when importing large number of objects, as it allows
parallelization
2012-08-02 18:41:59 +01:00
amercader 7011efe5dc Allow not linking to datasets when importing records
With the -j flag, harvest objects are not linked to datasets when
importing. This is useful sometimes when importing records for the first
time.
2012-07-30 12:11:55 +01:00
David Read 203bcb053b Status can have links in it now. 2012-07-23 16:15:11 +01:00
David Read a61ea06faf Merge branch 'master' of github.com:okfn/ckanext-harvest 2012-07-19 15:27:04 +01:00
David Read 1a4e43a2a9 Status message added - change config to set the text. 2012-07-19 15:17:50 +01:00
amercader 4d00e665f1 [cli] Speed up run command 2012-06-29 11:32:18 +01:00
David Read 5df2b64dda Merge branch 'master' of github.com:okfn/ckanext-harvest 2012-06-15 18:38:33 +01:00
David Read c0a9965b52 Reword warning. 2012-06-15 18:38:22 +01:00
Sean Hammond 528e98120c [#2533] Fix some imports broken by ckan cleanups 2012-06-15 12:08:35 +02:00
David Read ccf0cd3da2 Add copious logging to record what happens in harvesting. 2012-06-08 17:09:22 +01:00
David Read a5fac2ac86 Added logic for getting a harvest source when viewing a dataset. 2012-06-01 17:03:40 +01:00
David Read 5151f4ee23 In publisher auth mode, any member of the group can make the changes. This brings things in line with the general idea that Admins have the power to do this plus authorize other editors/admins. 2012-05-29 15:21:34 +01:00
David Read 017222afd2 Withdrawn language introduced for uklp. 2012-05-28 18:41:03 +01:00
David Read 103eea9d50 Improve layout of sources to span page. 2012-05-28 18:13:34 +01:00
amercader b1a4cc7721 [ui][xs] Nicer button 2012-05-10 11:53:19 +01:00
amercader 3941b691f4 [ui] Fix style for CKAN 1.7 2012-05-09 16:03:03 +01:00
amercader f0a09e8299 [logic] Fix small bug in dictization 2012-05-09 15:58:23 +01:00
David Read 718202d886 Logging is now assured. 2012-04-10 20:53:29 +01:00
David Read 00e911a70c Fix name of the queue logger. Moved imports of ckanext.harvest until after _load_config so that the loggers do not start disabled. 2012-04-10 20:10:17 +01:00
amercader e797f50a05 [cli] Fix create job command 2012-03-19 17:28:53 +00:00
amercader 38a9a03355 [logic] Fix variables naming 2012-03-19 17:01:20 +00:00
Ian Murray 1145e6ea72 [master][auth/publisher] Check for 'ignore_auth' in harvest_object_show
Use case: In ckanext-dgu we want to index the harvest_object.content field.  As indexing is done synchronously we need
to provide a way for that harvest_object to be accessed when the current http request is made by a non-sysadmin user.
2012-03-15 18:14:57 +00:00
Ian Murray 7f10418f44 [master][auth] get_obj_object() function was missing 2012-03-15 18:09:44 +00:00
amercader 871eae94b6 [ckan harvester] Fix bug on force all check 2012-03-15 11:31:12 +00:00
amercader f210455aef [ckan harvester] Replace title on default extras 2012-03-13 12:38:14 +00:00
amercader e0bef2ef9c [base] Minor fix for harvesters without config 2012-03-12 14:46:28 +00:00
amercader 4fe38ec49d [tests,auth] Add tests for the auth profiles
Note that only the tests related to the currently loaded auth
profile will be run.
2012-03-08 16:14:44 +00:00
amercader 4a7007460b [logic] Fix broken imports 2012-03-07 17:08:17 +00:00
amercader 763f07fcad [logic,cli] Add session to the context in cli commands 2012-03-07 15:20:49 +00:00
amercader bf6df2dcd6 Fix merge 2012-03-07 15:04:50 +00:00
amercader 9fcaefe8ff [ui] Fix source datasets paging 2012-03-07 15:03:33 +00:00
amercader 6cccbb61c9 Bug fix, new job count property had not been updated 2012-03-07 12:10:32 +00:00
amercader 124f3191c8 [ui] Add class to config fields so they can be hidden via CSS 2012-03-07 11:56:18 +00:00
amercader d9cfc52643 [ui,auth] Aggregate sources by publisher on the sources list 2012-03-07 11:49:12 +00:00
amercader 97b390f3c1 [auth,logic,ui] Handle publishers on the UI
Add fields for publishers in the form when using the publisher auth
profile. Some changes related to the source schema.
2012-03-06 16:01:43 +00:00
amercader aea785701f [logic,auth] Check that users actually exist 2012-03-06 10:37:31 +00:00
amercader f0e2521d9b [logic,auth] Modify checks to ensure users are admins of their publishers 2012-03-06 10:16:27 +00:00
amercader d98206858d [plugin,auth] Check on startup if ckan is also using the publisher profile 2012-03-05 17:10:02 +00:00
amercader 2a2397c0ed [logic,auth] Implement publisher auth profile
The publisher profile allows general users to handle harvest sources
based on membership to a certain group (publisher), as opposed to the
default auth profile where only sysadmins can perform any harvesting
task.
To enable it, put this directive in your ini file:

    ckan.harvest.auth.profile = publisher

TODO:
 * Save publisher id / user id when creating sources
 * Show publisher in form and index page
2012-03-02 16:49:39 +00:00
amercader 3b68298bba [logic,auth] Use the site user for CLI commands auth checks 2012-03-01 12:46:42 +00:00
amercader a35eb75440 [logic,auth] Add auth logic layer
The first version of the auth layer is based on the current policy, i.e.
you need to be sysadmin to perform any action.

TODO: the CLI is still not working.
2012-03-01 12:02:16 +00:00
amercader c798013752 [logic] Refactor the rest of the logic functions (create,update,delete) 2012-02-29 15:20:35 +00:00
amercader 651474e9f1 [logic] Refactor logic layer to follow CKAN core conventions
To make maintenance easier and better support the upcoming auth checks,
the logic layer has been refactored to mimic the structure of the one on
CKAN core: actions and dictize functions are separated, and logic functions
receive a context.
Only get functions are included in this commit.
2012-02-29 10:59:02 +00:00
amercader 50537a6738 Merge branch 'master' into enh-1726-harvesting-model-update 2012-02-15 12:01:15 +00:00
amercader e03c2545ca [ui,logic] Expose source title in the source form 2012-02-15 11:49:59 +00:00
amercader 3489a004ad [ui] Minor tweak to support older themes 2012-02-14 17:23:17 +00:00
amercader 2990353533 [ui,logic] Expose source state (active/inactive) in the source form 2012-02-14 14:24:32 +00:00
amercader 4d7b8143b9 [lib] Re-enable unique constraint in url for inactive sources 2012-02-14 11:28:11 +00:00
amercader 9ed152cbea [ckan harvester] Add support for forcing gathering of all remote packages 2012-02-03 17:54:34 +00:00
amercader a5cf445fa6 [#1727][lib] Use 'current' field in queries returning harvest objects 2012-02-02 13:20:03 +00:00
amercader 479750da09 [#1726][base harvester] Set current field when importing 2012-02-02 13:18:43 +00:00
amercader 4c81c7c3a7 [#1726][model] Harvest source reference compatibility
The 'source' property of harvest objects now comes from the actual
foreign key. For compatibility with old harvesters, a before-insert
event listener has been added to check if the source id has been set,
and set it automatically from the job if not.
Note that this requires SQLAlchemy 0.7 (ie CKAN 1.5.1)
2012-02-01 12:52:52 +00:00
amercader 004210935a [model] Avoid unicode warning 2012-02-01 11:10:44 +00:00
amercader b64d97118c [#1726][model] Add scripts for populating source_id and current fields 2012-02-01 11:08:41 +00:00
amercader d1783f5415 [model] Changes in harvest model
Added three changes to the harvest model:

 * 'title' column in harvest_source table
 * 'current' column in harvest_job table
 * foreign key from harvest_object to harvest_source

Tables are checked on startup to see if they need to be updated.
TODO: populate current and harvest_source_id fields
2012-01-30 18:38:35 +00:00
amercader f086e908bc [model] Clearer table initialization 2012-01-30 17:09:28 +00:00
amercader a997e45470 [lib] Ignore deleted packages in source stats 2012-01-25 17:47:35 +00:00
amercader 3a489bbb82 [ui] Cleanup sources list and details page 2012-01-24 16:55:47 +00:00
David Read 0f8c607187 [tests]: Another test moved in wholesale from dgu repo. 2012-01-11 10:35:37 +00:00
David Read 81ed69c4da [tests]: Moved to this repo test code from dgu repo that might be useful or might not. Completely broken, but may be worth something. 2012-01-11 10:29:05 +00:00
amercader a53b79c181 [ui] Show edit and refresh links in source page 2012-01-10 17:24:05 +00:00
amercader 2ad29df5c5 [lib] Fix bug: couldn't delete source conf 2012-01-10 17:15:56 +00:00
amercader eb646b3385 [ckan harvester] Add support for defining default extras 2012-01-10 17:07:19 +00:00
amercader ae51093213 [ckan harvester] Ignore __junk field, was causing imports to fail 2012-01-10 14:46:12 +00:00
Adrià Mercader da469ab08e [base harvester] Custom tag munge function. TODO: check with flexible tags 2011-11-23 11:05:52 +00:00
Adrià Mercader cfaba6e1e8 [ckan harvester] Add support for sending an API key 2011-11-21 17:29:10 +00:00
Adrià Mercader 0ab5c53b47 [ckan harvester] Fix typo 2011-11-18 17:53:01 +00:00
Adrià Mercader f02ee45aae [ui] Show config options in harvest source details page 2011-11-18 14:35:46 +00:00
Adrià Mercader 994590531e [ckan harvester] Support for creating read-only packages 2011-11-18 14:30:10 +00:00
Adrià Mercader c939d90dbb [ckan harvester] Support for defining a custom user to do the harvesting 2011-11-18 14:12:30 +00:00
Adrià Mercader 2018d9e513 [ckan harvester] Support for default tags and groups 2011-11-18 13:20:41 +00:00
Adrià Mercader 8ec05bc3e3 Modify import command to avoid problems with the Session 2011-11-15 11:26:24 +00:00
Adrià Mercader c04d80e27e Use get_action function instead of directly calling the action functions 2011-10-26 17:26:18 +01:00
David Raznick 31dac7029a fix to make import stage work on its own 2011-09-28 14:27:28 +01:00
Adrià Mercader 63afd199a9 Add link to source documents from object errors 2011-09-08 10:27:36 +01:00
Adrià Mercader 31c1ea1c21 Separate gather and object errors in reports. Add info about guid and object id in the object ones 2011-09-08 09:58:21 +01:00
Adrià Mercader c36d9bdd8e Add new command to create new jobs for all active sources 2011-09-06 18:25:17 +01:00
David Read dd00e98d9d [model]: More careful about creating tables, since paster db upgrade loads the environment and therefore runs setup() before it does the migrations, and therefore in this instance we do not want to create the db tables. 2011-08-10 16:25:57 +01:00
Adrià Mercader 7927329536 Make harvesters work with latest ckan release 2011-07-29 11:31:03 +01:00
Adrià Mercader cabbb4922d Use API version defined in config if present 2011-07-18 17:35:32 +01:00
Adrià Mercader c867660e7d Add docs to base harvester functions 2011-07-18 17:35:03 +01:00
Adrià Mercader 54de6759fe Fix bug with empty config 2011-06-28 15:04:40 +01:00
Adrià Mercader c80e68a12f Ensure the correct configuration is used on each stage 2011-06-14 15:59:13 +01:00
Adrià Mercader 3125bb1514 Add a check to ensure sources with no packages are reharvested 2011-06-14 12:59:48 +01:00
Adrià Mercader 2ac9885150 Page packages in the harvest source details page 2011-06-14 10:27:48 +01:00
Friedrich Lindenberg 0d9d1f8096 reduce number of queries for harvest index to a less insane number. still heavy. 2011-06-13 17:36:35 +02:00
Adrià Mercader ef04ce1774 Add support for config options in CLI 2011-06-13 15:56:19 +01:00
Friedrich Lindenberg 13f2fb3b96 use hasattr for config validation 2011-06-09 11:35:58 +02:00
Adrià Mercader 7b61fb62bf Ignore missing config values 2011-06-07 15:32:46 +01:00
Adrià Mercader 98bfd50f47 Load config in the CKAN harvester 2011-06-07 13:35:11 +01:00
Adrià Mercader 6e75d362e3 Add a simple way for harvesters to store configuration options. If form_config_interface is Text on the info dictionary, the configuration field will be enabled in the form. Harvesters can also provide a validate_config method. 2011-06-07 12:07:53 +01:00
Adrià Mercader ca6af0249a Reverting previous changeset, as it conflicts with dgu_form_api 2011-06-07 11:58:35 +01:00
Friedrich Lindenberg f9c0ee37aa spacing in template paths 2011-06-06 10:16:34 +02:00
Friedrich Lindenberg 89934b8538 [harvesters] factor out a base harvester for use in generic harvesting apps 2011-06-02 12:07:07 +02:00
David Raznick 79fd966573 add database setup at configure time 2011-05-31 18:06:26 +01:00
David Raznick 264b606c48 take tables out of global scope at import time 2011-05-31 18:02:07 +01:00
Adrià Mercader 2b98080266 [merge] from new-forms, as forms refactoring has been merged in core 2011-05-20 13:50:15 +01:00
Adrià Mercader 235d822458 [ckan harvester] Request only packages modified since last harvest job. Also support older versions which do not include 'metadata_modified' 2011-05-17 17:26:42 +01:00
Adrià Mercader 565eaf3d0a Add a new info method to the harvester interface so implementations can provide details. Use this to build the WUI form 2011-05-13 18:39:36 +01:00
Adrià Mercader fecee82b1a Minor enhancements in the WUI 2011-05-13 17:02:18 +01:00
Adrià Mercader b3a88070e3 [forms] Adapt CLI commands to changes in lib 2011-05-13 16:00:36 +01:00
Adrià Mercader bbe459527f [forms] Major refactoring of the harvest forms. Forms no longer use the DGU form
API, and are handled similarly to the new ones on CKAN core (logic, schema,
validators...). The UI is also more consistent with the CKAN one.
2011-05-13 14:17:58 +01:00
Adrià Mercader 26cdc1089d Change date definitions in Harvest Objects. reference_date -> metadata_modified_date, created -> gathered 2011-05-11 17:07:05 +01:00
Adrià Mercader e1080e349e Set a flag to force harvesters to import objects 2011-05-10 17:11:12 +01:00
Adrià Mercader e320d0588f Add command to reimport existing harvest objects 2011-05-10 16:06:57 +01:00
Adrià Mercader f7c6854a1d Save reference date in Harvest Objects when harvesting CKAN instances 2011-05-10 12:57:57 +01:00
Adrià Mercader 329ca2dd29 Add a reference date to the Harvest Objects. This must be set during the harvest
process.
2011-05-10 11:05:44 +01:00
Adrià Mercader c697bc3350 Normalize https ports too (#736) 2011-05-09 18:47:30 +01:00
Adrià Mercader 5594f22be7 More robust URL checking (#736) 2011-05-09 14:03:46 +01:00
Adrià Mercader 0e56c0ab4f Abort pending jobs when removing sources 2011-05-05 17:13:07 +01:00
Adrià Mercader e0a1e5752d Fix redirects in view controller 2011-05-05 16:47:34 +01:00
Adrià Mercader 43453b6938 Show GUID on object errors 2011-04-19 17:16:25 +01:00
David Raznick c9d43b2e4d override default create schema 2011-04-19 15:34:56 +01:00
Adrià Mercader e3bca3ceee Add first version of the CKAN harvester [#985] 2011-04-19 14:54:59 +01:00