416 Commits

Author SHA1 Message Date
amercader c7bb897cdd [#7] Inactivate Refresh button if a new job already exists 2013-02-25 15:33:29 +00:00
amercader 57b3739dd4 [#7] Return most recent job on source status, not just finished 2013-02-25 15:32:39 +00:00
amercader 60f9360e84 [#7] Don't show job snippet in dashboard if no jobs 2013-02-25 13:11:08 +00:00
amercader 93e15dc529 [#7] Restrict access to source admin page 2013-02-25 13:10:30 +00:00
amercader 457b8d5988 [#7] 404 on last job if no jobs yet 2013-02-25 12:49:14 +00:00
amercader 34ae6be689 [#7] Fix dataset count on source page 2013-02-25 12:19:09 +00:00
amercader b3819e8df4 [#7] Use dict instead of domain object in templates 2013-02-25 12:18:30 +00:00
amercader 49a1c467cf Merge branch '7-harvest-source-templates' of github.com:okfn/ckanext-harvest into 7-harvest-source-templates 2013-02-25 12:04:34 +00:00
amercader e1d73c82f0 [#7] Make new routes more custom
In case we change the root name
2013-02-25 12:03:34 +00:00
kindly ebe246fe99 make report emit added so shows up on front end 2013-02-22 17:32:33 +00:00
amercader 57d6b3de74 [#7] Fix auth check on new source form
Auth check failed because source was undefined
2013-02-22 17:32:05 +00:00
kindly 52c0a5cbd6 Merge branch '2.0-dataset-sources' into 7-harvest-source-templates 2013-02-22 17:26:34 +00:00
joetsoi f97e3b4c6c add return True to import stage of ckanharvester
Was causing queue.py to report that the import had errored.
2013-02-22 10:13:36 +00:00
amercader 83f8cf69a6 Remove unnecessary extra quotes (see #381 on CKAN core) 2013-02-19 11:51:22 +00:00
John Martin 28e589ee92 [#7] Updates to the edit/new harvest source form 2013-02-12 16:29:07 +00:00
amercader 177349fd76 Update HarvesterBase
This is a convenience class that other harvesters can extend. Updates
include a cleanup of old functions and porting of enhancements from the
spatial harvesters.
2013-02-12 16:10:13 +00:00
John Martin 891f247181 [#7] Small template tweaks to job pages 2013-02-12 15:49:06 +00:00
amercader eaa8988440 [#4] Changes in schema to accommodate organizations
Basically handle the 'owner_org' field in form_to_db and db_to_form.
Added 'owner_org', 'frequency' (has default) and 'config' to surplus
keys in check_data_dict.
Also remove schema tweaks to let package_show call the appropriate schema
function.
2013-02-11 16:34:52 +00:00
John Martin bdc8206e8b [#7] Harvest job pages UX are complete 2013-02-08 17:19:04 +00:00
John Martin 7209723856 [#7] Admin templates now are in the correct places 2013-02-08 13:52:48 +00:00
John Martin 0aa1c1fcbc [#7] Re-jigged harvest source read pages 2013-02-08 12:15:14 +00:00
amercader 3c50a40a76 [#5] Fix auth for harvest_job_list (should forward to harvest_source_update) 2013-02-05 16:41:29 +00:00
amercader 413ef8786c [#5] Fix counts on jobs listing 2013-02-05 16:40:22 +00:00
amercader 5956e5a9d5 Merge branch '4-new-auth-for-2.0' into 5-improve-job-errors-reporting 2013-02-05 12:36:26 +00:00
amercader ca7819b885 Merge branch 'release-v2.0' into 2.0-dataset-sources 2013-02-05 12:35:14 +00:00
amercader cca554c5ec Fix typo and add missing column on v3 migration script 2013-02-05 12:33:56 +00:00
amercader e1ce0b7267 [#5] Allow not returning error summary on job dictize 2013-02-04 18:28:45 +00:00
amercader 8576ad6784 [#5] Add job listing page 2013-02-04 18:20:58 +00:00
amercader 22389fc52a [#5] Update report templates
The job details page has been updated to show the full error report, and
the whole report page has been dropped. All job details are loaded via a
snippet, which is also loaded on the harvest source page.

The frontend is still completely provisional.
2013-02-01 18:32:41 +00:00
amercader 42bace3628 [#5] Add new finished field for harvest job
When the run command flags a job as finished, it will query the most
recent harvest object for this job and use its import_finished value as
the job finishing time.
2013-01-28 17:19:28 +00:00
amercader 920f07cdf7 [#5] Cleanup the job controller actions 2013-01-28 16:32:53 +00:00
amercader c8e7086567 [#5] Change default auth for showing and listing jobs
Forward auth checks to harvest_source_update instead of
harvest_source_show, as job reports should only be visible to users that
can manage sources.
2013-01-28 16:31:11 +00:00
amercader ab78bf21b9 [#5] Fix typo in delete auth function 2013-01-28 16:15:38 +00:00
amercader 8431182f01 Document method and cleanup the interface file 2013-01-24 18:39:19 +00:00
amercader 676c7d34b6 [#5] Add method for returning the original URL for a document
Harvesters implementing IHarvester can define a `get_original_url`
method that should return a URL pointing to the original location of a
document in the remote server. If present, this URL will be used on the
job reports.

Examples:
* For a CKAN record: http://{ckan-instance}/api/rest/{guid}
* For a WAF record: http://{waf-root}/{file-name}
* For a CSW record: http://{csw-server}/?Request=GetElementById&Id={guid}&...
2013-01-24 18:35:43 +00:00
amercader d4b6dcb7f6 [#5] Add helper function for generating a link to a harvest object 2013-01-24 18:21:05 +00:00
amercader daa9a385ff Update job keys changed on 9ba6e8f 2013-01-24 17:36:58 +00:00
amercader 30d58b2b7b [#5] Preliminary job report logic function and page (WIP) 2013-01-23 18:04:19 +00:00
amercader 234f9f4cc0 [#5] Add job summary page
Shows dataset and error counts, job details and a summary of the more
frequent errors.
2013-01-23 17:33:44 +00:00
amercader b2b89dfd61 Add command for reindex all harvest sources 2013-01-22 16:43:36 +00:00
amercader 0d79252a09 Add command for reindex all harvest sources 2013-01-22 16:43:25 +00:00
amercader 6c861afe39 Update template with new harvest source status 2013-01-22 16:37:31 +00:00
amercader 9ba6e8f3b3 [#5] Add error summary to harvest_job_dictize
It will return the counts for the 20 most common errors for that
particular job. These will be available when calling harvest_job_show.

Also refactor the harvest source status object to just call
harvest_job_dictize on the 'last_job' key, as it has all the
interesting fields anyway.
2013-01-22 13:13:24 +00:00
amercader 30c9eedf5f Improve harvest source status creation
Use report_status field to improve speed, remove unnecessary fields.
2013-01-17 15:43:45 +00:00
amercader bfce5185f0 [#4] Add db_to_form_schema_options to harvest plugin to avoid validation on show 2013-01-16 17:45:33 +00:00
amercader 2ab10afcf9 [#4] Fix typo in auth functions 2013-01-16 12:56:58 +00:00
amercader 2f4cd3a4b0 [#4] Fix logic functions importer 2013-01-15 19:29:17 +00:00
amercader 2bb669af21 [#4] Add owner_org field to schema and form
This should store the owner organization id.

Also added the errors box on the form.
2013-01-10 12:23:01 +00:00
kindly acb17ff3b0 capture errors more cleanly 2013-01-10 10:48:48 +00:00
amercader e49dd94b34 [#4] Remove authorization functions for the publisher profile
The different profiles will be now configured via the harvest source
datasets on CKAN core, so they are no longer needed.
2013-01-09 17:35:47 +00:00
amercader 288e1429a6 [#4] Remove the loading of different authorization profiles
The different profiles will be now configured via the harvest source
datasets on CKAN core, so it is no longer needed.

Also simplify IActions and IAuthFunction hook calls.
2013-01-09 17:32:05 +00:00
amercader 058dcad435 [#4] Minor change on the state field to fix a bug on harvest_source_show 2013-01-09 17:31:30 +00:00
amercader a866445023 [#4] Refactor authorization functions
The authorization functions have been refactored to take into account
both the new organization based authorization on CKAN core and the
harvest source datasets.

Basically at the source level, authorization checks are forwarded to the
relevant package auth function (package_create, package_update, etc.)
which will check for organizations membership, sysadmin, etc.

Also we only use functions available on the plugins toolkit whenever
possible.
2013-01-09 17:26:48 +00:00
amercader 1342463f8a Merge branch '2.0-dataset-sources' into 4-new-auth-for-2.0
Conflicts:
	ckanext/harvest/logic/action/get.py
2013-01-09 11:09:34 +00:00
amercader 6b23082010 Move logic from setup_template_variables to helper functions 2013-01-09 11:07:44 +00:00
kindly 7b6beb1470 fix wrong authorization logic 2012-12-24 22:34:37 +00:00
kindly 01dfda59b6 Merge branch 'release-v2.0' into 4-new-auth-for-2.0 2012-12-24 12:46:56 +00:00
kindly 36389e7ce0 make sure gather phase finishes job if there is a severe error 2012-12-24 12:21:21 +00:00
amercader 43950aa4ff Merge branch 'release-v2.0' into 4-new-auth-for-2.0
Conflicts:
	ckanext/harvest/logic/action/get.py
	ckanext/harvest/tests/test_queue.py
2012-12-20 16:38:57 +00:00
amercader fdac761fba Merge branch 'release-v2.0' into 2.0-dataset-sources
Conflicts:
	ckanext/harvest/logic/action/get.py
	ckanext/harvest/tests/test_queue.py
2012-12-20 16:16:30 +00:00
amercader 19cd80b264 [#4] Fixes on the auth layer against the new core auth
Thanks @locusf for the original patch
2012-12-20 16:09:26 +00:00
amercader 510e2d3725 Fix pager links in harvest source page 2012-12-19 17:27:05 +00:00
kindly b940baacc0 make statistics use new report_field 2012-12-18 02:39:14 +00:00
kindly 6b42d96fe0 add report_status field 2012-12-17 23:50:26 +00:00
kindly 596b9bb475 fix auth to use new sysadmin flag 2012-12-17 23:46:43 +00:00
amercader 478326922b Fix tests
* Adapt test_queue to harvest source datasets
* Don't use the same mock harvester on different datasets as it messes
  the tests up
* Skip auth tests for the time being
2012-12-14 14:52:19 +00:00
amercader 6df525a377 Reindex the harvest source dataset after finishing jobs
This ensures that the status details shown on the harvest sources search
page are up to date (as it is loaded from the indexed data_dict)
2012-12-14 14:27:55 +00:00
amercader c1b0415cb6 Merge branch 'release-v2.0' into 2.0-dataset-sources
Conflicts:
	ckanext/harvest/model/__init__.py
2012-12-13 18:33:59 +00:00
amercader d57e73458a Make harvest object - package FK deferrable
Allows eg to add the harvest object id to the package dict before
indexing.
2012-12-13 18:21:40 +00:00
amercader b424ba1cea Add flag to avoid returning all objects when getting a job 2012-12-13 18:20:49 +00:00
amercader 0dde483992 Set job status to Finished when actually finishing it
Until now, harvest jobs were set to Finished just after sending all
objects to the fetch stage. Now every time the run command is run, jobs
are set to Running, and all previous Running jobs are checked to see if
all harvest objects have a state of Complete or Error. Only then the job
is flagged as Finished.
2012-12-13 18:19:22 +00:00
amercader 81c3881a1a Add active field to source dict 2012-12-13 18:00:07 +00:00
amercader 37efb3b978 Set harvest object state depending on the output of import_stage
Either to COMPLETE or ERROR, depending on whether it returns True or
False.
2012-12-13 14:30:13 +00:00
amercader 4da64a84ae Add more elements to the harvest sources page (still provisional) 2012-12-12 18:49:38 +00:00
amercader e0f3d47cb9 Add extra information to the harvest source page
The status object gives extra information about the source and there is
a helper function to build the dataset list for this particular source.
TODO: Pager still needs fixing.
2012-12-12 11:54:50 +00:00
amercader b567e562f4 Add after_show extension point
We hook into the package_show extension point in order to:

1. For harvest_source type datasets, add extra information about the
source, jobs, etc (calling harvest_source_show_status)
2. For normal datasets, check if they were harvested, and if so, add a
reference to the harvest object and harvest source.
2012-12-12 11:49:55 +00:00
amercader 2557636994 Update endpoints to receive the context object 2012-12-12 11:47:57 +00:00
amercader 8e1621731b Move harvest source status function as a logic function
The status dict is added automatically to harvest source packages.
Note that the actual queries still need to be updated as they probably
won't scale.
2012-12-12 11:45:13 +00:00
amercader b0407bb2ac Update harvest_source_show logic function 2012-12-11 12:49:05 +00:00
amercader fcbe6aa6de Script for creating harvest source datasets on old versions
The way we check whether datasets need to be created might need to be
improved.
2012-12-05 18:54:28 +00:00
amercader 22ec9cb5af Fix old controller import 2012-12-05 18:53:35 +00:00
amercader 697933f8d0 Add custom harvest source read page (provisional) 2012-12-05 15:47:02 +00:00
amercader 2dba7fbf78 Add custom harvest sources search page 2012-12-05 14:51:20 +00:00
amercader a605564a41 Fix links to harvest sources page 2012-12-05 13:01:56 +00:00
amercader d77bf255b4 Finish up create and edit forms, including breadcrumbs, links, etc 2012-11-30 18:53:13 +00:00
amercader 9d83322591 Fix config validator and add tests 2012-11-30 17:02:06 +00:00
amercader 803b228d1c Update harvest source create and update logic functions
`harvest_source_create` and `harvest_source_update` now call
`package_create` and `package_update` respectively, making sure to
define a 'harvest_source' type. The returned dict uses the db_to_form
schema.
2012-11-30 14:11:24 +00:00
amercader 0e0aed0503 Clean up schemas
Better naming, remove old ones, ignore __extras field
2012-11-30 13:20:37 +00:00
amercader 875a773f1c Check if type property is actually there 2012-11-30 11:10:21 +00:00
amercader 7db09fceb0 Various fixes for the harvest source dataset type forms
Add a db to form schema to show the fields stored in extras. Validate
the source url on the Package object.
2012-11-29 16:57:20 +00:00
amercader ab7a379058 Behind the scenes creation and updating of HarvestSource objects
Taking advantage of the new after_create/after_update extension points,
the extension checks if the dataset type is harvest source and creates
or updates the corresponding HarvestSource object. When creating a new
one, it will use the same id as the dataset.
2012-11-29 16:48:44 +00:00
amercader 9d36fd6841 First stub of the new dataset type forms
Adds a 'harvest_source' dataset type that mimics the original harvest
source form.
It works against the 3022 branch on CKAN core.
2012-11-29 12:31:48 +00:00
amercader 866fd69730 Do not remove XML declaration and add utf-8 charset to headers 2012-11-20 15:43:39 +00:00
amercader c52ed3b163 Add line field to object error table 2012-11-20 11:29:58 +00:00
amercader 03fd1884f4 Implement retry times for harvest objects 2012-11-15 18:11:35 +00:00
kindly 202c9d9fcc use correct queue for gather stage 2012-11-15 14:21:09 +00:00
kindly c9c1eb4848 use generator to consume 2012-11-15 14:14:55 +00:00
amercader 33d5e09722 Change fetch_callback to properly acknowledge objects 2012-11-15 11:36:06 +00:00
amercader 13357893ad Fix typo 2012-11-13 14:41:38 +00:00
amercader 54ff0526bb Return original document if present when requesting an object 2012-11-13 12:06:36 +00:00
amercader 820443d58f Add cascade option to harvest object extras and errors 2012-11-09 14:52:34 +00:00
kindly 5063626554 make sure state is changed to error on fetch error 2012-11-07 09:53:16 +00:00
kindly 28e5e9137a add purge queues command 2012-11-07 09:51:25 +00:00
kindly 6db65b5826 made manual default not null 2012-11-05 13:17:32 +00:00
amercader fdf01c09f2 Fix wrong check for harvest sources 2012-11-01 14:12:45 +00:00
amercader d598c0707b Ignore frequency field on the frontend for the time being 2012-11-01 14:12:01 +00:00
amercader d7f8c9165c Merge branch 'model_upgrade' into release-v2.0 2012-10-30 18:07:24 +00:00
amercader d502b925a6 Remove old deprecated tests and some whitespace 2012-10-30 18:07:05 +00:00
amercader a136cbf202 Fix typos in migration script 2012-10-30 17:52:10 +00:00
amercader 61b99e8eff Merge branch 'pika' into release-v2.0 2012-10-30 17:31:30 +00:00
amercader 82a498d9fc Rename function to be implementation independent 2012-10-30 17:13:39 +00:00
kindly 2529a17304 add jobs at certain frequencies 2012-10-29 17:15:02 +00:00
kindly 9fc0ae9937 add next run field 2012-10-26 10:50:35 +01:00
kindly bc079c6644 model upgrade with tests and migration 2012-10-25 19:01:54 +01:00
kindly 1153c1c5c9 add full queue test and new test harvester 2012-10-24 11:58:00 +01:00
kindly da125cdcc2 pika now used as queue library 2012-10-24 00:34:32 +01:00
amercader 8233b2ec23 Strip spaces from url when creating or updating a source 2012-08-17 12:25:06 +01:00
amercader c1f83e0d3e Strip spaces from url when creating or updating a source 2012-08-17 12:24:41 +01:00
tobes a17e8208de Very small text fix 2012-08-16 09:30:04 +01:00
tobes 7e940b497d Text message minor fix 2012-08-16 09:27:36 +01:00
tobes b6a32fd23b Add descriptions for sources 2012-08-16 09:16:34 +01:00
tobes c984727de5 Minor template tidy 2012-08-16 08:56:25 +01:00
tobes 5b7a9c0855 Flash messages to notices plus translatable 2012-08-16 08:49:35 +01:00
amercader 19ea538097 [#2852,#2853] Reword errors 2012-08-15 18:28:08 +01:00
amercader 7609a93422 Minor css tweaks on the forms 2012-08-15 18:26:36 +01:00
tobes c6c4f6d098 Remove about text placeholder 2012-08-15 10:40:52 +01:00
tobes 3b8075b670 Only specify autoform items once 2012-08-14 18:01:29 +01:00
tobes e1c74bdbe6 Fixes for autoform extra_text 2012-08-14 17:56:49 +01:00
tobes 8f6bab104e Dirty form changes pending cleanup 2012-08-14 17:33:32 +01:00
amercader 1979517706 Widen url field 2012-08-14 12:02:02 +01:00
amercader a76140650d Merge branch 'release-v2.0' of github.com:okfn/ckanext-harvest into release-v2.0 2012-08-14 11:40:35 +01:00
amercader 4b68e4c31b Fix details page template and style 2012-08-14 11:23:56 +01:00
amercader eb12152089 Fix index page template and style 2012-08-14 11:04:17 +01:00
David Raznick 4b4e5dba62 fix broken show form 2012-08-14 00:44:00 +02:00
tobes 7efca28c22 Template updates 2012-08-10 13:05:54 +01:00
tobes 5557da653f First draft of new source page 2012-08-10 10:06:37 +01:00
tobes 3feca92d55 First draft of index page 2012-08-10 10:00:02 +01:00
tobes d8a98fd64a Move to new plugins model 2012-08-10 09:59:18 +01:00
amercader a8aebac965 Fix the harvest object show call 2012-08-09 13:38:17 +01:00
amercader bb5ba43ebb Allow showing harvest objects by default (on the default auth profile) 2012-08-09 13:37:28 +01:00
amercader 4c562e5f5f Do not store the object when importing 2012-08-09 11:17:41 +01:00
amercader e4b3cb440c Do not use repo.are_tables_created
When checking whether the core tables have been already created, it is
best to use package_table.exists(), as are_tables_created reflects the
tables, causing conflicts with other extensions.

This allows ckanext-harvest and ckanext-spatial to be used together on
ckan 1.8 onwards.
2012-08-09 11:06:05 +01:00
amercader 4d2fdeac57 Allow defining segments of harvest objects to import
Useful when importing large number of objects, as it allows
parallelization
2012-08-02 18:41:59 +01:00
amercader 7011efe5dc Allow not linking to datasets when importing records
With the -j flag, harvest objects are not linked to datasets when
importing. This is useful sometimes when importing records for the first
time.
2012-07-30 12:11:55 +01:00
David Read 203bcb053b Status can have links in it now. 2012-07-23 16:15:11 +01:00
David Read a61ea06faf Merge branch 'master' of github.com:okfn/ckanext-harvest 2012-07-19 15:27:04 +01:00
David Read 1a4e43a2a9 Status message added - change config to set the text. 2012-07-19 15:17:50 +01:00
amercader 4d00e665f1 [cli] Speed up run command 2012-06-29 11:32:18 +01:00
David Read 5df2b64dda Merge branch 'master' of github.com:okfn/ckanext-harvest 2012-06-15 18:38:33 +01:00
David Read c0a9965b52 Reword warning. 2012-06-15 18:38:22 +01:00