amercader
1b458b1772
[ #96 ] Handle encoding errors on harvest object endpoint
...
When parsing the harvest object content to see if it is an XML file,
etree.fromstring would fail if the content had Unicode encoding errors.
2014-04-28 12:48:09 +01:00
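To illustrate the #96 change above: the "is this XML?" check can treat encoding problems as "not XML" instead of letting them propagate. This is only a minimal sketch using the standard library's xml.etree.ElementTree; the helper name `is_xml` is hypothetical and the extension's actual code may differ.

```python
import xml.etree.ElementTree as etree


def is_xml(content):
    """Return True if content parses as XML, False otherwise.

    Hypothetical helper: parse and encoding errors are swallowed so the
    caller can fall back to serving the content as plain text.
    """
    try:
        if isinstance(content, str):
            # Encode text first so str and bytes inputs take the same path.
            content = content.encode("utf-8")
        etree.fromstring(content)
    except (etree.ParseError, UnicodeError, ValueError):
        return False
    return True
```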
Richard Claydon
e3492b57e7
Update plugin.py
...
Updating plugin.py to check for the existence of the extras key in the data_dict.
2014-02-27 16:05:39 +00:00
amercader
d3cf5e58d1
[ #86 ] Fix duplicate extras
2014-02-11 18:16:49 +00:00
amercader
fbde0b8dc1
[ #87 ] Remove remote url_type from resources
...
Otherwise CKAN thinks they are uploads, datastore resources, etc., which
can cause problems, e.g. when displaying the URL of the resource. We
are just linking to the remote resource URL.
2014-02-11 17:27:19 +00:00
amercader
5739e541d7
[ #80 ] Support for Python 2.6 when handling xml exceptions
2014-02-10 18:44:46 +00:00
amercader
2a07a144fc
[ #84 ] Fix auth audit exception when creating datasets
...
This was caused by a combination of the auth audit leaking and the
harvester reusing the context for the package_show and package_create
actions. If the package is not found, package_show does not call
check_access, and the auth audit does not pass. This is stored in the
context (`__auth_audit`) and is raised next time that we call
get_action (when we call package_create with the same context).
It could potentially be fixed on master, but it is probably quite rare.
2014-02-10 18:22:48 +00:00
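One way to avoid the context reuse described in #84 above is to hand each action call its own copy of the context. The sketch below assumes the context is a plain dict and that `__auth_audit` (the key named in the commit message) is the only state that must not leak; the helper name `fresh_context` is hypothetical and this is not necessarily how the commit fixed it.

```python
def fresh_context(context):
    """Return a shallow copy of a CKAN-style context without audit state."""
    new_context = dict(context)
    # Drop the audit list so a failed audit from one action call
    # is not raised by the next call that receives this context.
    new_context.pop("__auth_audit", None)
    return new_context


# Illustrative use (get_action is CKAN's action lookup, not defined here):
#   get_action("package_create")(fresh_context(context), package_dict)
```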
amercader
5b677b6099
[ #83 ] Fix key error when using default_groups
2014-02-10 13:16:58 +00:00
Rachel Knowler
bf11e4d330
Moved clean_tags check into _create_or_update_package method.
2014-02-10 09:29:01 +01:00
Rachel Knowler
2ba9908653
Config option to munge tags changed to be consistent with other config options in this extension, and noted in README.
2014-01-29 10:55:51 +01:00
Rachel Knowler
5e1aef1d08
Removed extra newline.
2014-01-29 10:06:32 +01:00
Rachel Knowler
7d71b0a00b
Wrap tag munging code in config option, defaulting to False.
2014-01-29 10:02:16 +01:00
amercader
2b803a3f66
[ #77 ] Use auth_allow_anonymous_access decorator
...
Starting from 2.2 you need to explicitly flag auth functions that
allow anonymous access with the p.toolkit.auth_allow_anonymous_access
decorator. A local version of the decorator is used to ensure we only
use it on CKAN>=2.2
2014-01-20 13:47:37 +00:00
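A local decorator along the lines of #77 above might look like the following. Note the swap: the commit talks about checking for CKAN>=2.2, whereas this sketch uses feature detection (apply the toolkit decorator only if it can be found), which is an assumption made to keep the example runnable without CKAN installed. The auth function shown is illustrative.

```python
def auth_allow_anonymous_access(auth_function):
    """Local wrapper: apply the CKAN decorator only when it is available."""
    try:
        from ckan.plugins import toolkit
        decorator = toolkit.auth_allow_anonymous_access
    except Exception:
        # CKAN is missing, or the toolkit predates this decorator.
        return auth_function
    return decorator(auth_function)


@auth_allow_anonymous_access
def harvest_source_show(context, data_dict):
    # Illustrative auth function that lets anyone view a source.
    return {"success": True}
```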
amercader
4cc56f51ab
[ #76 ] Use harvest_source_show on reindex command
2014-01-14 17:04:34 +00:00
amercader
95d0ef0f01
[ #76 ] Add extra fields to the source schema
...
Add 'private' and its core validators, and 'metadata_modified' and
'metadata_created'.
Also ignore '__extras'
2014-01-14 17:01:25 +00:00
amercader
467fb7bb8f
Fix resource updating for harvested datasets
...
Starting from 2.2, resource_update calls package_show before updating
the resource via a package_update call. The dict passed had the harvest
extras (e.g. harvest_object_id) added, which made the update call fail due
to duplicated extra keys. To fix it we now remove any harvest extras
in after_show if there is a 'for_edit' property on the context.
2014-01-13 10:30:52 +00:00
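The after_show cleanup described above can be sketched as follows. Only `harvest_object_id` is named in the commit message; the other two keys, the list-of-dicts shape of `extras`, and the function name are assumptions made for illustration.

```python
# 'harvest_object_id' comes from the commit message; the rest are assumed.
HARVEST_EXTRA_KEYS = (
    "harvest_object_id",
    "harvest_source_id",
    "harvest_source_title",
)


def remove_harvest_extras(context, data_dict):
    """Strip harvest extras from a dataset dict that is about to be edited."""
    if not context.get("for_edit"):
        # Plain reads keep the harvest extras.
        return data_dict
    if "extras" in data_dict:
        data_dict["extras"] = [
            extra for extra in data_dict["extras"]
            if extra.get("key") not in HARVEST_EXTRA_KEYS
        ]
    return data_dict
```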
amercader
278a8e1ada
Merge branch 'master' of github.com:okfn/ckanext-harvest
2014-01-10 13:49:38 +00:00
amercader
1e94a11255
[ #70 ] Fix Add harvest source button not showing
...
Due to changes in the templates starting in 2.1, the add source button
was not showing. The whole search template has been simplified,
moving the 2.0-only code into a separate file.
Tested in 2.0, 2.1 and 2.2
2014-01-10 13:48:02 +00:00
Mikko Koho
51e842ee6e
Add quotes to harvest_source_id in Solr query when clearing harvest sources
2013-11-22 11:01:26 +02:00
Mikko Koho
c338452872
Put harvest_source_id in quotes to prevent Solr errors
2013-11-21 14:14:56 +02:00
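The quoting in the two commits above amounts to wrapping the id in double quotes so the Solr query parser reads it as a single phrase. A minimal sketch, with a hypothetical helper name and an extra escaping step that the commits do not mention:

```python
def source_filter_query(harvest_source_id):
    """Build a Solr filter clause matching one harvest source id."""
    # Escape backslashes and embedded quotes, then wrap the value in quotes.
    escaped = harvest_source_id.replace("\\", "\\\\").replace('"', '\\"')
    return 'harvest_source_id:"{0}"'.format(escaped)
```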
amercader
928ea061aa
Improve organizations dropdown on source form
2013-10-24 12:33:44 +01:00
Stefan Oderbolz
c52085006a
[ #61 ] Truly ignore harvest sources
...
The current implementation returns False when a harvest source is being harvested. This leads to an error on the harvesting job, which in turn tends to confuse users who have no idea of this special implementation. This fix ensures that harvest sources are still ignored, but silently.
2013-10-23 07:40:55 +02:00
amercader
c18d9dc3af
[ #71 ] CKAN harvester: Add datasets to source organization
...
If the harvest source belongs to an organization, new datasets should be added
to it. This is already the case in the spatial harvesters.
The remote orgs logic has been kept, with the difference that if for
some reason the remote org can not be assigned, the local one is used.
If the source does not have an organization, none is added.
2013-10-22 16:24:43 +01:00
amercader
380c14c22c
Fix CLI sources list output
2013-10-16 13:03:32 +01:00
amercader
55d2b4e304
Fix purge command
2013-10-16 12:59:23 +01:00
amercader
bd62b62764
Merge branch 'metaodi-add-harvesting-of-organizations'
2013-10-15 17:50:04 +01:00
amercader
49999893e7
Merge branch 'add-harvesting-of-organizations' of git://github.com/metaodi/ckanext-harvest into metaodi-add-harvesting-of-organizations
2013-10-15 17:49:28 +01:00
Stefan Oderbolz
8b5d70c6fe
Only try to create/match an organization if there is a remote_org
2013-10-11 18:08:32 +02:00
amercader
0f5624822c
Use remote name if present when creating datasets on CKAN harvester
2013-10-11 16:50:25 +01:00
amercader
340e9eed63
Merge branch 'add-harvesting-of-organizations' of git://github.com/metaodi/ckanext-harvest into metaodi-add-harvesting-of-organizations
2013-10-11 16:14:18 +01:00
amercader
e9dde3f48a
Only show the 'Add Harvest Source' button if user is authorized
2013-10-11 11:55:14 +01:00
Stefan Oderbolz
dd1acd0c6b
Use remote_orgs for organizations
2013-10-07 11:22:19 +02:00
amercader
f89f12203c
Merge branch 'fix/rename-ampq-to-amqp' of git://github.com/opendatatrentino/ckanext-harvest into opendatatrentino-fix/rename-ampq-to-amqp
2013-10-04 17:24:53 +01:00
amercader
c5f4d6889b
Merge branch 'harvest-object-create'
2013-10-04 17:22:58 +01:00
Stefan Oderbolz
d50eb6fca8
Harvesting of remote organisations similar to remote groups
2013-10-04 16:37:52 +02:00
joetsoi
da2fd45e80
[ #65 ] make harvest_job_exists validator return model object
...
return the model in the validator instead of checking that it exists in
the validator, returning the id and then fetching it again in the action
function
2013-10-03 15:51:37 +01:00
Samuele Santi
611b9aab6d
Fixed typo: ampq -> amqp
2013-09-19 11:43:03 +02:00
joetsoi
9b3199b41b
[ #65 ] remove unused code
2013-09-17 17:02:38 +01:00
joetsoi
5da153c6b6
[ #65 ] harvest_object_create action
...
update to use schema and validators. Also accept more parameters to
data_dict.
2013-09-17 16:49:19 +01:00
John Martin
71aedf3fd4
Makes ede45bd work in both CKAN 2.0 and 2.1
2013-09-05 15:36:00 +01:00
John Martin
ede45bd1be
Fixes #66 by correcting the block name
2013-09-05 15:31:44 +01:00
joetsoi
1b663bbff4
add harvest_object_create action
2013-09-04 14:17:01 +01:00
amercader
52956feab9
Merge branch '62-default-package-name'
2013-08-19 18:23:37 +01:00
amercader
f51b8e905a
[ #58 ] Check properly for version numbers (patch numbers)
2013-08-19 18:13:01 +01:00
Vitor Baptista
f028375ad3
[ #62 ] Use current name when updating package, if the user hasn't sent a new one
...
It's hard for someone outside CKAN to make sure they're sending it in the format
we expect. And they'll also have to keep track of our name format, to keep in
sync whenever we change.
To fix this, we simply do what we already do when creating packages: use a
default name. In this case, the current one.
2013-08-18 12:08:30 -03:00
John Martin
86dcd933ea
Merged master
2013-08-15 18:47:16 +01:00
John Martin
712e150b52
[ #58 ] Fix to make merge nice
2013-08-15 18:43:46 +01:00
John Martin
575df637b4
[ #58 ] Fixes to make harvest templates to work with both CKAN 2.0 and 2.1
2013-08-15 16:45:02 +01:00
amercader
05e6362c38
Merge branch 'fix-jinja-status-exception' of git://github.com/metaodi/ckanext-harvest into metaodi-fix-jinja-status-exception
2013-08-15 14:39:20 +01:00
amercader
01ca5c0dfd
[ #61 ] Ignore harvest sources on the CKAN harvester
2013-08-15 14:38:33 +01:00
amercader
b25fffda93
[ #36 ] Fix bug on API version checking
2013-08-15 14:37:55 +01:00
amercader
39ad78d90a
[ #59 ] Ignore auth in the CKAN harvester
2013-08-15 14:37:12 +01:00
Stefan Oderbolz
f26baf6c09
Hide both the label and the number of datasets when 'status' is not available
2013-08-15 13:25:16 +02:00
amercader
1c36b33aaf
[ #59 ] Ignore auth when using site_user
2013-08-14 12:28:27 +01:00
amercader
ffea49ca62
[ #56 ] Update parameters on source create command
...
Add missing title and owner_org fields, remove deprecated user_id and
publisher_id
2013-08-14 11:54:51 +01:00
amercader
3494727d3f
[ #56 ] Increase max params number
2013-08-14 11:43:32 +01:00
amercader
8e33262026
[ #56 ] Fix syntax error and wrong type
2013-08-14 11:31:23 +01:00
Stefan Oderbolz
4dfd091aec
Make the /harvest page more robust if source.status is not set
...
This prevents exceptions from appearing in the log from Jinja:
[error] [client 1.2.3.4] Error - <class 'jinja2.exceptions.UndefinedError'>: 'dict object' has no attribute 'status'
2013-08-14 11:52:11 +02:00
Stefan Oderbolz
7ae9d6e208
Made print method more robust against KeyErrors
...
This is especially needed if you create a new harvest source which does not have all the optional arguments. Previously this led to a KeyError after the creation of the source. Now this simply outputs 'None'.
2013-08-05 23:50:30 +02:00
Stefan Oderbolz
1249564be5
Add additional name argument when creating new harvest source
2013-08-05 23:46:21 +02:00
Stefan Oderbolz
ade5f83e38
Change key of data_dict from 'type' to 'source_type'
2013-08-05 23:07:25 +02:00
amercader
cb745c3c3e
Avoid importing unnecessary functions from the harvest logic
2013-08-05 18:39:44 +01:00
Vitor Baptista
70e53a7833
Fix bug where source was being treated as an object, when it's a dict
2013-07-29 07:06:58 -03:00
amercader
cc3f3d3426
[ #50 ] Fix objects deletion on gather exceptions
2013-07-05 13:29:11 +01:00
amercader
e2696b98bb
[ #50 ] Save all dates as UTC in the database
...
At some point we may want to transform these to local time at the
dictization level. We will need a library like dateutil to handle it
properly though.
2013-07-04 14:59:27 +01:00
kindly
c2283e3fdb
only migrate harvest sources which are active
2013-06-28 02:32:45 +01:00
kindly
a42991b8c9
fix so that non sysadmins can edit harvest sources of organizations they
...
are admins or editors of.
2013-06-27 12:16:11 +01:00
kindly
6540726c47
use correct limit for paging harvest listing
2013-06-26 11:14:38 +01:00
amercader
584c340583
Merge branch '42-remove-non-string-extras'
2013-06-03 10:33:59 +01:00
Sean Hammond
01df3a1db4
[ #42 ] Dump non-string extras with json
...
Convert any non-string extra values to strings using json.dumps(),
instead of just deleting them.
2013-05-31 20:35:06 +02:00
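The conversion described in #42 above can be sketched like this. It assumes the extras arrive as a plain dict of key to value and uses Python 3's `str` check; the function name is hypothetical.

```python
import json


def stringify_extras(extras):
    """Return a copy of an extras dict with non-string values JSON-encoded."""
    return {
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in extras.items()
    }
```

String values pass through untouched, so extras that were already valid are not double-encoded.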
amercader
3a31db59b6
[ #36 ] Move validation code to validate_config
...
This ensures it is checked whenever the source is edited or created.
2013-05-31 17:23:40 +01:00
amercader
a6a0196a4e
Merge branch 'api-version-fix' of git://github.com/fraunhoferfokus/ckanext-harvest into fraunhoferfokus-api-version-fix
2013-05-31 17:15:43 +01:00
Sean Hammond
85a013f2c9
[ #42 ] Remove non-string extras from packages
...
Remove extras whose values are not strings (e.g. dicts, lists..) from
packages before attempting to create or update the packages on the
target site.
In CKAN 1 it was possible for the values of extras to be other types,
but in CKAN 2 they must be strings, so when harvesting from a CKAN 1 site
into a CKAN 2 site SQLAlchemy would crash when trying to create packages
with non-string extras.
The fix in this commit is to simply remove any non-string extras from
the harvested package. (Alternatively, we could try to convert them to a
string using JSON.)
Fixes #42 .
2013-05-31 15:43:42 +02:00
amercader
361abcfc07
[ #17 ] Fix bug with remote groups handling
...
If neither 'only_local' nor 'create' are used the remote groups property
needs to be removed, otherwise it causes an exception when the group is
not found.
2013-05-30 18:06:15 +01:00
Konrad Reiche
87cae31c75
Fix api_version check in the group importer code
...
I have forgotten to update one check for the api_version 1 in the code
responsible for the remote group import feature. This commit fixes that.
Signed-off-by: Konrad Reiche <konrad.reiche@fokus.fraunhofer.de>
2013-05-27 13:36:56 +02:00
Konrad Reiche
c858b9fe9f
Add exception handling for the API version parsing
...
I have added try-except clauses in order to prevent the process from
crashing if a non-parsable integer is used for the api_version option.
Signed-off-by: Konrad Reiche <konrad.reiche@fokus.fraunhofer.de>
2013-05-27 13:12:05 +02:00
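A rough sketch of the guarded parsing described above: the `api_version` option is passed through `int()` and an unparsable value becomes a clear error instead of a crash later on. The function name, the default of 2, and the error message are assumptions.

```python
def parse_api_version(config, default=2):
    """Read 'api_version' from a source config dict as an integer."""
    try:
        # Accepts both integers and string values such as "1".
        return int(config.get("api_version", default))
    except (TypeError, ValueError):
        raise ValueError("api_version must be an integer")
```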
Konrad Reiche
05094090af
Change type of the API version to integer
...
The CKAN logic uses integers when dealing with the API version, e.g.
making checks which API version is in use. Currently, the harvester
uses strings to identify the API version. Instead of dealing with
type conversion the harvester could use integers directly.
This commit fixes okfn/ckanext-harvest#36 . When the API version is
parsed from the configuration it is passed through the int() function.
This way the harvesting will still work even if a harvest source was
configured with a string API version which makes this commit backward
compatible.
Signed-off-by: Konrad Reiche <konrad.reiche@fokus.fraunhofer.de>
2013-05-27 12:51:48 +02:00
amercader
ff7287d4b4
[ #30 ] Remove lxml dependency
2013-05-24 18:12:02 +01:00
amercader
3d2867ca04
[ #17 ] Remove ckanclient dependency as it is not used
2013-05-24 17:55:37 +01:00
amercader
f1d11c1307
[ #17 ] Import remote groups in CKAN harvester
...
This is a cleaner commit of the great work done by @platzhirsch
implementing remote groups import on the CKAN harvester.
2013-05-24 16:55:05 +01:00
amercader
1792180e4f
Better harvest source dataset migration
...
The current implementation only checked for the first source to exist and
didn't allow rerunning the migration for other sources if there was an
error. With the new one, all non-existing sources are migrated each
time.
2013-05-24 14:49:55 +01:00
amercader
1d54edfdaa
Fix bug in source datasets migration
...
Wrong dataset type was causing the default package schema to be used,
thus failing when providing an id.
2013-05-24 14:25:05 +01:00
amercader
751409ab7d
[ #34 ] Integrate clear command with delete source
...
When deleting a source, if clear_source equals true in the context,
harvest_source_clear will be called. Default is false. The UI shows a
select with the two options.
2013-05-20 14:30:22 +01:00
amercader
6d5d0fbaae
Add hover helper text to refresh and clear buttons
2013-05-20 12:09:14 +01:00
Tom Rees
edfc49719b
Use page_heading helper consistently with the main CKAN templates.
2013-05-17 16:12:57 +01:00
amercader
d0bc52f2d8
[ #34 ] Fix typo in warning message
2013-05-16 18:07:32 +01:00
amercader
b9e2613458
[ #34 ] Allow all authorized users for a source to clear it
2013-05-16 17:57:59 +01:00
amercader
71349e658b
[ #34 ] Expose harvest source clear button
2013-05-16 17:51:48 +01:00
amercader
7b652542e7
[ #34 ] Fix harvest_source_clear action
...
Some typos in the SQL statements, and also the source needs to be
reindexed to update the status with the counts.
2013-05-16 17:33:39 +01:00
amercader
1efd7ab4cd
Ignore remote orgs in CKAN harvester
...
If #17 progresses we can do something similar for them, although it may
be more complicated because of authorization issues.
2013-05-16 17:30:54 +01:00
kindly
1714e55110
simplify harvest_clear queries so they do not lock on big db
2013-04-30 13:59:23 +01:00
kindly
a2b8ab1994
make harvest source clear not create table
2013-04-30 12:40:46 +01:00
amercader
9041f3f3ad
Changes in Redis consumer to make tests work
2013-04-22 18:08:19 +01:00
amercader
70dfee1a36
Update queue tests
2013-04-22 17:56:11 +01:00
kindly
dcfd201cdd
[ #32 ] redis queue support
2013-04-21 17:04:57 +01:00
kindly
0ce59a29b6
delete instead of update harvest objects when error
2013-04-12 12:32:33 +01:00
kindly
7d7657f94a
mark gather phase as finished if there is an error
2013-04-12 10:35:08 +01:00
kindly
bd761498f0
make sure config dict is not jsonified if it contains an error
2013-04-08 18:52:36 +01:00
amercader
eaebeb4e6e
Merge branch 'release-v2.0' of github.com:okfn/ckanext-harvest into release-v2.0
2013-04-08 13:25:33 +01:00
amercader
5414b6c08d
Merge branch '29-new-idataset-form' into release-v2.0
2013-04-08 13:23:41 +01:00
joetsoi
66ff773f99
remove previous commit import
2013-03-29 12:47:14 +00:00
joetsoi
3ac065f0f0
fix package_schema import
2013-03-29 01:17:24 +00:00
joetsoi
cb8b808274
sanity check that harvest source id matches harvest dataset id
...
remove author_email, license_id, maintainer_email, maintainer and
author from package_dict, these were not actually necessary
2013-03-29 00:59:20 +00:00
amercader
99bd17401c
Handle wrong JSON in harvest_source_extra_validator
2013-03-28 16:19:16 +00:00
kindly
a9b8be8f01
harvest source index clear
2013-03-28 15:36:44 +00:00
amercader
95ebb5bbf3
[ #29 ] Remove check_data_dict ✨
2013-03-28 15:01:21 +00:00
amercader
fbc8ecde97
[ #29 ] Fix some imports on actions and plugin
2013-03-28 15:00:44 +00:00
kindly
c754479014
#29 make new idatasets form work with harvest source form
2013-03-25 17:38:07 +00:00
joetsoi
548d3c1c2a
fix validation issue on db upgrade
2013-03-25 12:02:07 +00:00
kindly
b5a697ec87
Merge branch 'release-v2.0' of https://github.com/okfn/ckanext-harvest into release-v2.0
2013-03-25 11:58:31 +00:00
kindly
0b5c3c608a
catch and raise gather exception, acking the message
2013-03-25 11:57:57 +00:00
amercader
438ba672e2
Merge branch 'release-v2.0' of github.com:okfn/ckanext-harvest into release-v2.0
2013-03-25 11:44:37 +00:00
kindly
845c9927a8
add harvest source clear
2013-03-25 11:39:00 +00:00
joetsoi
d518b6709a
[ #27 ] fix package_list_for_source for unowned data sources
2013-03-21 15:59:22 +00:00
amercader
7bff041568
[ #25 ] Further tweaks on helpers texts
2013-03-21 13:47:23 +00:00
John Martin
4d0dd9a4d3
[ #25 ] Small copy tweak to confirmation dialog
2013-03-21 12:14:33 +00:00
John Martin
78bde974b9
[ #25 ] Adds confirmation dialog to reharvest button
2013-03-21 10:56:39 +00:00
John Martin
3197162fe6
[ #25 ] Changed 'Refresh' to 'Reharvest' on button
2013-03-21 10:36:12 +00:00
kindly
a7583a7b8b
Merge branch 'release-v2.0' of https://github.com/okfn/ckanext-harvest into release-v2.0
2013-03-21 02:32:11 +00:00
kindly
b676fb02e1
only get out harvest items in interface and when indexing
2013-03-21 02:31:34 +00:00
amercader
15c44d9aa8
Merge branch '23-harvest-form-cleanup' into release-v2.0
2013-03-20 17:03:41 +00:00
John Martin
4ba298fe58
[ #23 ] Make labels a little wider on harvest new form
2013-03-20 14:07:03 +00:00
amercader
02e90767f4
Fix source listing in organization page
...
It needed an update after #515 in CKAN core
2013-03-20 13:01:23 +00:00
John Martin
86355fb9db
[ #23 ] Form cleanup after core bootstrap upgrade
2013-03-20 10:44:24 +00:00
kindly
634a0bbd30
return instead of continue
2013-03-19 01:21:20 +00:00
kindly
3adf38105e
re-add code from old branch separating the fetch and import logic
2013-03-19 01:16:43 +00:00
amercader
c2a6bd14eb
Add auth function for harvest_source_show_status
2013-03-18 16:48:27 +00:00
amercader
c76b7d95f3
Only count public datasets on the source status
...
This is more in line with what is done on the orgs/groups pages
2013-03-18 16:41:01 +00:00
amercader
cb80ac784e
Add logic to show private datasets to authorized users
2013-03-18 16:29:29 +00:00
amercader
341331ac53
Merge branch 'release-v2.0' of github.com:okfn/ckanext-harvest into release-v2.0
2013-03-14 17:33:21 +00:00
amercader
d77f16aba9
[ #21 ] Improve gather stage error handling
...
See issue for full details. Basically we don't want to catch any
exception at the queue.py level, as they prevent debugging. Harvesters
should deal with them and return a list of ids or an empty list if no
objects need to be fetched.
Also improved the debug messages.
2013-03-14 17:31:07 +00:00
John Martin
b30cc54427
Fix for add harvest source button within org
2013-03-14 14:45:54 +00:00
amercader
91f18bffab
Fix pagination on org sources listing
2013-03-14 11:44:38 +00:00
amercader
8cac0977aa
Fix import on org sources listing
2013-03-14 11:44:22 +00:00
amercader
cd6c1b56a8
[ #18 ] Get package dict on after_delete to check type
...
No need for #615 in core then
2013-03-13 17:31:39 +00:00
amercader
1b11b00946
[ #18 ] Fix wrong logic for setting the source active field
2013-03-13 13:19:43 +00:00
kindly
cb5e06119e
Merge branch 'release-v2.0' of https://github.com/okfn/ckanext-harvest into release-v2.0
2013-03-12 23:31:58 +00:00
kindly
06355ee6c4
Make IFacets work for harvest source related searches
2013-03-12 23:31:06 +00:00
amercader
fab5b81c2c
Pass context to functions handling harvest sources
2013-03-12 17:30:31 +00:00
amercader
5e50a5c9ad
[ #8 ] Update how state is handled for source objects
2013-03-12 15:35:49 +00:00
amercader
2ee3f33f51
[ #18 ] Allow reactivation of sources
...
Due to #607 in CKAN core, once a source was deleted you could not
reactivate it again. As a workaround, if the source is deleted the
Delete button is not shown and the state select is, so you can set it to
'active'.
Also fixed wrong redirect after deletion.
2013-03-12 14:06:54 +00:00
amercader
23d1d5742c
[ #18 ] Update delete harvest source functionality
...
The harvest_source_delete logic function proxies to package delete,
which will delete the harvest source dataset. The harvest plugin then
hooks to the after_delete extension point in order to inactivate the
actual HarvestSource object and abort any pending jobs.
Also added the Delete button to the harvest source form.
2013-03-12 13:14:07 +00:00
amercader
c957fdf17c
Merge branch '14-template-tweaks' into release-v2.0
2013-03-08 14:49:43 +00:00
amercader
ecceff48ed
[ #14 ] Use source.organization again after fix in 949bb6f
2013-03-08 14:48:49 +00:00
amercader
949bb6fe6a
[ #16 ] Add organization to source dict
2013-03-08 14:47:11 +00:00
John Martin
f25ef19985
[ #14 ] Fix for org breadcrumbs on sources
2013-03-08 12:48:11 +00:00
John Martin
2a53e4a2e4
[ #14 ] Couple of minor template tweaks
2013-03-08 12:38:41 +00:00
joetsoi
7257258ca4
mark new harvest objects as current
...
When a new harvest_object for a new package was being created, it
was immediately being marked as false, as all objects were marked
as false, including the new object just created and newly marked
as current=true.
Fix so that old HarvestObjects are only marked as current=False
when updating an existing package.
2013-03-07 20:27:27 +00:00
John Martin
14e51ec587
Fix for removed snippet from ckan core
2013-03-07 11:52:59 +00:00
amercader
2ee27164c3
[ #13 ] Remove or deprecate unused code
...
Mostly in controllers, dictization and plugin, either related to the old
templates pre-dataset type or old authorization.
2013-03-06 16:54:33 +00:00
amercader
6c02c87f8d
[ #13 ] Set routes to /harvest
...
Mostly painless as we (most of the time) were using DATASET_TYPE_NAME.
All old routes now point to the correct place in the new interface.
2013-03-06 16:33:46 +00:00
amercader
eda280f266
Merge branch '12-org-source-listing' into 2.0-dataset-sources
2013-03-06 15:45:45 +00:00
amercader
889325dd9c
[ #12 ] Clean up and rename organization controller
2013-03-06 15:43:10 +00:00
amercader
e9adaa7f91
[ #12 ] Change URL for org sources list
...
Use "/organization/harvest_source/{id}", which will turn into
"/organization/harvest/{id}" soon
2013-03-06 15:38:38 +00:00
amercader
74633d0803
Fix error count in job stats
...
We want to take into account objects with errors that were created or
updated anyway (e.g. bbox errors), so we basically query for the number of
objects that have object errors.
Also add the number of gather errors to this count.
2013-03-06 13:44:04 +00:00
amercader
ef2defbcf9
[ #7 ] Refactor job report page to include all errors
2013-03-06 13:43:40 +00:00
amercader
bec31a611e
Fix empty job finished date
2013-03-06 13:42:35 +00:00
amercader
04710fd1c6
Revert removal of filter in job list action in 7544d5c
2013-03-06 12:19:20 +00:00
John Martin
c2b552b980
[ #12 ] Better faceting for specifically harvest sources
2013-03-06 11:38:24 +00:00
John Martin
246898049e
[ #12 ] When harvest source listing is within org links goto edit pages
2013-03-06 11:36:24 +00:00
John Martin
9d149e4e5d
[ #12 ] Makes a harvest source admin page within org look a little nicer
2013-03-06 11:23:36 +00:00
kindly
ca2df234d2
[ #12 ] begin work on org harvest source controller
2013-03-06 04:11:31 +00:00
kindly
23aa45cc71
Merge branch '2.0-dataset-sources' into source_extra_config_validation
2013-03-06 01:10:48 +00:00
amercader
d9a71f7c59
[ #7 ] Fix wrong finish date on job listing
2013-03-05 18:56:30 +00:00
John Martin
e566c96d62
[ #7 ] Adds new harvest source button
2013-03-05 16:06:04 +00:00
John Martin
7544d5c5ef
[ #7 ] Removed faceted navigation for unneeded toggles in job reports
2013-03-05 15:23:42 +00:00
joetsoi
e64c8ead0f
fix print gather_errors
2013-03-05 12:49:20 +00:00
amercader
574c69fa9c
Merge branch '2.0-dataset-sources' into 7-harvest-source-templates
2013-03-01 17:55:16 +00:00
amercader
182fbf054a
Add XML declaration to contents if not present
2013-03-01 17:25:35 +00:00
amercader
5c17a525c1
Refresh session after each harvest stage
...
Otherwise, e.g., the source config got cached and you needed to restart
the consumers to refresh it.
2013-03-01 12:55:59 +00:00
amercader
bd128ab58b
Refresh session after each harvest stage
...
Otherwise, e.g., the source config got cached and you needed to restart
the consumers to refresh it.
2013-03-01 12:52:58 +00:00
amercader
3b6468b181
Merge branch '2.0-dataset-sources' of github.com:okfn/ckanext-harvest into 2.0-dataset-sources
2013-03-01 12:51:17 +00:00
joetsoi
9432368bea
fix gather_stage if there is a previous job
...
change check on gather stage to check for changed packages since
last job instead of current harvest job's gather_start
fix attribute look up bug
fix print_job to print 0 gather_errors instead of key error
2013-02-28 19:06:21 +00:00
joetsoi
ffce2c7915
Merge branch '2.0-dataset-sources' of github.com:okfn/ckanext-harvest into 2.0-dataset-sources
2013-02-28 18:11:12 +00:00
amercader
217d58d3a4
Merge branch 'source_extra_config_validation' of github.com:okfn/ckanext-harvest into source_extra_config_validation
2013-02-28 16:03:27 +00:00
amercader
f28dc97f79
Fix bug in harvest job reports
2013-02-28 15:47:56 +00:00
amercader
dab98112dc
Fix bug in harvest job reports
2013-02-28 15:47:35 +00:00
kindly
871576f89c
Merge remote-tracking branch 'remotes/origin/source_extra_config_validation' into source_extra_config_validation
2013-02-28 13:48:58 +00:00
kindly
9cef777e7b
make sure config is also on top level
2013-02-28 13:46:16 +00:00
amercader
e82410724a
Merge branch '7-harvest-source-templates' into source_extra_config_validation
2013-02-28 12:18:09 +00:00
amercader
f7cba69fe6
Merge branch '2.0-dataset-sources' into 7-harvest-source-templates
2013-02-28 12:17:47 +00:00
amercader
a86d91c3f0
[ #11 ] Make get actions side_effect_free
2013-02-28 12:17:15 +00:00
amercader
fe6952ed00
Merge branch '7-harvest-source-templates' into source_extra_config_validation
2013-02-27 15:45:33 +00:00
joetsoi
ba486a9482
add indexing of datasets whilst harvesting
2013-02-27 11:34:09 +00:00
John Martin
d1b2b158b2
[ #7 ] Harvest listing page and HTML/CSS cleanup
...
* I'm happy with /harvest_source/ now
* Also I've removed a load of unneeded CSS
* Also templates are now using core styles instead of custom ones
2013-02-27 11:14:04 +00:00
kindly
e0a3eb7899
add javascript for source type
2013-02-25 18:12:47 +00:00
kindly
5b50126670
source extras field type
2013-02-25 18:07:34 +00:00
amercader
efe977512b
Include gather errors on job summaries and reports
2013-02-25 17:17:08 +00:00
amercader
d1b71308af
[ #7 ] Minor tweaks in job pages
2013-02-25 16:15:37 +00:00
amercader
c7bb897cdd
[ #7 ] Inactivate Refresh button if a new job already exists
2013-02-25 15:33:29 +00:00
amercader
57b3739dd4
[ #7 ] Return most recent job on source status, not just finished
2013-02-25 15:32:39 +00:00
amercader
60f9360e84
[ #7 ] Don't show job snippet in dashboard if no jobs
2013-02-25 13:11:08 +00:00
amercader
93e15dc529
[ #7 ] Restrict access to source admin page
2013-02-25 13:10:30 +00:00
amercader
457b8d5988
[ #7 ] 404 on last job if no jobs yet
2013-02-25 12:49:14 +00:00
amercader
34ae6be689
[ #7 ] Fix dataset count on source page
2013-02-25 12:19:09 +00:00
amercader
b3819e8df4
[ #7 ] Use dict instead of domain object in templates
2013-02-25 12:18:30 +00:00
amercader
49a1c467cf
Merge branch '7-harvest-source-templates' of github.com:okfn/ckanext-harvest into 7-harvest-source-templates
2013-02-25 12:04:34 +00:00
amercader
e1d73c82f0
[ #7 ] Make new routes more custom
...
In case we change the root name
2013-02-25 12:03:34 +00:00
kindly
ebe246fe99
make report emit added so shows up on front end
2013-02-22 17:32:33 +00:00
amercader
57d6b3de74
[ #7 ] Fix auth check on new source form
...
Auth check failed because source was undefined
2013-02-22 17:32:05 +00:00
kindly
52c0a5cbd6
Merge branch '2.0-dataset-sources' into 7-harvest-source-templates
2013-02-22 17:26:34 +00:00
joetsoi
f97e3b4c6c
add return True to import stage of ckanharvester
...
Was causing queue.py to report that the import had errored.
2013-02-22 10:13:36 +00:00
amercader
83f8cf69a6
Remove unnecessary extra quotes (see #381 on CKAN core)
2013-02-19 11:51:22 +00:00
John Martin
28e589ee92
[ #7 ] Updates to the edit/new harvest source form
2013-02-12 16:29:07 +00:00
amercader
177349fd76
Update HarvesterBase
...
This is a convenience class that other harvesters can extend. Updates
include a cleanup of old functions and porting of enhancements from the
spatial harvesters.
2013-02-12 16:10:13 +00:00
John Martin
891f247181
[ #7 ] Small template tweaks to job pages
2013-02-12 15:49:06 +00:00
amercader
eaa8988440
[ #4 ] Changes in schema to accommodate organizations
...
Basically handle the 'owner_org' field in form_to_db and db_to_form.
Added 'owner_org', 'frequency' (has default) and 'config' to surplus
keys in check_data_dict.
Also remove schema tweaks to let package_show call the appropriate schema
function.
2013-02-11 16:34:52 +00:00
John Martin
bdc8206e8b
[ #7 ] Harvest job pages UX are complete
2013-02-08 17:19:04 +00:00
John Martin
7209723856
[ #7 ] Admin templates now are in the correct places
2013-02-08 13:52:48 +00:00
John Martin
0aa1c1fcbc
[ #7 ] Re-jigged harvest source read pages
2013-02-08 12:15:14 +00:00
amercader
3c50a40a76
[ #5 ] Fix auth for harvest_job_list (should forward to harvest_source_update)
2013-02-05 16:41:29 +00:00
amercader
413ef8786c
[ #5 ] Fix counts on jobs listing
2013-02-05 16:40:22 +00:00
amercader
5956e5a9d5
Merge branch '4-new-auth-for-2.0' into 5-improve-job-errors-reporting
2013-02-05 12:36:26 +00:00
amercader
ca7819b885
Merge branch 'release-v2.0' into 2.0-dataset-sources
2013-02-05 12:35:14 +00:00
amercader
cca554c5ec
Fix typo and add missing column on v3 migration script
2013-02-05 12:33:56 +00:00
amercader
e1ce0b7267
[ #5 ] Allow not returning error summary on job dictize
2013-02-04 18:28:45 +00:00
amercader
8576ad6784
[ #5 ] Add job listing page
2013-02-04 18:20:58 +00:00
amercader
22389fc52a
[ #5 ] Update report templates
...
The job details page has been updated to show the full error report, and
the whole report page has been dropped. All job details are loaded via a
snippet, which is also loaded on the harvest source page.
The frontend is still completely provisional.
2013-02-01 18:32:41 +00:00
amercader
42bace3628
[ #5 ] Add new finished field for harvest job
...
When the run command flags a job as finished, it will query the most
recent harvest object for this job and use its import_finished value as
the job finishing time.
2013-01-28 17:19:28 +00:00
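The finish-time rule described above (use the `import_finished` value of the job's most recent harvest object) could be sketched as below. It assumes "most recent" means the latest `import_finished` timestamp and that objects not yet imported carry `None`; the function name is hypothetical.

```python
from datetime import datetime


def job_finished_time(import_finished_values):
    """Return the latest import_finished timestamp, or None if there is none."""
    finished = [value for value in import_finished_values if value is not None]
    return max(finished) if finished else None
```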
amercader
920f07cdf7
[ #5 ] Cleanup the job controller actions
2013-01-28 16:32:53 +00:00
amercader
c8e7086567
[ #5 ] Change default auth for showing and listing jobs
...
Forward auth checks to harvest_source_update instead of
harvest_source_show, as job reports should only be visible to users that
can manage sources.
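The forwarding could look roughly like the sketch below. It is self-contained and does not import CKAN; the body of harvest_source_update and the "managers" key are stand-ins invented for illustration. Only the function names and the {"success": ...} return convention come from CKAN.

```python
def harvest_source_update(context, data_dict):
    # Stand-in check (hypothetical): only users listed as managers
    # of the source are allowed to update it
    allowed = context.get("user") in data_dict.get("managers", [])
    return {"success": allowed}


def harvest_job_show(context, data_dict):
    # Job reports are only visible to users that can manage the source,
    # so the check is forwarded to the update auth function
    return harvest_source_update(context, data_dict)
```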
2013-01-28 16:31:11 +00:00
amercader
ab78bf21b9
[ #5 ] Fix typo in delete auth function
2013-01-28 16:15:38 +00:00
amercader
8431182f01
Document method and cleanup the interface file
2013-01-24 18:39:19 +00:00
amercader
676c7d34b6
[ #5 ] Add method for returning the original URL for a document
...
Harvesters implementing IHarvester can define a `get_original_url`
method that should return a URL pointing to the original location of a
document in the remote server. If present, this URL will be used on the
job reports.
Examples:
* For a CKAN record: http://{ckan-instance}/api/rest/{guid}
* For a WAF record: http://{waf-root}/{file-name}
* For a CSW record: http://{csw-server}/?Request=GetElementById&Id={guid}& ...
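As an illustration, a harvester for the WAF case might define the method as sketched below. The class, its constructor arguments and the harvest_object_id parameter are assumptions made for this example; a real harvester would implement IHarvester and look the file name up from its own data.

```python
class ExampleWAFHarvester:
    """Hypothetical harvester, shown without the IHarvester plumbing."""

    def __init__(self, waf_root, file_names):
        # waf_root: base URL of the remote folder (illustrative)
        # file_names: maps a harvest object id to its remote file name
        self.waf_root = waf_root.rstrip("/")
        self.file_names = file_names

    def get_original_url(self, harvest_object_id):
        # Return None when the original location is not known
        file_name = self.file_names.get(harvest_object_id)
        if file_name is None:
            return None
        return "{0}/{1}".format(self.waf_root, file_name)
```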
2013-01-24 18:35:43 +00:00
amercader
d4b6dcb7f6
[ #5 ] Add helper function for generating a link to a harvest object
2013-01-24 18:21:05 +00:00
amercader
daa9a385ff
Update job keys changed on 9ba6e8f
2013-01-24 17:36:58 +00:00
amercader
30d58b2b7b
[ #5 ] Preliminary job report logic function and page (WIP)
2013-01-23 18:04:19 +00:00
amercader
234f9f4cc0
[ #5 ] Add job summary page
...
Shows dataset and error counts, job details and a summary of the most
frequent errors.
2013-01-23 17:33:44 +00:00
amercader
b2b89dfd61
Add command to reindex all harvest sources
2013-01-22 16:43:36 +00:00
amercader
0d79252a09
Add command to reindex all harvest sources
2013-01-22 16:43:25 +00:00
amercader
6c861afe39
Update template with new harvest source status
2013-01-22 16:37:31 +00:00
amercader
9ba6e8f3b3
[ #5 ] Add error summary to harvest_job_dictize
...
It will return the counts for the 20 most common errors for that
particular job. These will be available when calling harvest_job_show.
Also refactor the harvest source status object to just call
harvest_job_dictize on the 'last_job' key, as it has all the
interesting fields anyway.
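The counting step can be sketched with the standard library as below. This is not the extension's implementation, which presumably works on the database; the function name and the output keys are hypothetical.

```python
from collections import Counter


def summarize_errors(messages, limit=20):
    """Return the `limit` most common error messages with their counts."""
    counts = Counter(messages)
    return [
        {"message": message, "error_count": count}
        for message, count in counts.most_common(limit)
    ]
```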
2013-01-22 13:13:24 +00:00
amercader
30c9eedf5f
Improve harvest source status creation
...
Use report_status field to improve speed, remove unnecessary fields.
2013-01-17 15:43:45 +00:00
amercader
bfce5185f0
[ #4 ] Add db_to_form_schema_options to harvest plugin to avoid validation on show
2013-01-16 17:45:33 +00:00
amercader
2ab10afcf9
[ #4 ] Fix typo in auth functions
2013-01-16 12:56:58 +00:00
amercader
2f4cd3a4b0
[ #4 ] Fix logic functions importer
2013-01-15 19:29:17 +00:00
amercader
2bb669af21
[ #4 ] Add owner_org field to schema and form
...
This should store the owner organization id.
Also added the errors box on the form.
2013-01-10 12:23:01 +00:00
kindly
acb17ff3b0
capture errors more cleanly
2013-01-10 10:48:48 +00:00
amercader
e49dd94b34
[ #4 ] Remove authorization functions for the publisher profile
...
The different profiles will be now configured via the harvest source
datasets on CKAN core, so they are no longer needed.
2013-01-09 17:35:47 +00:00
amercader
288e1429a6
[ #4 ] Remove the loading of different authorization profiles
...
The different profiles will be now configured via the harvest source
datasets on CKAN core, so it is no longer needed.
Also simplify IActions and IAuthFunction hook calls.
2013-01-09 17:32:05 +00:00
amercader
058dcad435
[ #4 ] Minor change on the state field to fix a bug on harvest_source_show
2013-01-09 17:31:30 +00:00
amercader
a866445023
[ #4 ] Refactor authorization functions
...
The authorization functions have been refactored to take into account
both the new organization based authorization on CKAN core and the
harvest source datasets.
Basically at the source level, authorization checks are forwarded to the
relevant package auth function (package_create, package_update, etc.)
which will check for organizations membership, sysadmin, etc.
Also we only use functions available on the plugins toolkit whenever
possible.
2013-01-09 17:26:48 +00:00
amercader
1342463f8a
Merge branch '2.0-dataset-sources' into 4-new-auth-for-2.0
...
Conflicts:
ckanext/harvest/logic/action/get.py
2013-01-09 11:09:34 +00:00
amercader
6b23082010
Move logic from setup_template_variables to helper functions
2013-01-09 11:07:44 +00:00
kindly
7b6beb1470
fix wrong authorization logic
2012-12-24 22:34:37 +00:00
kindly
01dfda59b6
Merge branch 'release-v2.0' into 4-new-auth-for-2.0
2012-12-24 12:46:56 +00:00
kindly
36389e7ce0
make sure gather phase finishes job if there is a severe error
2012-12-24 12:21:21 +00:00
amercader
43950aa4ff
Merge branch 'release-v2.0' into 4-new-auth-for-2.0
...
Conflicts:
ckanext/harvest/logic/action/get.py
ckanext/harvest/tests/test_queue.py
2012-12-20 16:38:57 +00:00
amercader
fdac761fba
Merge branch 'release-v2.0' into 2.0-dataset-sources
...
Conflicts:
ckanext/harvest/logic/action/get.py
ckanext/harvest/tests/test_queue.py
2012-12-20 16:16:30 +00:00
amercader
19cd80b264
[ #4 ] Fixes on the auth layer against the new core auth
...
Thanks @locusf for the original patch
2012-12-20 16:09:26 +00:00
amercader
510e2d3725
Fix pager links in harvest source page
2012-12-19 17:27:05 +00:00
kindly
b940baacc0
make statistics use new report_field
2012-12-18 02:39:14 +00:00
kindly
6b42d96fe0
add report_status field
2012-12-17 23:50:26 +00:00
kindly
596b9bb475
fix auth to use new sysadmin flag
2012-12-17 23:46:43 +00:00
amercader
478326922b
Fix tests
...
* Adapt test_queue to harvest source datasets
* Don't use the same mock harvester on different datasets as it messes
the tests up
* Skip auth tests for the time being
2012-12-14 14:52:19 +00:00
amercader
6df525a377
Reindex the harvest source dataset after finishing jobs
...
This ensures that the status details shown on the harvest sources search
page are up to date (as they are loaded from the indexed data_dict)
2012-12-14 14:27:55 +00:00
amercader
c1b0415cb6
Merge branch 'release-v2.0' into 2.0-dataset-sources
...
Conflicts:
ckanext/harvest/model/__init__.py
2012-12-13 18:33:59 +00:00
amercader
d57e73458a
Make harvest object - package FK deferrable
...
Allows, for example, adding the harvest object id to the package dict
before indexing.
2012-12-13 18:21:40 +00:00
amercader
b424ba1cea
Add flag to avoid returning all objects when getting a job
2012-12-13 18:20:49 +00:00
amercader
0dde483992
Set job status to Finished when actually finishing it
...
Until now, harvest jobs were set to Finished just after sending all
objects to the fetch stage. Now every time the run command is run, jobs
are set to Running, and all previous Running jobs are checked to see if
all harvest objects have a state of Complete or Error. Only then is the
job flagged as Finished.
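The check on Running jobs amounts to something like the sketch below. The state strings follow the COMPLETE and ERROR values named in this log; the function name and the plain list input are hypothetical.

```python
FINAL_STATES = {"COMPLETE", "ERROR"}


def job_is_finished(object_states):
    """True when every harvest object of the job reached a final state.

    Note that an empty list also yields True, so a caller may want to
    treat jobs without objects separately.
    """
    return all(state in FINAL_STATES for state in object_states)
```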
2012-12-13 18:19:22 +00:00
amercader
81c3881a1a
Add active field to source dict
2012-12-13 18:00:07 +00:00
amercader
37efb3b978
Set harvest object state depending on the output of import_stage
...
Either to COMPLETE or ERROR, depending on whether it returns True or
False.
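In code, the mapping could be as simple as the sketch below. The function name is hypothetical, and treating any result other than True as an error is an assumption, since only True and False are mentioned here.

```python
def state_after_import(import_result):
    """Map the return value of import_stage to a harvest object state."""
    # Assumption: anything other than True counts as a failed import
    return "COMPLETE" if import_result is True else "ERROR"
```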
2012-12-13 14:30:13 +00:00
amercader
4da64a84ae
Add more elements to the harvest sources page (still provisional)
2012-12-12 18:49:38 +00:00
amercader
e0f3d47cb9
Add extra information to the harvest source page
...
The status object gives extra information about the source and there is
a helper function to build the dataset list for this particular source.
TODO: Pager still needs fixing.
2012-12-12 11:54:50 +00:00
amercader
b567e562f4
Add after_show extension point
...
We hook into the package_show extension point in order to:
1. For harvest_source type datasets, add extra information about the
source, jobs, etc (calling harvest_source_show_status)
2. For normal datasets, check if they were harvested, and if so, add a
reference to the harvest object and harvest source.
2012-12-12 11:49:55 +00:00
amercader
2557636994
Update endpoints to receive the context object
2012-12-12 11:47:57 +00:00
amercader
8e1621731b
Move harvest source status function as a logic function
...
The status dict is added automatically to harvest source packages.
Note that the actual queries still need to be updated as they probably
won't scale.
2012-12-12 11:45:13 +00:00
amercader
b0407bb2ac
Update harvest_source_show logic function
2012-12-11 12:49:05 +00:00
amercader
fcbe6aa6de
Script for creating harvest source datasets on old versions
...
The way we check whether datasets need to be created might need to be
improved.
2012-12-05 18:54:28 +00:00
amercader
22ec9cb5af
Fix old controller import
2012-12-05 18:53:35 +00:00
amercader
697933f8d0
Add custom harvest source read page (provisional)
2012-12-05 15:47:02 +00:00
amercader
2dba7fbf78
Add custom harvest sources search page
2012-12-05 14:51:20 +00:00
amercader
a605564a41
Fix links to harvest sources page
2012-12-05 13:01:56 +00:00
amercader
d77bf255b4
Finish up create and edit forms, including breadcrumbs, links, etc
2012-11-30 18:53:13 +00:00
amercader
9d83322591
Fix config validator and add tests
2012-11-30 17:02:06 +00:00
amercader
803b228d1c
Update harvest source create and update logic functions
...
`harvest_source_create` and `harvest_source_update` now call
`package_create` and `package_update` respectively, making sure to
define a 'harvest_source' type. The returned dict uses the db_to_form
schema.
2012-11-30 14:11:24 +00:00
amercader
0e0aed0503
Clean up schemas
...
Better naming, remove old ones, ignore __extras field
2012-11-30 13:20:37 +00:00
amercader
875a773f1c
Check if type property is actually there
2012-11-30 11:10:21 +00:00
amercader
7db09fceb0
Various fixes for the harvest source dataset type forms
...
Add a db to form schema to show the fields stored in extras. Validate
the source url on the Package object.
2012-11-29 16:57:20 +00:00
amercader
ab7a379058
Behind the scenes creation and updating of HarvestSource objects
...
Taking advantage of the new after_create/after_update extensions points,
the extension checks if the dataset type is harvest source and creates
or updates the corresponding HarvestSource object. When creating a new
one, it will use the same id as the dataset.
2012-11-29 16:48:44 +00:00
amercader
9d36fd6841
First stub of the new dataset type forms
...
Adds a 'harvest_source' dataset type that mimics the original harvest
source form.
It works against the 3022 branch on CKAN core.
2012-11-29 12:31:48 +00:00
amercader
866fd69730
Do not remove XML declaration and add utf-8 charset to headers
2012-11-20 15:43:39 +00:00
amercader
c52ed3b163
Add line field to object error table
2012-11-20 11:29:58 +00:00
amercader
03fd1884f4
Implement retry times for harvest objects
2012-11-15 18:11:35 +00:00
kindly
202c9d9fcc
use correct queue for gather stage
2012-11-15 14:21:09 +00:00
kindly
c9c1eb4848
use generator to consume
2012-11-15 14:14:55 +00:00
amercader
33d5e09722
Change fetch_callback to properly acknowledge objects
2012-11-15 11:36:06 +00:00
amercader
13357893ad
Fix typo
2012-11-13 14:41:38 +00:00
amercader
54ff0526bb
Return original document if present when requesting an object
2012-11-13 12:06:36 +00:00
amercader
820443d58f
Add cascade option to harvest object extras and errors
2012-11-09 14:52:34 +00:00
kindly
5063626554
make sure state is changed to error on fetch error
2012-11-07 09:53:16 +00:00
kindly
28e5e9137a
add purge queues command
2012-11-07 09:51:25 +00:00
kindly
6db65b5826
made manual default not null
2012-11-05 13:17:32 +00:00
amercader
fdf01c09f2
Fix wrong check for harvest sources
2012-11-01 14:12:45 +00:00
amercader
d598c0707b
Ignore frequency field on the frontend for the time being
2012-11-01 14:12:01 +00:00
amercader
d7f8c9165c
Merge branch 'model_upgrade' into release-v2.0
2012-10-30 18:07:24 +00:00
amercader
d502b925a6
Remove old deprecated tests and some whitespace
2012-10-30 18:07:05 +00:00
amercader
a136cbf202
Fix typos in migration script
2012-10-30 17:52:10 +00:00
amercader
61b99e8eff
Merge branch 'pika' into release-v2.0
2012-10-30 17:31:30 +00:00
amercader
82a498d9fc
Rename function to be implementation independent
2012-10-30 17:13:39 +00:00
kindly
2529a17304
add jobs at certain frequencies
2012-10-29 17:15:02 +00:00