First Import
============

The first import of the organizations should be performed using the SQL script first_import_grid_ac.sql:

1) Download the latest dump from https://www.grid.ac
2) Update the paths in the SQL script
3) Launch the script

If you want to add missing ROR identifiers:

1) Download ror.json from https://figshare.com/collections/ROR_Data/4596503
2) Update the paths in prepare_grid_ror_update.pl and update_ror_ids.sql
3) Launch prepare_grid_ror_update.pl
4) Launch update_ror_ids.sql

NB: The grid.ac dump is richer than the ROR dump: ROR does not include some fields (city, lat, lng) or the hierarchical relationships among the organizations.
If grid.ac is deprecated, we will switch to importing from ROR (a script is available: prepare_import_ror.pl).

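To illustrate the optional ROR step above, here is a minimal Python sketch that builds a GRID-to-ROR lookup from ror.json. It is not the logic of prepare_grid_ror_update.pl. It assumes the dump is a JSON array of records in which "id" holds the ROR identifier and external_ids.GRID.preferred (or .all) holds the GRID identifier; the function names are hypothetical.

```python
import json


def grid_to_ror(records):
    """Map each GRID id found in the records to the ROR id of that record.

    Assumption: each record looks like
    {"id": <ROR id>, "external_ids": {"GRID": {"preferred": ..., "all": ...}}}.
    Records without a usable GRID id are skipped.
    """
    mapping = {}
    for record in records:
        grid = (record.get("external_ids") or {}).get("GRID") or {}
        grid_id = grid.get("preferred") or grid.get("all")
        if isinstance(grid_id, str) and grid_id:
            mapping[grid_id] = record["id"]
    return mapping


def load_grid_to_ror(path):
    """Read a ror.json file (hypothetical local path) and build the lookup."""
    with open(path, encoding="utf-8") as handle:
        return grid_to_ror(json.load(handle))
```

The resulting dictionary could then be used to fill in the ROR identifier of organizations that were imported with only a GRID identifier.
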
General Description
===================

# Schema

Main table:
    organizations
Tables for multiple properties:
    acronyms,
    urls,
    other_ids,
    other_names
Tables for vocabularies:
    countries,
    languages,
    id_types,
    org_types,
    relationships (e.g. child, parent, merged_in, merges, ...)
Tables for conflicts and duplicates:
    oa_conflicts,
    oa_duplicates
Specific views for the UI:
    organizations_view
    organizations_simple_view
    organizations_info_view
    suggestions_info_by_country_view
    oa_duplicates_view
    conflict_groups_view
    duplicate_groups_view
To manage authorizations:
    users,
    user_roles,
    user_countries,
    users_view (VIEW)
Other:
    organizations_id_seq (SEQUENCE to generate new OpenOrg IDs),
    org_index_search (for full-text search),
    tmp_dedup_events (to import new suggestions from DedupWF)

# User Roles

User:
    Can work only on organizations of specific countries
    Can edit metadata of approved organizations
    Can manage duplicates
National Admin:
    All the User rights
    Can work only on organizations of specific countries
    Can approve/register organizations
    Can manage conflicts
    Can approve users of their own countries
Super Admin:
    All the National Admin rights, but for all countries

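The role rules above can be sketched as a small permission check. This is only an illustration in Python: the role values, the right names (edit_metadata, manage_duplicates, ...) and the Account class are hypothetical and do not come from the actual users, user_roles and user_countries tables.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    USER = "user"
    NATIONAL_ADMIN = "national_admin"
    SUPER_ADMIN = "super_admin"


# Hypothetical right names, one per capability listed above.
USER_RIGHTS = frozenset({"edit_metadata", "manage_duplicates"})
ADMIN_RIGHTS = USER_RIGHTS | {"approve_organization", "manage_conflicts", "approve_users"}

RIGHTS = {
    Role.USER: USER_RIGHTS,
    Role.NATIONAL_ADMIN: ADMIN_RIGHTS,
    Role.SUPER_ADMIN: ADMIN_RIGHTS,  # same rights, but not limited by country
}


@dataclass(frozen=True)
class Account:
    role: Role
    countries: frozenset = frozenset()  # country codes the account may work on


def is_allowed(account, action, country):
    """Return True if the account may perform the action on an org of that country."""
    if action not in RIGHTS[account.role]:
        return False
    if account.role is Role.SUPER_ADMIN:
        return True
    return country in account.countries
```

For example, is_allowed(Account(Role.USER, frozenset({"IT"})), "manage_conflicts", "IT") is False, because managing conflicts requires at least the National Admin role.
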
# Actions

1) Create a new org from scratch
    The ID is a valid OpenOrg ID (generated by the system)
    The status is 'approved'

2) Approve a suggested org
    ID: a new org is created with an OpenOrg ID and status = 'approved'
    Status of the old organization: 'duplicate'
    Add a new duplicate to the old ID (status = 'approved')
    Copy the duplicates from the old to the new organization (status will be 'suggested')

3) Approve a suggested duplicate
    in oa_duplicates: reltype = 'is_similar'
    in organizations: the duplicated org will have status = 'duplicate'

4) Discard a suggested duplicate
    in oa_duplicates: reltype = 'is_different'

5) Resolve a conflict using a subset of suggested conflicts (approve)
    Generate a new org
    New org status: 'approved'
    Conflict reltype: 'is_similar'
    Old orgs: 'hidden'
    Rels new <-> old : 'merges'
    Rels old <-> new : 'merged_in'

6) Resolve a conflict using a subset of suggested conflicts (discard)
    Conflict reltype: 'is_different'

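As a rough illustration of actions 3 to 6, the following Python sketch applies the status and reltype changes to in-memory dictionaries standing in for the organizations, oa_duplicates and relationship tables. The function names and data structures are hypothetical, and it assumes that 'merges' points from the new org to each old org and 'merged_in' points back.

```python
def review_duplicate(org_status, duplicates, org_id, dup_id, approve):
    """Action 3 (approve=True) or action 4 (approve=False) on a suggested duplicate.

    org_status: dict org id -> status (stand-in for the organizations table)
    duplicates: dict (org id, duplicate id) -> reltype (stand-in for oa_duplicates)
    """
    duplicates[(org_id, dup_id)] = "is_similar" if approve else "is_different"
    if approve:
        org_status[dup_id] = "duplicate"


def resolve_conflict(org_status, relationships, old_ids, new_id, approve):
    """Action 5 (approve=True) or action 6 (approve=False) on a conflict group.

    relationships: list of (source id, reltype, target id) tuples.
    Returns the reltype to store on the conflict rows.
    """
    if not approve:
        return "is_different"
    org_status[new_id] = "approved"  # the newly generated org
    for old_id in old_ids:
        org_status[old_id] = "hidden"
        relationships.append((new_id, "merges", old_id))
        relationships.append((old_id, "merged_in", new_id))
    return "is_similar"
```

In the real system the new OpenOrg ID would come from organizations_id_seq; here it is simply passed in as new_id.
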
# Loading new suggestions using a Dedup Workflow
The dedup workflow writes the suggestions to the tmp_dedup_events table; at the end, it calls the method /import/dedupEvents
The previous suggestions (orgs, duplicates and conflicts) are deleted
The suggestions are moved from the temp table according to these rules:
    1) not(isOpenOrg(oa_original_id)) AND (oa_original_id = local_id OR isEmpty(local_id)) -> suggested org
    2) not(isOpenOrg(oa_original_id)) AND (oa_original_id != local_id OR isEmpty(local_id)) -> duplicate of a suggested org
    3) isOpenOrg(oa_original_id) AND (oa_original_id != local_id OR isEmpty(local_id)) -> duplicate of an existing OpenOrg
    4) Create a group using 'group_id'; it should contain only OpenOrg IDs (taken from oa_original_id and local_id): each pair in the group is a conflict

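The rules above can be read as plain predicates. The Python sketch below evaluates rules 1 to 3 literally and builds the conflict pairs of rule 4. It is not the code behind /import/dedupEvents: the OpenOrg ID prefix used by is_open_org is an assumption, and events are modelled as dictionaries keyed by the column names mentioned above. Read literally, rules 1 and 2 both hold when local_id is empty, so the sketch reports every matching rule instead of picking one.

```python
from itertools import combinations

# Assumption: OpenOrg IDs are recognised by this prefix.
OPENORG_PREFIX = "openorgs____::"


def is_open_org(org_id):
    """True if the id is non-empty and looks like an OpenOrg ID."""
    return bool(org_id) and org_id.startswith(OPENORG_PREFIX)


def matching_rules(oa_original_id, local_id):
    """Return the numbers of rules 1-3 whose condition holds for one event."""
    empty = not local_id
    rules = []
    if not is_open_org(oa_original_id) and (oa_original_id == local_id or empty):
        rules.append(1)  # suggested org
    if not is_open_org(oa_original_id) and (oa_original_id != local_id or empty):
        rules.append(2)  # duplicate of a suggested org
    if is_open_org(oa_original_id) and (oa_original_id != local_id or empty):
        rules.append(3)  # duplicate of an existing OpenOrg
    return rules


def conflict_pairs(events):
    """Rule 4: group OpenOrg IDs by group_id; every pair in a group is a conflict."""
    groups = {}
    for event in events:
        for org_id in (event.get("oa_original_id"), event.get("local_id")):
            if is_open_org(org_id):
                groups.setdefault(event["group_id"], set()).add(org_id)
    pairs = []
    for ids in groups.values():
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)
```

A group holding three OpenOrg IDs therefore yields three conflicts, one for each unordered pair.
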