From 66bde9453b0506f8036ff34af8be429e050463bf Mon Sep 17 00:00:00 2001
From: "michele.artini"
Date: Tue, 2 Mar 2021 10:45:13 +0100
Subject: [PATCH] readme

---
 .../dnet-orgs-database-application/README.txt | 95 +++++++++++++++++++
 1 file changed, 95 insertions(+)

diff --git a/apps/dnet-orgs-database-application/README.txt b/apps/dnet-orgs-database-application/README.txt
index 3ff268d2..b7a663b3 100644
--- a/apps/dnet-orgs-database-application/README.txt
+++ b/apps/dnet-orgs-database-application/README.txt
@@ -1,3 +1,5 @@
+First Import
+============
 The first import of the organizations should be performed using the sql script: first_import_grid_ac.sql
 
 1) Download the last dump from https://www.grid.ac
@@ -13,3 +15,96 @@ If you want to add missing ROR identifiers:
 
 NB: The grid.ac dump is richer then ror dump, Ror does not consider some fiels (city, lat, lng) and hierarchical relationships among the organizations.
 If grid.ac will be DEPRACATED we'll start using the import from ror (a script is available: prepare_import_ror.pl)
+
+
+General Description
+===================
+
+# Schema
+
+Main table:
+    organizations
+Tables for Multiple properties:
+    acronyms,
+    urls,
+    other_ids,
+    other_names
+Tables for vocabularies:
+    countries,
+    languages,
+    id_types,
+    org_types,
+    relationships (ie: child, parent, merged_in, merges, ...)
+Tables for conflicts and duplicates:
+    oa_conflicts,
+    oa_duplicates
+Specific Views for the UI:
+    organizations_view
+    organizations_simple_view
+    organizations_info_view
+    suggestions_info_by_country_view
+    oa_duplicates_view
+    conflict_groups_view
+    duplicate_groups_view
+To manage authorizations:
+    users,
+    user_roles,
+    user_countries,
+    users_view (VIEW)
+Other:
+    organizations_id_seq (SEQUENCE to generate new OpenOrg IDs),
+    org_index_search (for fulltext search),
+    tmp_dedup_events (to import new suggestions from the Dedup workflow)
+
+
+# User Roles
+
+User:
+    He can work only on organizations of specific countries
+    He can edit metadata of approved organizations
+    He can manage duplicates
+National Admin:
+    All the User rights
+    He can work only on organizations of specific countries
+    He can approve/register organizations
+    He can manage conflicts
+    He can approve users of his own countries
+Super Admin:
+    All the National Admin rights, but for all countries
+
+# Actions
+
+1) Create a new org from scratch
+    The ID is a valid OpenOrg ID (generated by the system)
+    The status is 'approved'
+
+2) Approve a suggested org
+    ID: A new org is created with an OpenOrg ID and status = 'approved'
+    Status of the old organization: 'duplicate'
+    Add a new duplicate to the old ID (status = 'approved')
+    Copy the duplicates from the old to the new organization (status will be 'suggested')
+
+3) Approve a suggested duplicate
+    in oa_duplicates: reltype = 'is_similar'
+    in organizations: the duplicated org will have status = 'duplicate'
+4) Discard a suggested duplicate
+    in oa_duplicates: reltype = 'is_different'
+
+5) Resolve a conflict using a subset of suggested conflicts (approve)
+    Generate a new org
+    New org status: 'approved'
+    Conflict reltype: 'is_similar'
+    Old orgs: 'hidden'
+    Rels new <-> old : 'merges'
+    Rels old <-> new : 'merged_in'
+6) Resolve a conflict using a subset of suggested conflicts (discard)
+    Conflict reltype: 'is_different'
+
+# Load of new suggestions using a Dedup Workflow
+    The dedup wf writes the suggestions to the tmp_dedup_events table; at the end it calls the method /import/dedupEvents
+    The previous suggestions (orgs, dups and conflicts) are deleted
+    The suggestions are moved from the temp table according to:
+        1) not(isOpenOrg(oa_original_id)) AND (oa_original_id = local_id OR isEmpty(local_id)) -> suggested org
+        2) not(isOpenOrg(oa_original_id)) AND (oa_original_id != local_id OR isEmpty(local_id)) -> duplicate of a suggested org
+        3) isOpenOrg(oa_original_id) AND (oa_original_id != local_id OR isEmpty(local_id)) -> duplicate of an existing OpenOrg
+        4) Create a group using 'group_id'; it should contain only OpenOrg IDs (using oa_original_id and local_id): each pair in the group is a conflict
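
Note on the dedup import rules: the four rules above can be sketched in Python as follows. This is only an illustration, not the application's actual implementation (the README says the real logic runs behind /import/dedupEvents). The prefix used by is_open_org, the returned labels and the function names are hypothetical. Rules 1 and 2 both match when local_id is empty, so the sketch assumes the rules are tried in order and the first match wins.

```python
from itertools import combinations

# Hypothetical prefix, used only for this sketch: the README does not say
# how isOpenOrg recognises an OpenOrg ID.
OPENORG_PREFIX = "openorgs____::"


def is_open_org(org_id):
    """Stand-in for the isOpenOrg predicate mentioned in the README."""
    return bool(org_id) and org_id.startswith(OPENORG_PREFIX)


def classify_event(oa_original_id, local_id):
    """Apply rules 1-3 in order; first match wins (an assumption)."""
    same_or_empty = (oa_original_id == local_id) or not local_id
    diff_or_empty = (oa_original_id != local_id) or not local_id
    if not is_open_org(oa_original_id) and same_or_empty:
        return "suggested_org"                  # rule 1
    if not is_open_org(oa_original_id) and diff_or_empty:
        return "duplicate_of_suggested_org"     # rule 2
    if is_open_org(oa_original_id) and diff_or_empty:
        return "duplicate_of_existing_openorg"  # rule 3
    return None                                 # no rule applies


def conflicts_by_group(events):
    """Rule 4: events are (group_id, oa_original_id, local_id) tuples.

    Only OpenOrg IDs are kept in each group; every pair is a conflict.
    """
    groups = {}
    for group_id, oa_original_id, local_id in events:
        members = groups.setdefault(group_id, set())
        for org_id in (oa_original_id, local_id):
            if is_open_org(org_id):
                members.add(org_id)
    return {g: list(combinations(sorted(m), 2)) for g, m in groups.items()}
```

For example, classify_event("grid::1", "grid::2") falls under rule 2, while a group holding three distinct OpenOrg IDs yields three conflict pairs.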