First Import
============

The first import of the organizations should be performed using the sql script: first_import_grid_ac.sql

1) Download the latest dump from https://www.grid.ac
2) Update the paths in the sql script
3) Launch the script

If you want to add missing ROR identifiers:

1) Download ror.json from https://figshare.com/collections/ROR_Data/4596503
2) Update the paths in prepare_grid_ror_update.pl and update_ror_ids.sql
3) Launch prepare_grid_ror_update.pl
4) Launch update_ror_ids.sql

NB: The grid.ac dump is richer than the ROR dump: ROR does not include some fields (city, lat, lng) or the hierarchical relationships among the organizations.
If grid.ac is DEPRECATED, we'll start using the import from ROR (a script is available: prepare_import_ror.pl)

General Description
===================

# Schema

Main table: organizations

Tables for multiple properties: acronyms, urls, other_ids, other_names

Tables for vocabularies: countries, languages, id_types, org_types, relationships (i.e.: child, parent, merged_in, merges, ...)

Tables for conflicts and duplicates: oa_conflicts, oa_duplicates

Specific views for the UI:
    organizations_view
    organizations_simple_view
    organizations_info_view
    suggestions_info_by_country_view
    oa_duplicates_view
    conflict_groups_view
    duplicate_groups_view

To manage authorizations: users, user_roles, user_countries, users_view (VIEW)

Other: organizations_id_seq (SEQUENCE to generate new OpenOrg IDs), org_index_search (for fulltext search), tmp_dedup_events (to import new suggestions from DedupWF)

# User Roles

User:
    He can work only on organizations of specific countries
    He can edit metadata of approved organizations
    He can manage duplicates

National Admin:
    All the User rights
    He can work only on organizations of specific countries
    He can approve/register organizations
    He can manage conflicts
    He can approve users of his own countries

Super Admin:
    All the National Admin rights, but for all countries

# Actions

1) Create a new org from scratch
    The ID is a valid OpenOrgId (generated by the system)
    The status is 'approved'

2) Approve a suggested org (prefix: pending_org_::)
    ID: A new org is created with OpenOrg Id and status='approved'
    Copy the duplicates from the old to the new organization (status will be 'suggested')
    The pending org is deleted

3) Approve a suggested duplicate (the status of the duplicates is always 'raw')
    in oa_duplicates: reltype = 'is_similar'

4) Discard a suggested duplicate
    in oa_duplicates: reltype = 'is_different'

5) Resolve a conflict using a subset of suggested conflicts (approve)
    Generate a new org
    New org status: 'approved'
    Conflict reltype: 'is_similar'
    Old orgs status: 'hidden'
    Rels new <-> old : 'merges'
    Rels old <-> new : 'merged_in'

6) Resolve a conflict using a subset of suggested conflicts (discard)
    Conflict reltype: 'is_different'

# Load of new suggestions using a Dedup Workflow

The dedup wf writes the suggestions to tmp_dedup_events; at the end it calls the method /import/dedupEvents

The previous suggestions (orgs, dups and conflicts) are deleted

The suggestions are moved from the temp table according to:

1) not(isOpenOrg(oa_original_id)) AND (oa_original_id = local_id OR isEmpty(local_id)) -> new suggested org with id = 'pending_org_::...'
2) not(isOpenOrg(oa_original_id)) AND (oa_original_id != local_id OR isEmpty(local_id)) -> duplicate of a suggested org
3) isOpenOrg(oa_original_id) AND (oa_original_id != local_id OR isEmpty(local_id)) -> duplicate of an existing OpenOrg
4) Create a group using 'group_id', it should contain only OpenOrg Ids (using oa_original_id and local_id): each pair in the group is a conflict
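
The four rules for moving suggestions out of the temp table can be read as plain predicates. The sketch below is only an illustration in Python, not the actual import code. The function names are hypothetical, and the OpenOrg prefix is an assumption, since this document only shows the 'pending_org_::' prefix. As written, rules 1 and 2 both hold when local_id is empty, so the sketch reports every matching rule instead of guessing a precedence.

```python
from itertools import combinations

# Assumption: placeholder prefix for OpenOrg ids (not stated in this document).
OPENORG_PREFIX = "openorgs____::"


def is_open_org(org_id):
    """True if the id looks like an OpenOrg id (assumed prefix check)."""
    return bool(org_id) and org_id.startswith(OPENORG_PREFIX)


def is_empty(value):
    """True for None or a blank string."""
    return value is None or value.strip() == ""


def matching_rules(oa_original_id, local_id):
    """Return the numbers of rules 1-3 whose condition holds, exactly as written."""
    open_org = is_open_org(oa_original_id)
    same_or_empty = oa_original_id == local_id or is_empty(local_id)
    diff_or_empty = oa_original_id != local_id or is_empty(local_id)

    rules = []
    if not open_org and same_or_empty:
        rules.append(1)  # new suggested org (id = 'pending_org_::...')
    if not open_org and diff_or_empty:
        rules.append(2)  # duplicate of a suggested org
    if open_org and diff_or_empty:
        rules.append(3)  # duplicate of an existing OpenOrg
    return rules


def conflict_pairs(group_ids):
    """Rule 4: keep only OpenOrg ids; every pair in the group is a conflict."""
    open_orgs = sorted({i for i in group_ids if is_open_org(i)})
    return list(combinations(open_orgs, 2))
```

For example, under these assumptions, matching_rules("grid::1", "grid::1") matches only rule 1, while a group with three OpenOrg ids yields three conflict pairs.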