diff --git a/docs/data-provision/deduplication/organizations.md b/docs/data-provision/deduplication/organizations.md
index 2fe5ff4f..0d4902ba 100644
--- a/docs/data-provision/deduplication/organizations.md
+++ b/docs/data-provision/deduplication/organizations.md
@@ -46,6 +46,8 @@ The comparison goes through the following decision tree:
 Organization Decision Tree

+[//]: # (Link to the image: https://docs.google.com/drawings/d/1YKInGGtHu09QG4pT2gRLEum4LxU82d4nKkvGNvRQmrg/edit?usp=sharing)
+
 ### Data Curation
 All the similarity relations drawn by the algorithm involving the decision tree are exposed in OpenOrgs, where are made available to the data curators to give feedbacks and to improve the organizations metadata.
diff --git a/docs/data-provision/deduplication/research-products.md b/docs/data-provision/deduplication/research-products.md
index 7b33ca8f..3000e24b 100644
--- a/docs/data-provision/deduplication/research-products.md
+++ b/docs/data-provision/deduplication/research-products.md
@@ -37,6 +37,8 @@ The comparison goes through different stages:
 Publications Decision Tree

+[//]: # (Link to the image: https://docs.google.com/drawings/d/19SIilTp1vukw6STMZuPMdc0pv0ODYCiOxP7OU3iPWK8/edit?usp=sharing)
+
 #### Software
 For each pair of software in a cluster the following strategy (depicted in the figure below) is applied.
 The comparison goes through different stages:
@@ -48,6 +50,8 @@ The comparison goes through different stages:
 Software Decision Tree

+[//]: # (Link to the image: https://docs.google.com/drawings/d/19gd1-GTOEEo6awMObGRkYFhpAlO_38mfbDFFX0HAkuo/edit?usp=sharing)
+
 #### Datasets and Other types of research products
 For each pair of datasets or other types of research products in a cluster the strategy depicted in the figure below is applied.
 The decision tree is almost identical to the publication decision tree, with the only exception of the *instance type check* stage. Since such type of record does not have a relatable instance type, the check is not performed and the decision tree node is skipped.
@@ -56,6 +60,8 @@ The decision tree is almost identical to the publication decision tree, with the
 Dataset and Other types of research products Decision Tree

+[//]: # (Link to the image: https://docs.google.com/drawings/d/1uBa7Bw2KwBRDUYIfyRr_Keol7UOeyvMNN7MPXYLg4qw/edit?usp=sharing)
+
 ### Duplicates grouping (transitive closure)
 The general concept is that the field coming from the record with higher "trust" value is used as reference for the field of the representative record.
diff --git a/static/img/docs/decisiontree-dataset-orp.png b/static/img/docs/decisiontree-dataset-orp.png
index 4b060b3b..cf121309 100644
Binary files a/static/img/docs/decisiontree-dataset-orp.png and b/static/img/docs/decisiontree-dataset-orp.png differ
diff --git a/static/img/docs/decisiontree-organization.png b/static/img/docs/decisiontree-organization.png
index 11d744db..c3a2a56a 100644
Binary files a/static/img/docs/decisiontree-organization.png and b/static/img/docs/decisiontree-organization.png differ
diff --git a/static/img/docs/decisiontree-publication.png b/static/img/docs/decisiontree-publication.png
index 030c478b..aa703438 100644
Binary files a/static/img/docs/decisiontree-publication.png and b/static/img/docs/decisiontree-publication.png differ
diff --git a/static/img/docs/decisiontree-software.png b/static/img/docs/decisiontree-software.png
index c6db2b72..23c6812b 100644
Binary files a/static/img/docs/decisiontree-software.png and b/static/img/docs/decisiontree-software.png differ