diff --git a/docs/data-provision/deduplication/organizations.md b/docs/data-provision/deduplication/organizations.md
index 2fe5ff4..0d4902b 100644
--- a/docs/data-provision/deduplication/organizations.md
+++ b/docs/data-provision/deduplication/organizations.md
@@ -46,6 +46,8 @@ The comparison goes through the following decision tree:
 Organization Decision Tree

+[//]: # (Link to the image: https://docs.google.com/drawings/d/1YKInGGtHu09QG4pT2gRLEum4LxU82d4nKkvGNvRQmrg/edit?usp=sharing)
+
 ### Data Curation
 All the similarity relations drawn by the algorithm involving the decision tree are exposed in OpenOrgs, where are made available to the data curators to give feedbacks and to improve the organizations metadata.
diff --git a/docs/data-provision/deduplication/research-products.md b/docs/data-provision/deduplication/research-products.md
index 7b33ca8..3000e24 100644
--- a/docs/data-provision/deduplication/research-products.md
+++ b/docs/data-provision/deduplication/research-products.md
@@ -37,6 +37,8 @@ The comparison goes through different stages:
 Publications Decision Tree

+[//]: # (Link to the image: https://docs.google.com/drawings/d/19SIilTp1vukw6STMZuPMdc0pv0ODYCiOxP7OU3iPWK8/edit?usp=sharing)
+
 #### Software
 For each pair of software in a cluster the following strategy (depicted in the figure below) is applied.
 The comparison goes through different stages:
@@ -48,6 +50,8 @@ The comparison goes through different stages:
 Software Decision Tree

+[//]: # (Link to the image: https://docs.google.com/drawings/d/19gd1-GTOEEo6awMObGRkYFhpAlO_38mfbDFFX0HAkuo/edit?usp=sharing)
+
 #### Datasets and Other types of research products
 For each pair of datasets or other types of research products in a cluster the strategy depicted in the figure below is applied.
 The decision tree is almost identical to the publication decision tree, with the only exception of the *instance type check* stage. Since such type of record does not have a relatable instance type, the check is not performed and the decision tree node is skipped.
@@ -56,6 +60,8 @@ The decision tree is almost identical to the publication decision tree, with the
 Dataset and Other types of research products Decision Tree

+[//]: # (Link to the image: https://docs.google.com/drawings/d/1uBa7Bw2KwBRDUYIfyRr_Keol7UOeyvMNN7MPXYLg4qw/edit?usp=sharing)
+
 ### Duplicates grouping (transitive closure)
 The general concept is that the field coming from the record with higher "trust" value is used as reference for the field of the representative record.
diff --git a/static/img/docs/decisiontree-dataset-orp.png b/static/img/docs/decisiontree-dataset-orp.png
index 4b060b3..cf12130 100644
Binary files a/static/img/docs/decisiontree-dataset-orp.png and b/static/img/docs/decisiontree-dataset-orp.png differ
diff --git a/static/img/docs/decisiontree-organization.png b/static/img/docs/decisiontree-organization.png
index 11d744d..c3a2a56 100644
Binary files a/static/img/docs/decisiontree-organization.png and b/static/img/docs/decisiontree-organization.png differ
diff --git a/static/img/docs/decisiontree-publication.png b/static/img/docs/decisiontree-publication.png
index 030c478..aa70343 100644
Binary files a/static/img/docs/decisiontree-publication.png and b/static/img/docs/decisiontree-publication.png differ
diff --git a/static/img/docs/decisiontree-software.png b/static/img/docs/decisiontree-software.png
index c6db2b7..23c6812 100644
Binary files a/static/img/docs/decisiontree-software.png and b/static/img/docs/decisiontree-software.png differ