XML serialisation of instances with the same URLs #167

Merged
claudio.atzori merged 2 commits from instance_group_by_url into beta 2 years ago
Owner

This PR integrates the changes needed to adapt the instance XML serialization procedure as described in #7156 (https://support.openaire.eu/issues/7156).

Instances with multiple URLs

  • We can choose one, because ARC told us that they only display the first one anyway.
  • Decision: CNR can pick one URL for the instance based on one of the following strategies:
    1. Take the first URL. For simplicity we start with this approach.
    2. Take a URL that is a DOI, a Handle, or anything else we can recognise. If none is available, fall back on option 1.
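
The two strategies above could be sketched as follows. This is only an illustration, not the code in this PR: the function name `pick_url`, the `prefer_pid` flag and the `PID_HOSTS` set are assumptions, and recognising a DOI or Handle by its resolver host is just one possible heuristic.

```python
from urllib.parse import urlparse

# Hypothetical set of resolver hosts treated as recognisable persistent identifiers.
PID_HOSTS = {"doi.org", "dx.doi.org", "hdl.handle.net"}


def pick_url(urls, prefer_pid=False):
    """Return the single URL to keep for an instance, or None if it has none."""
    if not urls:
        return None
    if prefer_pid:
        # Strategy 2: the first URL served by a known PID resolver wins.
        for url in urls:
            host = (urlparse(url).hostname or "").lower()
            if host in PID_HOSTS:
                return url
    # Strategy 1, also the fallback of strategy 2: take the first URL.
    return urls[0]
```

With `prefer_pid=False` the function always implements strategy 1, which matches the starting point chosen in the decision.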

Instances with different URLs that resolve to the same page

Decision: we cannot do anything about it. We live with it.

Instances with the same URL

  • After the previous step, each instance has one URL, but different instances can still share the same URL.
    Decision: merge all instances with the same URL into one.
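
As a rough sketch of the grouping step, assuming each instance is represented as a dictionary with a single `url` key (a hypothetical representation, not the model used in the codebase):

```python
from collections import defaultdict


def group_by_url(instances):
    """Group instances by their single URL, keeping first-seen order."""
    groups = defaultdict(list)
    for instance in instances:
        groups[instance["url"]].append(instance)
    # Return a plain dict so that looking up a missing URL raises KeyError.
    return dict(groups)
```

Each resulting group holds the instances that would then be merged into one.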

Merged instances will feature

  • the most open accessright
  • multiple collectedfrom (to be shown comma-separated in the Provider)
  • multiple hostedby; the unknown repository is ruled out when a valid repository exists
  • apc sum and currency will not be repeatable. If, for reasons we cannot imagine, two different APC values are attached to the same URL, the last one wins.
  • all other fields: allow multiple unique values
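
The merge rules listed above might look like this in a minimal sketch. Everything concrete here is an assumption made for illustration: instances are dictionaries with one string value per field, `apc` is an (amount, currency) pair, the openness ranking in `ACCESS_RIGHT_RANK` and the `UNKNOWN_REPOSITORY` name are hypothetical, and `merge_group` is not the function used in this PR.

```python
# Hypothetical ranking of access rights, from most to least open.
ACCESS_RIGHT_RANK = ["OPEN", "EMBARGO", "RESTRICTED", "CLOSED", "UNKNOWN"]
# Hypothetical name of the placeholder repository.
UNKNOWN_REPOSITORY = "Unknown Repository"
# Fields that have their own merge rule.
SPECIAL_FIELDS = {"url", "accessright", "collectedfrom", "hostedby", "apc"}


def unique(values):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(values))


def openness_rank(access_right):
    """Lower is more open; unrecognised values rank last."""
    if access_right in ACCESS_RIGHT_RANK:
        return ACCESS_RIGHT_RANK.index(access_right)
    return len(ACCESS_RIGHT_RANK)


def merge_group(group):
    """Merge a non-empty list of instances that share one URL."""
    merged = {"url": group[0]["url"]}
    # Most open access right wins.
    merged["accessright"] = min((i["accessright"] for i in group), key=openness_rank)
    # Keep every distinct provider.
    merged["collectedfrom"] = unique(i["collectedfrom"] for i in group)
    # Keep every distinct host, dropping the unknown repository if a valid one exists.
    hosts = unique(i["hostedby"] for i in group)
    valid_hosts = [h for h in hosts if h != UNKNOWN_REPOSITORY]
    merged["hostedby"] = valid_hosts or hosts
    # APC is not repeatable: the last instance carrying one wins.
    for instance in group:
        if "apc" in instance:
            merged["apc"] = instance["apc"]
    # All other fields: collect the unique values.
    other_keys = unique(k for i in group for k in i if k not in SPECIAL_FIELDS)
    for key in other_keys:
        merged[key] = unique(i[key] for i in group if key in i)
    return merged
```

The comma-separated Provider display would then be something like `", ".join(merged["collectedfrom"])`.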
claudio.atzori added the enhancement label 2 years ago
alessia.bardi was assigned by claudio.atzori 2 years ago
miriam.baglioni was assigned by claudio.atzori 2 years ago
claudio.atzori self-assigned this 2 years ago
claudio.atzori added 1 commit 2 years ago
claudio.atzori added 1 commit 2 years ago
claudio.atzori merged commit 372633880f into beta 2 years ago
The pull request has been merged as 372633880f.
You can also view command line instructions.

Step 1:

From your project repository, check out a new branch and test the changes.
git checkout -b instance_group_by_url beta
git pull origin instance_group_by_url

Step 2:

Merge the changes and update on Gitea.
git checkout beta
git merge --no-ff instance_group_by_url
git push origin beta
Reference: D-Net/dnet-hadoop#167