hostedbymap #136

Merged
claudio.atzori merged 60 commits from hostedbymap into beta 3 years ago
Collaborator

This PR is related to ticket #5644.

It builds, on demand, a new hostedbymap (HBM) by exploiting two external sources:

- https://pub.uni-bielefeld.de/download/2944717/2944718/issn_gold_oa_version_4.csv
- https://doaj.org/csv

It then applies the HBM to the entities in the graph. The entities affected are the Datasources and the Publications.

A Datasource is modified in its `openairecompatibility` field, and only if the classid of that field is set to UNKNOWN. In that case the classid is set to `hostedBy` and the classname becomes "collected from a compatible aggregator".

The Publications affected by the HBM are those having only one instance. For those with journal information that matches an entry in the HBM, the hostedby information in the instance is modified according to the HBM. If the map marks the journal identifier as `OpenAccess`, the `accessright` of the publication is set to OPEN and the `OpenAccessRoute` to `hybrid`. The `bestaccessright` at the level of the result may also change.
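The rules described above can be sketched roughly as follows. This is only an illustration in Python, not the code in this PR: the class names, field names and the shape of an HBM record (`HbmEntry`, `journal_id`, `compat_classid`, and so on) are hypothetical. It also assumes that only datasources listed in the map are touched, and that `bestaccessright` becomes OPEN when the instance becomes OPEN.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

UNKNOWN = "UNKNOWN"


@dataclass
class HbmEntry:
    # Hypothetical shape of one hostedbymap record, keyed by journal identifier.
    datasource_id: str
    datasource_name: str
    open_access: bool


@dataclass
class Datasource:
    id: str
    compat_classid: str = UNKNOWN    # stands in for openairecompatibility.classid
    compat_classname: str = UNKNOWN  # stands in for openairecompatibility.classname


@dataclass
class Instance:
    hostedby_id: str
    hostedby_name: str
    accessright: str = UNKNOWN
    open_access_route: Optional[str] = None


@dataclass
class Publication:
    journal_id: Optional[str]
    instances: List[Instance]
    bestaccessright: str = UNKNOWN


def apply_to_datasource(ds: Datasource, hbm: Dict[str, HbmEntry]) -> bool:
    """Update the compatibility of a datasource; return True if it changed."""
    # Assumption: only datasources that appear in the map are considered.
    in_map = any(entry.datasource_id == ds.id for entry in hbm.values())
    if not in_map or ds.compat_classid != UNKNOWN:
        return False
    ds.compat_classid = "hostedBy"
    ds.compat_classname = "collected from a compatible aggregator"
    return True


def apply_to_publication(pub: Publication, hbm: Dict[str, HbmEntry]) -> bool:
    """Rewrite hostedby (and possibly access rights); return True if it changed."""
    # Only publications with exactly one instance and journal information qualify.
    if len(pub.instances) != 1 or pub.journal_id is None:
        return False
    entry = hbm.get(pub.journal_id)
    if entry is None:
        return False
    instance = pub.instances[0]
    instance.hostedby_id = entry.datasource_id
    instance.hostedby_name = entry.datasource_name
    if entry.open_access:
        instance.accessright = "OPEN"
        instance.open_access_route = "hybrid"
        # Assumption: OPEN is the best access right a result can have.
        pub.bestaccessright = "OPEN"
    return True
```

Both functions return a flag, so a caller can count how many entities the map actually touched. Publications with several instances, or with no match in the map, are left unchanged.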
miriam.baglioni added the enhancement label 3 years ago
claudio.atzori was assigned by miriam.baglioni 3 years ago
miriam.baglioni added 37 commits 3 years ago
miriam.baglioni added 1 commit 3 years ago
miriam.baglioni added 1 commit 3 years ago
miriam.baglioni added 2 commits 3 years ago
claudio.atzori requested changes 3 years ago
claudio.atzori left a comment
Owner
  • It seems that many files not related to the implementation of this new feature were changed. In particular, `ISSN` (uppercase) was replaced with `issn` (lowercase) in many files that have nothing to do with this PR. Please check and revert them to their previous state.
  • On `dhp-workflows/dhp-graph-mapper/pom.xml` I see a new dependency:
<dependency>
   <groupId>com.opencsv</groupId>
   <artifactId>opencsv</artifactId>
   <version>5.5</version>
</dependency>

I have nothing against the introduction of new libraries, but two comments on this:

  1. Didn't we already manage CSVs with an external library? If so, why do we need yet another one?
  2. Dependency versions should be declared only in the external pom file; all the submodules must refer to them without specifying the version.
  • On `dhp-workflows/dhp-doiboost/pom.xml` I see a new dependency on `dhp-aggregation`. In general, common utilities needed by more than one submodule should be moved to `dhp-common`; let's try to keep the dependencies well organised.
miriam.baglioni added 14 commits 3 years ago
miriam.baglioni added 2 commits 3 years ago
miriam.baglioni added 1 commit 3 years ago
miriam.baglioni added 1 commit 3 years ago
Poster
Collaborator

I have reverted all the ISSN changes in the files. I have also removed the unneeded dependencies from the poms.
The new library for importing CSV files should replace the previous one. A new PR will follow for the refactoring.

claudio.atzori added 1 commit 3 years ago
claudio.atzori merged commit e91ffcd2f3 into beta 3 years ago

Reference: D-Net/dnet-hadoop#136