Enrich authors with ORCID info using new matching algorithm #398

Merged
giambattista.bloisi merged 2 commits from new_orcid_enhancement into beta 1 month ago
Collaborator

The new author enrichment strategy for ORCID takes a multi-pass approach: different matching algorithms are applied in order of their expected reliability (from highest to lowest), so that each pass works only on the authors left unmatched by the previous passes:

  1. exact full name match, with the ORCID full name reconstructed as givenName + familyName
  2. reversed full name match, with the ORCID full name reconstructed as familyName + givenName
  3. split author names into tokens, sort the tokens, then check for matches between the ordered full tokens or their abbreviations
  4. exact match against the ORCID creditName
  5. exact match against any of the ORCID otherNames
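
The passes above can be sketched roughly as follows. This is a minimal illustration in Python, not the code merged in this PR: the `OrcidAuthor` class, the `norm`, `tokens_match` and `enrich` helpers, the pass labels and the sample ORCID iDs are all hypothetical. It also assumes that a matched ORCID record is removed from later passes along with the matched author, and that an abbreviation is a single-letter token, neither of which the description states.

```python
import re
from dataclasses import dataclass, field


@dataclass
class OrcidAuthor:
    # Hypothetical record shape; field names mirror the ORCID terms in the description.
    orcid: str
    given_name: str
    family_name: str
    credit_name: str = ""
    other_names: list = field(default_factory=list)


def norm(name):
    # Lowercase, replace punctuation with spaces, collapse whitespace.
    return " ".join(re.sub(r"[^\w\s]", " ", name.lower()).split())


def tokens_match(a, b):
    # Pass 3 (simplified): sort the tokens of both names and compare pairwise.
    # A one-letter token is treated as an abbreviation of a token with the same initial.
    ta, tb = sorted(norm(a).split()), sorted(norm(b).split())
    if not ta or len(ta) != len(tb):
        return False
    return all(
        x == y or (min(len(x), len(y)) == 1 and x[0] == y[0])
        for x, y in zip(ta, tb)
    )


# Passes ordered from most to least reliable, as in the list above.
PASSES = [
    ("fullName", lambda a, o: norm(a) == norm(f"{o.given_name} {o.family_name}")),
    ("reversedFullName", lambda a, o: norm(a) == norm(f"{o.family_name} {o.given_name}")),
    ("orderedTokens", lambda a, o: tokens_match(a, f"{o.given_name} {o.family_name}")),
    ("creditName", lambda a, o: bool(o.credit_name) and norm(a) == norm(o.credit_name)),
    ("otherNames", lambda a, o: any(norm(a) == norm(n) for n in o.other_names)),
]


def enrich(authors, orcid_authors):
    # Each pass only sees the authors (and, by assumption, the ORCID records)
    # that earlier passes left unmatched.
    unmatched_authors = list(authors)
    unmatched_orcids = list(orcid_authors)
    result = {}
    for label, matches in PASSES:
        for author in list(unmatched_authors):
            hit = next((o for o in unmatched_orcids if matches(author, o)), None)
            if hit is not None:
                result[author] = (hit.orcid, label)
                unmatched_authors.remove(author)
                unmatched_orcids.remove(hit)
    return result


# Made-up sample data, one author per pass plus one that never matches.
orcids = [
    OrcidAuthor("0000-0000-0000-0001", "John", "Smith"),
    OrcidAuthor("0000-0000-0000-0002", "Maria", "Rossi"),
    OrcidAuthor("0000-0000-0000-0003", "Wei", "Zhang"),
    OrcidAuthor("0000-0000-0000-0004", "Robert", "Miller", credit_name="Bob Miller"),
    OrcidAuthor("0000-0000-0000-0005", "Anna", "Bianchi", other_names=["Anna Verdi"]),
]
authors = ["John Smith", "Rossi, Maria", "W. Zhang", "Bob Miller", "Anna Verdi", "Nobody Known"]
matched = enrich(authors, orcids)
```

Running the stricter passes first means that a loose rule such as the token comparison can only claim authors that no exact rule has already resolved, which is the point of ordering the passes by reliability.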
giambattista.bloisi added 1 commit 2 months ago
giambattista.bloisi requested review from sandro.labruzzo 2 months ago
giambattista.bloisi requested review from claudio.atzori 2 months ago
giambattista.bloisi requested review from miriam.baglioni 2 months ago
Collaborator

To me it is OK
giambattista.bloisi added 1 commit 1 month ago
giambattista.bloisi merged commit 3f22c101d9 into beta 1 month ago
The pull request has been merged as 3f22c101d9.
Reference: D-Net/dnet-hadoop#398