author name parsing
#220
Merged
claudio.atzori merged 3 commits from author_name_particles into beta 2 years ago
When parsing author names, the file name_particles.txt is used to identify particles such as van, der, Dr., Mr., etc., and to discard them when identifying the name and the surname. I suspect this file was left behind in some refactoring and, in fact, was no longer being used. This PR moves it back to the proper classpath location and introduces a few non-functional improvements to its implementation.

Via the helpdesk (ticket #181), a user confirmed that the names with particles (which she calls "insertions") are:
She also confirmed that this applies not only to Dutch names but also to names from other countries (e.g. von).
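As an illustration of the particle handling the PR description refers to, here is a minimal Python sketch. It is not the code changed in this PR: the hard-coded PARTICLES set is a hypothetical stand-in for the contents of name_particles.txt, the functions strip_particles and split_name are invented for this example, and the sample name is made up.

```python
# Hypothetical stand-in for name_particles.txt (assumed: one particle per line).
PARTICLES = {"van", "der", "von", "dr.", "mr."}


def strip_particles(full_name, particles=PARTICLES):
    """Return the tokens of full_name that are not listed as particles."""
    tokens = full_name.replace(",", " ").split()
    # Compare case-insensitively so "Van" and "van" are treated alike.
    return [t for t in tokens if t.lower() not in particles]


def split_name(full_name):
    """Naive split: the last remaining token is the surname, the rest the name."""
    tokens = strip_particles(full_name)
    if not tokens:
        return ("", "")
    return (" ".join(tokens[:-1]), tokens[-1])


if __name__ == "__main__":
    # Made-up example name, used only to exercise the sketch.
    print(split_name("Dr. Jan van der Berg"))
```

A scheme like this drops the particle entirely, so the surname it returns no longer contains "van der".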
Merged commit cba9c2b7cc into beta 2 years ago.

We thought we had solved the problem, but the user has again reported problems with her name in some cases.
When she searches by ORCID iD, her name is sometimes still wrong:
https://explore.openaire.eu/search/advanced/research-outcomes?f0=authorid&fv0=0000-0003-3272-8007