[Orcid Enrichment] Mixing first part oa alternative and this one. Remove As you can see from the last phrase in the text

Miriam Baglioni 2024-07-26 12:39:17 +02:00
parent 891c66a9db
commit da4568b7c9
1 changed files with 14 additions and 12 deletions


@ -1,17 +1,17 @@
# Enrichment from ORCID
OpenAIRE collects the ORCID dataset and exploits it to enrich the metadata of the results by adding the persistent
identifier to the authors.
OpenAIRE enhances publication metadata by incorporating author information from ORCID. This involves adding persistent
identifiers to authors and leveraging ORCID data to improve author disambiguation.
## How does the enrichment work?
The following steps describe the pipeline to enrich the author information in the graph by including the orcid identifiers from ORCID.
The following steps outline how ORCID information is integrated into the OpenAIRE Graph:
### Extracting Author and Work Information and creating ORCID-Work pairs
OpenAIRE extracts the following information from each ORCID profile:
- Author information: ORCID, family name, given name, other names, and credit name.
- Work information: Persistent identifiers (DOI, PMC, PMID, arXiv, handle) associated with the profile.
OpenAIRE extracts the following from ORCID profiles:
* Author information: ORCID, family name, given name, other names, credit name
* Work information: Persistent identifiers (DOI, PMC, PMID, arXiv, handle)
For each work identified by a persistent identifier (PID), a pair is created linking the ORCID to the work PID. For
example, if an ORCID profile (orcid1) has a DOI (doi1) and a PMC (pmc1) associated with it, the following pairs are generated:
@ -19,12 +19,14 @@ example, if an ORCID profile (orcid1) has a DOI (doi1) and a PMC (pmc1) associat
- P2: <orcid1, pmc1>
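
The pair-creation step above can be sketched roughly as follows. This is only an illustration, not the OpenAIRE implementation: the `Pid` and `OrcidAuthor` types, the `make_pairs` helper, and all identifier values are hypothetical names chosen for the example.

```python
from typing import List, NamedTuple, Tuple


class Pid(NamedTuple):
    """Persistent identifier of a work (hypothetical structure)."""
    schema: str  # e.g. "doi", "pmc", "pmid", "arxiv", "handle"
    value: str


class OrcidAuthor(NamedTuple):
    """Author information taken from an ORCID profile (hypothetical structure)."""
    orcid: str
    family_name: str
    given_name: str


def make_pairs(author: OrcidAuthor, work_pids: List[Pid]) -> List[Tuple[OrcidAuthor, Pid]]:
    """Create one <orcid, pid> pair for every work PID found on the profile."""
    return [(author, pid) for pid in work_pids]


# Placeholder values, not real identifiers.
orcid1 = OrcidAuthor("0000-0000-0000-0001", "Rossi", "Mario")
doi1 = Pid("doi", "10.1234/example")
pmc1 = Pid("pmc", "PMC0000001")

pairs = make_pairs(orcid1, [doi1, pmc1])
for author, pid in pairs:
    print(author.orcid, pid.schema, pid.value)
```

A profile with two PIDs yields two pairs, matching P1 and P2 in the example above.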
### Grouping by work persistent identifier
Once all ORCID-Work pairs are created, they are grouped by the work's persistent identifier. This allows identification
of multiple authors contributing to the same work. For instance, if two ORCIDs (orcid1 and orcid2) are associated with
the same DOI (doi1), the structure <doi1, [orcid1, orcid2]> is created
ORCID-Work pairs are grouped by the work's persistent identifier to identify multiple authors contributing to the same work.
Two ORCIDs (orcid1 and orcid2) associated with the same DOI (doi1) result in a structure like:
* `<doi1, [orcid1, orcid2]>`
Note: The term "orcidx" refers to a structure containing the ORCID identifier along with the author's name information
(family name, given name, other names, and credit name) as extracted from the ORCID profile. The term "doix" refers to a structure
**Note:**
* The term "orcidx" refers to a structure containing the ORCID identifier along with the author's name information
(family name, given name, other names, and credit name) as extracted from the ORCID profile.
* The term "doix" refers to a structure
containing the schema and value of the persistent identifier. In the example, "doix" is <"doi","10....">
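
As a rough sketch of the grouping step, the pairs can be collected into a dictionary keyed by the work PID. The `group_by_pid` helper and the identifier values below are assumptions made for illustration; the actual pipeline may perform the grouping differently (for example, as a distributed job).

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A PID is modelled here as a (schema, value) tuple; an author is reduced to a
# plain string label. Both are simplifications made for this example.
Pid = Tuple[str, str]


def group_by_pid(pairs: List[Tuple[str, Pid]]) -> Dict[Pid, List[str]]:
    """Group <orcid, pid> pairs into <pid, [orcid, ...]> structures."""
    grouped: Dict[Pid, List[str]] = defaultdict(list)
    for orcid, pid in pairs:
        grouped[pid].append(orcid)
    return dict(grouped)


# Placeholder values, not real identifiers.
doi1 = ("doi", "10.1234/example")
pmc1 = ("pmc", "PMC0000001")

pairs = [("orcid1", doi1), ("orcid1", pmc1), ("orcid2", doi1)]
grouped = group_by_pid(pairs)

for pid, orcids in grouped.items():
    print(pid, orcids)
```

Here `doi1` ends up with both `orcid1` and `orcid2`, while `pmc1` is linked only to `orcid1`.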
### Matching with the Graph result and enriching the author metadata
@ -130,5 +132,5 @@ Example:
graph = Mario Enrico Rossi, Mario Rossi
ORCID = Mario Rossi
As you can see applying only the third strategy, we would associate Mario Rossi's ORCID to Mario Enrico Rossi if this one was first in the author list.
Applying only the third strategy, we would associate Mario Rossi's ORCID to Mario Enrico Rossi if this one was first in the author list.