documented first phase of ORCID

Sandro La Bruzzo 2024-07-23 14:46:15 +02:00
parent 584abf5a42
commit 749124253d
1 changed files with 19 additions and 0 deletions

@@ -1,7 +1,26 @@
# Open Researcher and Contributor ID (ORCID)
ORCID (Open Researcher and Contributor ID) is a non-profit organization that provides a unique identifier for researchers. ORCID iDs are used to connect researchers with their contributions, such as publications, grants, and affiliations.
This document describes how to collect ORCID data from the ORCID datasource.
## Data acquisition
### Full ORCID Dump
The ORCID dump can be downloaded from the ORCID website https://support.orcid.org/hc/en-us/articles/360006897394-How-do-I-get-the-public-data-file.
The ORCID dump consists of several compressed files that need to be extracted.
These compressed files contain information on researchers in XML format. Once extracted, the XML files are parsed to populate the three tables described below.
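
As an illustration of the extract-and-parse step, here is a minimal Python sketch that reads XML members from a tar archive and turns each one into a flat row. It is not the pipeline's actual implementation. The element names (`path`, `given-names`, `family-name`), the `urn:example:*` namespaces, and the sample record are assumptions chosen for illustration; check them against the real dump before relying on them. Elements are matched by local name so that the sketch does not depend on exact namespace URIs.

```python
import io
import tarfile
import xml.etree.ElementTree as ET
from typing import Iterator, Optional

# Simplified, hypothetical record; the namespaces are placeholders.
SAMPLE_XML = """<record:record xmlns:record="urn:example:record"
    xmlns:common="urn:example:common" xmlns:person="urn:example:person"
    xmlns:pd="urn:example:personal-details">
  <common:orcid-identifier>
    <common:path>0000-0002-1825-0097</common:path>
  </common:orcid-identifier>
  <person:person>
    <person:name>
      <pd:given-names>Josiah</pd:given-names>
      <pd:family-name>Carberry</pd:family-name>
    </person:name>
  </person:person>
</record:record>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def first_text(root: ET.Element, name: str) -> Optional[str]:
    """Return the text of the first element (in document order) with this local name."""
    for element in root.iter():
        if local_name(element.tag) == name and element.text:
            return element.text.strip()
    return None


def parse_record(xml_data) -> dict:
    """Turn one XML record (str or bytes) into a flat row."""
    root = ET.fromstring(xml_data)
    return {
        "orcid": first_text(root, "path"),
        "given_names": first_text(root, "given-names"),
        "family_name": first_text(root, "family-name"),
    }


def iter_records(archive) -> Iterator[dict]:
    """Yield one row per .xml member of a tar archive, compressed or not."""
    with tarfile.open(fileobj=archive, mode="r:*") as tar:
        for member in tar:
            if member.isfile() and member.name.endswith(".xml"):
                yield parse_record(tar.extractfile(member).read())
```

Reading members straight from the archive avoids writing millions of small files to disk; for a local file, `iter_records(open(path, "rb"))` would be the equivalent call.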
### Incremental Updates
ORCID provides an API to get incremental updates. The parsed incremental data can be used to update the three tables with the latest changes.
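
One way to apply parsed incremental rows is an upsert keyed by ORCID iD, sketched below in Python. The `last_modified` field and the in-memory dictionary standing in for a table are assumptions made for illustration; the document does not specify how updates are reconciled.

```python
from typing import Dict, Iterable


def apply_updates(table: Dict[str, dict], updates: Iterable[dict]) -> Dict[str, dict]:
    """Return a new table in which each update replaces the row with the same ORCID iD.

    Assumes every row carries a hypothetical 'last_modified' ISO 8601 UTC string
    in a single fixed format, so that string comparison matches time order.
    """
    merged = dict(table)  # shallow copy: the input table is left untouched
    for row in updates:
        current = merged.get(row["orcid"])
        # Insert new rows; overwrite existing rows only with newer (or equal) data.
        if current is None or row["last_modified"] >= current["last_modified"]:
            merged[row["orcid"]] = row
    return merged
```

Comparing timestamps before overwriting keeps the result stable if an update is delivered late or more than once.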
### OpenAIRE ORCID Data model
- **Authors**: This table contains information about ORCID authors, including their ORCID iD, name, fullname, other names, employments, works, and ROAR IDs.
- **Employments**: This table contains information about the employments of ORCID authors, including their ORCID iD, organization, start date, end date, and ROAR ID.
- **Works**: This table contains information about the works of ORCID authors, including the paper PID and ORCID iD.
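
The three tables above could be modelled roughly as follows. This is a sketch only: the field names, the types, and the choice to nest employments and works inside the author are assumptions derived from the descriptions in the list, not the actual OpenAIRE schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Employment:
    orcid: str                        # ORCID iD of the employed author
    organization: str
    start_date: Optional[str] = None  # date format is an assumption (ISO 8601 string)
    end_date: Optional[str] = None    # None could mean ongoing or unknown
    roar_id: Optional[str] = None


@dataclass
class Work:
    orcid: str  # ORCID iD of the author
    pid: str    # persistent identifier of the paper


@dataclass
class Author:
    orcid: str
    name: Optional[str] = None
    fullname: Optional[str] = None
    other_names: List[str] = field(default_factory=list)
    employments: List[Employment] = field(default_factory=list)
    works: List[Work] = field(default_factory=list)
    roar_ids: List[str] = field(default_factory=list)
```

For example, `Author(orcid="0000-0002-1825-0097", fullname="Josiah Carberry")` creates an author row with empty lists for the multi-valued fields.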
## Process
The following describes the process applied to the ORCID content.