Open Researcher and Contributor ID (ORCID)
ORCID (Open Researcher and Contributor ID) is a non-profit organization that provides a unique identifier for researchers. ORCID iDs are used to connect researchers with their contributions, such as publications, grants, and affiliations.
This document describes how OpenAIRE collects information about researcher profiles and their works from ORCID.
Data acquisition
The full ORCID dataset is publicly available for download from Figshare and is described on the ORCID website. This dataset served as the initial import; to keep up with subsequent changes, a scheduled process regularly retrieves the delta.
The ORCID dataset consists of several compressed files containing information about researchers in XML format. Once the files are uncompressed, the information extracted from the XML records is used to populate the three tables described below.
ORCID also provides an API for retrieving incremental updates; the parsed incremental data is used to update the three tables with the latest changes.
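The extraction step can be pictured with a minimal Python sketch. It assumes a heavily trimmed record whose element names and namespaces follow the ORCID record schema as I understand it; they should be checked against the published schema before use. The helper names (`local_name`, `first_text`, `parse_author`) are hypothetical and this is not OpenAIRE's actual parser.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative record. Element names and namespaces are assumptions
# based on the ORCID record schema; real records carry many more sections.
SAMPLE_RECORD = """
<record:record xmlns:record="http://www.orcid.org/ns/record"
               xmlns:common="http://www.orcid.org/ns/common"
               xmlns:person="http://www.orcid.org/ns/person"
               xmlns:personal-details="http://www.orcid.org/ns/personal-details">
  <common:orcid-identifier>
    <common:path>0000-0002-1825-0097</common:path>
  </common:orcid-identifier>
  <person:person>
    <person:name>
      <personal-details:given-names>Josiah</personal-details:given-names>
      <personal-details:family-name>Carberry</personal-details:family-name>
    </person:name>
  </person:person>
</record:record>
""".strip()


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def first_text(root, name):
    """Return the text of the first element (document order) with this local name."""
    for element in root.iter():
        if local_name(element.tag) == name:
            return (element.text or "").strip() or None
    return None


def parse_author(xml_text):
    """Map one XML record onto a few columns of the Authors table."""
    root = ET.fromstring(xml_text)
    return {
        "orcid": first_text(root, "path"),
        "givenName": first_text(root, "given-names"),
        "familyName": first_text(root, "family-name"),
    }


author = parse_author(SAMPLE_RECORD)
```

Matching on local names keeps the sketch short; a production parser would more likely resolve each element through its full namespace so that unrelated elements sharing a local name are not picked up by mistake.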
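As an illustration of how a parsed delta might be folded into an existing table, here is a minimal sketch in Python. It assumes rows are dictionaries keyed by `orcid` and that `lastModifiedDate` strings share one ISO 8601 format, so that plain string comparison orders them correctly. The function name `apply_delta` is hypothetical, and the actual OpenAIRE update job may work differently.

```python
def apply_delta(table, delta):
    """Return a copy of `table` with the rows from `delta` upserted.

    `table` maps an ORCID iD to its row; `delta` is a list of rows.
    A delta row replaces the stored row only if it is at least as recent.
    Assumption: every `lastModifiedDate` uses the same ISO 8601 layout,
    so lexicographic comparison matches chronological order.
    """
    merged = dict(table)  # shallow copy; the input table is left untouched
    for row in delta:
        current = merged.get(row["orcid"])
        if current is None or row["lastModifiedDate"] >= current["lastModifiedDate"]:
            merged[row["orcid"]] = row
    return merged
```

Keying on `orcid` suits the Authors table, which has one row per researcher. The Employments and Works tables hold several rows per researcher, so a comparable update would more plausibly replace all rows belonging to each updated `orcid`.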
OpenAIRE ORCID Data model
- Authors: This table contains information about ORCID authors, including their ORCID ID, name, fullname, other names, employments, works, and ROAR IDs.
- Employments: This table contains information about the employments of ORCID authors, including their ORCID ID, organization, start date, end date, and ROAR ID.
- Works: This table contains information about the works of ORCID authors, including the paper PID and ORCID ID.
Authors
Column name | Type |
---|---|
biography | string |
creditName | string |
familyName | string |
givenName | string |
orcid | string |
otherNames | array[string] |
otherPids | array[struct[schema:string, value:string]] |
visibility | string |
lastModifiedDate | string |
Employments
Column name | Type |
---|---|
affiliationId | struct[schema:string, value:string] |
departmentName | string |
endDate | string |
orcid | string |
roleTitle | string |
startDate | string |
Works
Column name | Type |
---|---|
orcid | string |
pids | array[struct[schema:string, value:string]] |
title | string |
For a more extensive description of the different fields and the schema of the record model please refer to the ORCID project on GitHub.
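The three tables above can be mirrored as Python dataclasses, which may help when reading or validating rows. Field names and types are taken from the tables; the class names, the `Pid` helper for `struct[schema, value]`, and the choice to make every field except `orcid` optional are assumptions of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pid:
    """A struct[schema:string, value:string] pair, e.g. schema 'doi'."""
    schema: str
    value: str


@dataclass
class Author:
    orcid: str
    givenName: Optional[str] = None
    familyName: Optional[str] = None
    creditName: Optional[str] = None
    biography: Optional[str] = None
    otherNames: List[str] = field(default_factory=list)
    otherPids: List[Pid] = field(default_factory=list)
    visibility: Optional[str] = None
    lastModifiedDate: Optional[str] = None


@dataclass
class Employment:
    orcid: str
    affiliationId: Optional[Pid] = None
    departmentName: Optional[str] = None
    roleTitle: Optional[str] = None
    startDate: Optional[str] = None
    endDate: Optional[str] = None


@dataclass
class Work:
    orcid: str
    title: Optional[str] = None
    pids: List[Pid] = field(default_factory=list)
```

All three tables carry an `orcid` column, so it acts as the natural join key between an author, their employments, and their works.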
Process
The information obtained from ORCID is used to enrich the Graph, in particular by adding author identifiers to results whose authors lack one. This process is described in the enrichment by PID section.
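The general idea can be sketched in Python under several stated assumptions. Results are hypothetical dictionaries with `pids` and `authors` keys, a result is linked to ORCID works through a shared PID, and an author is matched by an exact comparison of normalized names. The real matching logic is the one documented in the enrichment by PID section; the functions `normalize`, `same_person`, and `enrich_results` are illustrative only.

```python
def normalize(text):
    """Case-fold and collapse whitespace so name comparison is lenient."""
    return " ".join((text or "").casefold().split())


def same_person(result_author, profile):
    """Assumed matching rule: identical normalized family and given names."""
    family = normalize(result_author.get("familyName"))
    given = normalize(result_author.get("givenName"))
    return (
        bool(family)
        and family == normalize(profile.get("familyName"))
        and given == normalize(profile.get("givenName"))
    )


def enrich_results(results, works, authors):
    """Attach an ORCID iD to result authors that do not have one (in place)."""
    # Index ORCID iDs by the PIDs of the works they claim.
    orcids_by_pid = {}
    for work in works:
        for pid in work["pids"]:
            key = (pid["schema"].lower(), pid["value"].lower())
            orcids_by_pid.setdefault(key, set()).add(work["orcid"])

    profiles = {author["orcid"]: author for author in authors}

    for result in results:
        # Collect every ORCID iD that claims a work sharing a PID with this result.
        candidates = set()
        for pid in result["pids"]:
            key = (pid["schema"].lower(), pid["value"].lower())
            candidates |= orcids_by_pid.get(key, set())

        for result_author in result["authors"]:
            if result_author.get("orcid"):
                continue  # never overwrite an identifier the result already provides
            for orcid in sorted(candidates):
                profile = profiles.get(orcid)
                if profile and same_person(result_author, profile):
                    result_author["orcid"] = orcid
                    break
    return results
```

Restricting candidates to ORCID profiles that claim the same PID keeps name matching local to one publication, which limits the risk of confusing homonyms across the whole dataset.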