# Finalisation
At the very end of the graph production workflow, a dedicated step performs the finalisation operations described on this page,
which aim to improve the overall quality of the data.
The output of this final step is the final version of the OpenAIRE Research Graph.
## Filtering
Bibliographic records that do not meet minimal requirements for being part of the OpenAIRE Research Graph are eliminated during this phase.
Currently, the only criterion applied horizontally to the entire graph aims to exclude scientific results whose title is not meaningful for citation purposes.
In addition, different criteria are applied in the pre-processing of specific sub-collections:
* [Crossref filtering](/data-provision/aggregation/non-compatible-sources/doiboost#crossref-filtering)
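
As an illustration of the graph-wide title criterion described above, the following is a minimal Python sketch and not the actual OpenAIRE implementation. The record shape (a dictionary with a `title` key), the `PLACEHOLDER_TITLES` set, the `min_alnum` threshold and both function names are assumptions made for this example; the real rules for deciding that a title is not meaningful are not specified on this page.

```python
import re

# Hypothetical placeholder titles; the real exclusion rules are not given on this page.
PLACEHOLDER_TITLES = {"untitled", "no title", "n/a", "unknown", "test"}


def has_meaningful_title(title, min_alnum=3):
    """Return True if the title looks usable for citation (illustrative heuristic only)."""
    if title is None:
        return False
    # Collapse whitespace and lowercase before comparing with the placeholder set.
    normalised = re.sub(r"\s+", " ", title).strip().lower()
    if normalised in PLACEHOLDER_TITLES:
        return False
    # Require a minimum number of letters or digits (assumed threshold).
    alnum = sum(ch.isalnum() for ch in normalised)
    return alnum >= min_alnum


def filter_results(results):
    """Keep only the records whose title passes the heuristic above."""
    return [r for r in results if has_meaningful_title(r.get("title"))]
```

For example, `filter_results` applied to records titled `"Deep learning for graphs"`, `"  Untitled "` and `"???"` would keep only the first one under these assumed rules.
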
## Country cleaning
This phase removes the country information from result records that match specific criteria. It is needed because some datasources, although classified as being of national pertinence, contain material that is not always related to the given country.
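
The general shape of such a cleaning step could be sketched as follows in Python. This is a hypothetical illustration only: the record layout (a `countries` list whose entries carry a `code` and the `source` datasource that contributed them), the `national_sources` set and the `matches_criteria` predicate are all assumptions, since the actual criteria are not detailed on this page.

```python
def clean_country(result, national_sources, matches_criteria):
    """Return a copy of `result` without the countries contributed by national datasources.

    All parameter names and the record layout are hypothetical:
    - national_sources: set of datasource identifiers flagged as of national pertinence
    - matches_criteria: predicate deciding whether a record is subject to cleaning
    """
    if not matches_criteria(result):
        # Records that do not match the criteria are left untouched.
        return result
    kept = [
        country
        for country in result.get("countries", [])
        if country.get("source") not in national_sources
    ]
    cleaned = dict(result)  # shallow copy, so the input record is not mutated
    cleaned["countries"] = kept
    return cleaned
```

Passing the criteria as a predicate keeps the removal logic separate from the decision of which records qualify, so the latter can change without touching the former.
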