Update page 'Data provision workflow'

Claudio Atzori 2020-03-02 18:07:29 +01:00
parent b259d5be40
commit 0adb200894
1 changed files with 7 additions and 0 deletions

@ -0,0 +1,7 @@
The data provision workflow is a sequence of processing steps aimed at updating the content of the backends serving the OpenAIRE public services. Currently it covers:
* Apache Solr full-text index serving content to [explore.openaire.eu](https://explore.openaire.eu/) and to the [HTTP search API](http://api.openaire.eu/api.html)
* Databases accessed through Apache Impala for calculation of statistics over the Graph
* MongoDB NoSQL database serving content to the [OpenAIRE OAI-PMH endpoint](http://api.openaire.eu/oai_pmh)
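
As an illustration of how one of the backends listed above is consumed, the following minimal sketch builds request URLs for the OAI-PMH endpoint. The `Identify` and `ListRecords` verbs and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself; the helper name `build_oai_request` is hypothetical and not part of any OpenAIRE codebase. The sketch only constructs URLs and does not contact the service.

```python
from urllib.parse import urlencode

# Endpoint taken from the list above.
OAI_PMH_ENDPOINT = "http://api.openaire.eu/oai_pmh"


def build_oai_request(verb, **arguments):
    """Return the GET URL for an OAI-PMH request (hypothetical helper).

    `verb` is one of the protocol verbs (e.g. "Identify", "ListRecords");
    `arguments` are the verb-specific arguments defined by OAI-PMH.
    """
    params = {"verb": verb}
    params.update(arguments)
    return f"{OAI_PMH_ENDPOINT}?{urlencode(params)}"


# Ask the repository to describe itself.
identify_url = build_oai_request("Identify")

# Harvest records in Dublin Core, the format every OAI-PMH repository must support.
list_url = build_oai_request("ListRecords", metadataPrefix="oai_dc")
```

The resulting URLs can be passed to any HTTP client; which sets and additional metadata formats the endpoint exposes is not stated here and would need to be checked against the service.
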
This document provides a coarse-grained description of the data provision workflow, which is composed of several data movement and manipulation steps:
* Load data from the aggregator and map it to the OCEAN cluster according to the dhp.Oaf model. The procedure freezes the content stored in the aggregation system backends to produce the so-called raw graph **G_r**