From 2f3e832d4d3c3af155c7bf8490d4b001202dc93f Mon Sep 17 00:00:00 2001
From: "miriam.baglioni"
Date: Fri, 18 Nov 2022 17:53:03 +0100
Subject: [PATCH] [Bulk Download] first version of the documentation

---
 docs/download.md | 69 +++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 65 insertions(+), 4 deletions(-)

diff --git a/docs/download.md b/docs/download.md
index 6a6e6a8..99ae077 100644
--- a/docs/download.md
+++ b/docs/download.md
@@ -4,11 +4,72 @@ sidebar_position: 4
 
 # Bulk downloads
 
-In order to facilitate users, different dumps are available. All are available under the Zenodo community called [OpenAIRE Research Graph](https://zenodo.org/communities/openaire-research-graph).
-Here we provide detailed documentation about the full dump:
+To facilitate access, different dumps are available under the Zenodo community called [OpenAIRE Research Graph](https://zenodo.org/communities/openaire-research-graph).
+The following dumps are available:
 
-* JSON dump: https://doi.org/10.5281/zenodo.3516917
-* JSON schema: https://doi.org/10.5281/zenodo.4238938
+* The whole OpenAIRE Research Graph Dump
+
+ Dataset: https://doi.org/10.5281/zenodo.3516917
+
+ Schema: https://doi.org/10.5281/zenodo.4238938
+
+ This dataset is licensed under a Creative Commons Attribution 4.0 International License.
+It is composed of several files so that you can download the parts you are interested in. The files are named after the entity they store (e.g. publication, dataset). Each file is at most 10GB and is
+a tar archive containing gz files, each with one JSON record per line.
+
+* The OpenAIRE COVID-19 dump
+
+ Dataset: https://doi.org/10.5281/zenodo.6638745
+
+ Schema: https://doi.org/10.5281/zenodo.6372977
+
+ This dataset is licensed under a Creative Commons Attribution 4.0 International License.
+ It contains metadata records of publications, research data, software and projects on the topic of coronavirus and COVID-19.
+This dump is part of the activities of OpenAIRE to support the fight against COVID-19 together with the OpenAIRE COVID-19 Gateway.
+The dump consists of a tar archive containing gzip files, each with one JSON record per line.
+
+* The dump of funded products
+
+ Dataset: https://doi.org/10.5281/zenodo.6634431
+
+ Schema: https://doi.org/10.5281/zenodo.6372977
+
+ This dataset is licensed under a Creative Commons Attribution 4.0 International License.
+It contains metadata records of research products (research literature, data, software, other types of research products) with funding
+information available in the OpenAIRE Graph. Records are grouped by funder in a dedicated archive file. Each tar archive contains
+gzip files, each with one JSON record per line.
+
+* The dump of delta projects
+
+ Dataset: https://doi.org/10.5281/zenodo.7119633
+
+ Schema: https://doi.org/10.5281/zenodo.5799514
+
+ This dataset is licensed under a Creative Commons Attribution 4.0 International License.
+ It contains the metadata records of projects collected by OpenAIRE in a given time frame. Usually one deposition of collected projects is done for each release of the OpenAIRE Graph.
+ The deposition is one tar archive containing gzip files, each with one JSON record per line.
+
+* The dumps about research communities, initiatives and infrastructures
+
+ Dataset: https://doi.org/10.5281/zenodo.6638478
+
+ Schema: https://doi.org/10.5281/zenodo.6372977
+
+ This dataset is licensed under a Creative Commons Attribution 4.0 International License.
+The dataset contains one file per community/initiative/infrastructure collaborating with OpenAIRE. See also their community gateways on
+ CONNECT. Each file is a tar archive containing gzip files, each with one JSON record per line. Only the communities, research initiatives and infrastructures visible to everyone are dumped.
+
+* The dump of ScholeXplorer
+
+ Dataset: https://doi.org/10.5281/zenodo.6338616
+
+ Schema (Scholix version 3): https://doi.org/10.5281/zenodo.1120275
+
+ Schema (Scholix version 4): https://doi.org/10.5281/zenodo.6351557
+
+ This dataset is licensed under a CC0 1.0 Universal (CC0 1.0) Public Domain Dedication.
+The dataset contains the GZ-compressed dump of the Scholix links exposed by the OpenAIRE ScholeXplorer service.
+
 
 :::note Tip!