diff --git a/docs/assets/openaire-red-badge.png b/docs/assets/openaire-red-badge.png new file mode 100644 index 0000000..e346dca Binary files /dev/null and b/docs/assets/openaire-red-badge.png differ diff --git a/docs/data-model/data-model.md b/docs/data-model/data-model.md index ae9d453..50a6a52 100644 --- a/docs/data-model/data-model.md +++ b/docs/data-model/data-model.md @@ -2,7 +2,7 @@ The OpenAIRE Graph comprises several types of [entities](../category/entities) and [relationships](./relationships) among them. -The latest version of the JSON schema can be found on [Bulk downloads](../download). +The latest version of the JSON schema can be found in the [Downloads](../downloads/full-graph) section.

Data model diff --git a/docs/download.md b/docs/download.md deleted file mode 100644 index 6a6e6a8..0000000 --- a/docs/download.md +++ /dev/null @@ -1,17 +0,0 @@ ---- -sidebar_position: 4 ---- - -# Bulk downloads - -In order to facilitate users, different dumps are available. All are available under the Zenodo community called [OpenAIRE Research Graph](https://zenodo.org/communities/openaire-research-graph). -Here we provide detailed documentation about the full dump: - -* JSON dump: https://doi.org/10.5281/zenodo.3516917 -* JSON schema: https://doi.org/10.5281/zenodo.4238938 - -:::note Tip! - -For a visual and interactive overview of the JSON schema, we suggest to use a JSON schema viewer like [jsonschemaviewer](https://navneethg.github.io/jsonschemaviewer/) (you just need to copy the schema and then you can easily navigate through the nodes). - -::: diff --git a/docs/downloads/alternative-model/cfhb.md b/docs/downloads/alternative-model/cfhb.md new file mode 100644 index 0000000..db13233 --- /dev/null +++ b/docs/downloads/alternative-model/cfhb.md @@ -0,0 +1,29 @@ +--- + +sidebar_position: 1 + +--- + +# CfHbKeyValue + +Information about the sources from which the record has been collected. + + +### key +_Type: String • Cardinality: ONE_ + +The OpenAIRE identifier of the data source. + +```json +"key":"10|openaire____::081b82f96300b6a6e3d282bad31cb6e2" +``` + +### value +_Type: String • Cardinality: ONE_ + +The name of the data source. 
+ +```json +"value":"Crossref" +``` + diff --git a/docs/downloads/alternative-model/communityInstance.md b/docs/downloads/alternative-model/communityInstance.md new file mode 100644 index 0000000..0ec83ca --- /dev/null +++ b/docs/downloads/alternative-model/communityInstance.md @@ -0,0 +1,37 @@ +--- + +sidebar_position: 1 + +--- + +# CommunityInstance + +It is a subclass of [Instance](../../data-model/entities/result#instance) extended with information regarding the collection and hosting source for this materialization of the result. + +### hostedby +_Type: [CfHbKeyValue](./cfhb) • Cardinality: ONE_ + +Information about the source from which the instance can be viewed or downloaded. + +```json + +"hostedby": { + "key": "10|issn___print::35ee75a5ad42581d604be113a8f56427", + "value": "New Phytologist" + }, + +``` + +### collectedfrom +_Type: [CfHbKeyValue](./cfhb) • Cardinality: ONE_ + +Information about the source from which the record has been collected. + + +```json + +"collectedfrom": { + "key": "10|openaire____::081b82f96300b6a6e3d282bad31cb6e2", + "value": "Crossref" + } +``` \ No newline at end of file diff --git a/docs/downloads/alternative-model/context.md b/docs/downloads/alternative-model/context.md new file mode 100644 index 0000000..2684d85 --- /dev/null +++ b/docs/downloads/alternative-model/context.md @@ -0,0 +1,46 @@ +--- + +sidebar_position: 1 + +--- + +# Context + +Information about the research initiative/community (RI/RC) related to the result. + +### code +_Type: String • Cardinality: ONE_ + +Code identifying the RI/RC. + +```json +"code":"sdsn-gr" + +``` + + +### label +_Type: String • Cardinality: ONE_ + +Label of the RI/RC. + +```json +"label":"SDSN - Greece" +``` + +### provenance +_Type: [Provenance](../../../data-model/entities/other#provenance-2) • Cardinality: MANY_ + +The reasons why this result is associated with the RI/RC. + +```json + +"provenance":[{ + "provenance":"Inferred by OpenAIRE", + "trust":"0.9" + }, + ... 
+ ] + +``` + diff --git a/docs/downloads/alternative-model/extendedresult.md b/docs/downloads/alternative-model/extendedresult.md new file mode 100644 index 0000000..d9e9b0c --- /dev/null +++ b/docs/downloads/alternative-model/extendedresult.md @@ -0,0 +1,141 @@ +--- + +sidebar_position: 1 + +--- + + +# Extended Result + + +It is a subclass of [Result](../../../data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructures and related data sources. + + + +### projects + +_Type: [Project](project.md) • Cardinality: MANY_ + + +List of projects (i.e. grants) that (co-)funded the production of the research results. + + +```json + + +"projects": [ + { + "id": "40|corda__h2020::94c4a066401e22002c4811a301bb4655", + "code": "727929", + "acronym": "TomRes", + "title": "A NOVEL AND INTEGRATED APPROACH TO INCREASE MULTIPLE AND COMBINED STRESS TOLERANCE IN PLANTS USING TOMATO AS A MODEL", + "funder": { + "shortName": "EC", + "name": "European Commission", + "jurisdiction": "EU", + "fundingStream": "H2020" + }, + "provenance": { + "provenance": "Harvested", + "trust": "0.900000000000000022" + }, + "validated": { + "validationDate": "2021-0101", + "validatedByFunder": true + } + }, + ... + ] + +``` + +### context + +_Type: [Context](./context) • Cardinality: MANY_ + + +References to the relevant research infrastructures, initiatives or communities (RI/RC) among those collaborating with OpenAIRE. See https://connect.openaire.eu for those that are publicly visible. + + +```json + + +"context":[ + { + "code":"sdsn-gr", + "label":"SDSN - Greece", + "provenance":[ + { + "provenance":"Inferred by OpenAIRE", + "trust":"0.9" + } + ] + }, + ... + ] + +``` + + + +### collectedfrom + +_Type: [CfHbKeyValue](./cfhb) • Cardinality: MANY_ + + +Information about the sources from which the record has been collected. + + +```json + +"collectedfrom":[ + { + "key":"10|openaire____::081b82f96300b6a6e3d282bad31cb6e2", + "value":"Crossref" + }, + ... 
+ ] + +``` + + +### instance + +_Type: [CommunityInstance](./communityInstance) • Cardinality: MANY_ + +Information about the source from which the instance can be viewed or downloaded. + +```json + + +"instance": [ + { + "license": "http://doi.wiley.com/10.1002/tdm_license_1.1", + "accessright": { + "code": "c_16ec", + "label": "RESTRICTED", + "scheme": "http://vocabularies.coar-repositories.org/documentation/access_rights/", + "openAccessRoute": null + }, + "type": "Article", + "url": [ + "https://api.wiley.com/onlinelibrary/tdm/v1/articles/10.1111%2Fnph.15014", + "http://onlinelibrary.wiley.com/wol1/doi/10.1111/nph.15014/fullpdf", + "http://dx.doi.org/10.1111/nph.15014" + ], + "publicationdate": "2018-02-09", + "refereed": "UNKNOWN", + "hostedby": { + "key": "10|issn___print::35ee75a5ad42581d604be113a8f56427", + "value": "New Phytologist" + }, + "collectedfrom": { + "key": "10|openaire____::081b82f96300b6a6e3d282bad31cb6e2", + "value": "Crossref" + } + }, + ... + ] + + +``` diff --git a/docs/downloads/alternative-model/funder.md b/docs/downloads/alternative-model/funder.md new file mode 100644 index 0000000..1da93a9 --- /dev/null +++ b/docs/downloads/alternative-model/funder.md @@ -0,0 +1,72 @@ +--- + +sidebar_position: 1 + +--- + +# Funder + + +Information about the funder of the project. + + +### fundingStream + +_Type: String • Cardinality: ONE_ + + +Funding information for the project. + + +```json + +"fundingStream": "H2020" + + +``` + +### jurisdiction + +_Type: String • Cardinality: ONE_ + + +Geographical jurisdiction (e.g. EU for the European Commission, HR for the Croatian Science Foundation). + + +```json + +"jurisdiction": "EU" + +``` + + +### name + +_Type: String • Cardinality: ONE_ + + +The name of the funder. + + +```json + +"name": "European Commission" + +``` + + +### shortName + +_Type: String • Cardinality: ONE_ + + +The short name of the funder. 
+ + +```json + +"shortName": "EC" + +``` + + diff --git a/docs/downloads/alternative-model/project.md b/docs/downloads/alternative-model/project.md new file mode 100644 index 0000000..774b487 --- /dev/null +++ b/docs/downloads/alternative-model/project.md @@ -0,0 +1,134 @@ +--- + +sidebar_position: 1 + +--- + + + +# Project + + +Information about the projects related to the result. + + +### id + +_Type: String • Cardinality: ONE_ + + +Main entity identifier, created according to the [OpenAIRE entity identifier and PID mapping policy](../../data-model/pids-and-identifiers). + + +```json + +"id": "40|corda__h2020::70ea22400fd890c5033cb31642c4ae68" + +``` + + +### code + +_Type: String • Cardinality: ONE_ + + +The grant agreement code of the project. + + +```json + +"code": "777541" + +``` + + +### acronym + +_Type: String • Cardinality: ONE_ + + +Project's acronym. + + +```json + +"acronym": "OpenAIRE-Advance" + +``` + + +### title + +_Type: String • Cardinality: ONE_ + + +Project's title. + + +```json + +"title": "OpenAIRE Advancing Open Scholarship" + +``` + + +### funder + +_Type: [Funder](funder.md) • Cardinality: ONE_ + + +Information about the funder of the project. + + +```json + + +"funder": { + "shortName": "EC", + "name": "European Commission", + "jurisdiction": "EU", + "fundingStream": "H2020" + } + + +``` + +### provenance + + +_Type: [Provenance](../../data-model/entities/other#provenance-2) • Cardinality: ONE_ + + +The reason why the project is associated with the result. + + +```json + + +"provenance": { + "provenance": "Harvested", + "trust": "0.900000000000000022" + } + +``` + + +### validated + + +_Type: [Validated](validated.md) • Cardinality: ONE_ + + +Specifies whether the association between the project and the result was validated. 
+ + +```json + + +"validated": { + "validationDate": "2021-0101", + "validatedByFunder": true + } + +``` + diff --git a/docs/downloads/alternative-model/validated.md b/docs/downloads/alternative-model/validated.md new file mode 100644 index 0000000..e92b2c9 --- /dev/null +++ b/docs/downloads/alternative-model/validated.md @@ -0,0 +1,41 @@ +--- + +sidebar_position: 1 + +--- + +# Validated + + +Information about the validation of the association between the result and the funding information. + + +### validationDate + +_Type: String • Cardinality: ONE_ + + +The date when OpenAIRE collected the association between the funding and the result from an authoritative source (i.e. Sygma). + + +```json + +"validationDate": "2021-0101" + +``` + + +### validatedByFunder + +_Type: Boolean • Cardinality: ONE_ + + +Specifies whether the validation comes from the funder. + + +```json + + +"validatedByFunder": true + +``` \ No newline at end of file diff --git a/docs/downloads/beginners-kit.md b/docs/downloads/beginners-kit.md new file mode 100644 index 0000000..5bb8548 --- /dev/null +++ b/docs/downloads/beginners-kit.md @@ -0,0 +1,6 @@ +--- +sidebar_position: 2 +--- + +# Beginners kit + diff --git a/docs/downloads/full-graph.md b/docs/downloads/full-graph.md new file mode 100644 index 0000000..870629b --- /dev/null +++ b/docs/downloads/full-graph.md @@ -0,0 +1,35 @@ +--- +sidebar_position: 1 +--- + +# Full graph dump + +You can download the full OpenAIRE Research Graph Dump as well as its schema from the following links: + + Dataset: https://doi.org/10.5281/zenodo.3516917 + + Schema: https://doi.org/10.5281/zenodo.4238938 + +The schema used to dump this dataset mirrors the one described in the [Data Model](../data-model). +This dataset is licensed under a Creative Commons Attribution 4.0 International License. +It is composed of several files so that you can download the parts you are interested in. The files are named after the entity they store (e.g. publication, dataset). 
Each file is at most 10GB and is +a tar archive containing gz files, each with one JSON record per line. + +## How to acknowledge this work + +Open Science services are open and transparent and survive thanks to your active support and to the visibility and reward they gather. If you use one of the [OpenAIRE Graph dumps](https://doi.org/10.5281/zenodo.3516917) for your research, please provide a proper citation following the recommendation on the dump's Zenodo page or the one provided below. + +:::note How to cite + +Manghi P., Atzori C., Bardi A., Baglioni M., Schirrwagen J., Dimitropoulos H., La Bruzzo S., Foufoulas I., Mannocci A., Horst M., Czerniak A., Kiatropoulou K., Kokogiannaki A., De Bonis M., Artini M., Ottonello E., Lempesis A., Ioannidis A., Manola N., Principe P. (2022). "OpenAIRE Research Graph Dump", *Dataset*, Zenodo. [doi:10.5281/zenodo.3516917](https://doi.org/10.5281/zenodo.3516917) ([BibTeX](/bibtex/OpenAIRE_Research_Graph_dump.bib)) +::: + +Please also consider citing [other relevant research products](/publications#relevant-research-products) that can be of interest. + +Also consider adding one of the following badges to your service with the appropriate link to [our website](https://graph.openaire.eu): + +

+ + Openaire badge + +
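Reviewer note on the archive layout described in `full-graph.md` (tar archives containing gz files, each with one JSON record per line): the following is a minimal Python sketch of how such a file could be read in a streaming fashion. It is not part of this change. The function name `iter_dump_records` and the file name in the usage line are hypothetical, and the sketch assumes that the archive members of interest end in `.gz` and are UTF-8 encoded.

```python
import gzip
import json
import tarfile
from typing import Any, Dict, Iterator


def iter_dump_records(tar_path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON record per non-empty line of every .gz member.

    Assumption: the tar archive holds gzip members with one JSON object per line,
    as stated in the dump description. Members with other suffixes are skipped.
    """
    with tarfile.open(tar_path, mode="r") as archive:
        for member in archive:
            if not member.isfile() or not member.name.endswith(".gz"):
                continue
            raw = archive.extractfile(member)
            if raw is None:  # not a regular file
                continue
            # gzip.open accepts an existing binary file object.
            with gzip.open(raw, mode="rt", encoding="utf-8") as lines:
                for line in lines:
                    line = line.strip()
                    if line:
                        yield json.loads(line)
```

A possible usage, with a hypothetical local file name: `for record in iter_dump_records("publication_1.tar"): print(record.get("id"))`. Reading lazily keeps memory use low, which matters given that each file can be up to 10GB.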

diff --git a/docs/downloads/related-datasets.md b/docs/downloads/related-datasets.md new file mode 100644 index 0000000..93f112c --- /dev/null +++ b/docs/downloads/related-datasets.md @@ -0,0 +1,18 @@ +--- +sidebar_position: 4 +--- + +# Other related datasets + +On this page, we list other related datasets; please refer to their respective schema definitions for the data model they follow. + +## The dump of ScholeXplorer + + Dataset: https://doi.org/10.5281/zenodo.6338616 + + Schema (Scholix version 3): https://doi.org/10.5281/zenodo.1120275 + + Schema (Scholix version 4): https://doi.org/10.5281/zenodo.6351557 + +This dataset is licensed under a CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. +The dataset contains the GZ-compressed dump of the Scholix links exposed by the OpenAIRE ScholeXplorer service. \ No newline at end of file diff --git a/docs/downloads/subgraphs.md b/docs/downloads/subgraphs.md new file mode 100644 index 0000000..cf5aeb4 --- /dev/null +++ b/docs/downloads/subgraphs.md @@ -0,0 +1,68 @@ +--- +sidebar_position: 3 +--- + +# Sub-graph dumps + +To serve different use cases, several dumps are available under the Zenodo community called [OpenAIRE Research Graph](https://zenodo.org/communities/openaire-research-graph). +This page lists all alternative dumps currently available. + + +## The OpenAIRE COVID-19 dump + + Dataset: https://doi.org/10.5281/zenodo.6638745 + + Schema: https://doi.org/10.5281/zenodo.6372977 + + This dataset is licensed under a Creative Commons Attribution 4.0 International License. + It contains metadata records of publications, research data, software and projects on the topic of coronavirus and COVID-19. +This dump is part of the activities of OpenAIRE to support the fight against COVID-19 together with the OpenAIRE COVID-19 Gateway. +The dump consists of a tar archive containing gzip files with one JSON record per line. See the [alternative sub-graph data model](#alternative-sub-graph-data-model) section for details on the data model of this dump. 
+ +## The dump of funded products + + Dataset: https://doi.org/10.5281/zenodo.6634431 + + Schema: https://doi.org/10.5281/zenodo.6372977 + + This dataset is licensed under a Creative Commons Attribution 4.0 International License. +It contains metadata records of research products (research literature, data, software, other types of research products) with funding +information available in the OpenAIRE Graph. Records are grouped by funder in a dedicated archive file. Each tar archive contains +gzip files, each with one JSON record per line. The model of this dump differs from the one of the whole graph. +See the [alternative sub-graph data model](#alternative-sub-graph-data-model) section for details on the data model of this dump. + +## The dump of delta projects + + Dataset: https://doi.org/10.5281/zenodo.7119633 + + Schema: https://doi.org/10.5281/zenodo.4238938 + + This dataset is licensed under a Creative Commons Attribution 4.0 International License. + It contains the metadata records of projects collected by OpenAIRE in a given time frame. Usually one deposition of collected projects is done for each release of the OpenAIRE Graph. + The deposition is one tar archive containing gzip files, each with one JSON record per line. + +## The dumps about research communities, initiatives and infrastructures + + Dataset: https://doi.org/10.5281/zenodo.6638478 + + Schema: https://doi.org/10.5281/zenodo.6372977 + + This dataset is licensed under a Creative Commons Attribution 4.0 International License. +The dataset contains one file per community/initiative/infrastructure collaborating with OpenAIRE. See also their community gateways on + CONNECT. Each file is a tar archive containing gzip files with one JSON record per line. Only the communities, research initiatives and infrastructures that are visible to everyone are dumped. + The model of this dump differs from the one of the whole graph. +See the [alternative sub-graph data model](#alternative-sub-graph-data-model) section for details on the data model of this dump. 
+ + --- + + ## Alternative sub-graph data model + + The dumps for research communities, infrastructures, and products related to projects do not strictly follow the main data model of the OpenAIRE Graph. In particular, they differ in the following: + + * only research products are dumped (no relations, and no entities other than results) + * the dumped results are extended with information that can be inferred from the whole dump, namely: + * funding information, if present + * associated research community/infrastructure + * associated data sources + +They therefore have a single entity type, the [Extended Result](alternative-model/extendedresult.md). diff --git a/sidebars.js b/sidebars.js index 2358d81..aadca39 100644 --- a/sidebars.js +++ b/sidebars.js @@ -51,9 +51,19 @@ const sidebars = { href: "https://graph.openaire.eu/develop/overview.html" }, { - type: 'doc', - id: 'download' - }, + type: 'category', + label: "Downloads", + link: { + type: 'generated-index', + description: 'All resources available for download are listed below.' 
+ }, + items: [ + { type: 'doc', id: 'downloads/full-graph'}, + { type: 'doc', id: 'downloads/beginners-kit' }, + { type: 'doc', id: 'downloads/subgraphs' }, + { type: 'doc', id: 'downloads/related-datasets' }, + ] + }, { type: 'category', label: "Data provision", diff --git a/static/bibtex/OpenAIRE_Research_Graph_dump.bib b/static/bibtex/OpenAIRE_Research_Graph_dump.bib new file mode 100644 index 0000000..2bd491d --- /dev/null +++ b/static/bibtex/OpenAIRE_Research_Graph_dump.bib @@ -0,0 +1,33 @@ +@dataset{manghi_paolo_2022_6616871, + author = {Manghi, Paolo and + Atzori, Claudio and + Bardi, Alessia and + Baglioni, Miriam and + Schirrwagen, Jochen and + Dimitropoulos, Harry and + La Bruzzo, Sandro and + Foufoulas, Ioannis and + Mannocci, Andrea and + Horst, Marek and + Czerniak, Andreas and + Kiatropoulou, Katerina and + Kokogiannaki, Argiro and + De Bonis, Michele and + Artini, Michele and + Ottonello, Enrico and + Lempesis, Antonis and + Ioannidis, Alexandros and + Manola, Natalia and + Principe, Pedro}, + title = {OpenAIRE Research Graph Dump}, + month = jun, + year = 2022, + note = {{A new version of this dataset is published every 6 + months. The content available on the OpenAIRE + EXPLORE and CONNECT portals might be more up-to- + date with respect to the data you find here.}}, + publisher = {Zenodo}, + version = {4.1}, + doi = {10.5281/zenodo.6616871}, + url = {https://doi.org/10.5281/zenodo.6616871} +} \ No newline at end of file
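
Reviewer note on the Extended Result model documented under `docs/downloads/alternative-model/`: a short Python sketch may help readers see how the extra fields (`projects`, `context`, `collectedfrom`) fit together in one record. It is illustrative only and not part of this change. The function name `summarize_extended_result` and the shape of its return value are assumptions; the field names come from the JSON examples in those pages.

```python
from typing import Any, Dict, List


def summarize_extended_result(record: Dict[str, Any]) -> Dict[str, List[str]]:
    """Collect funders, grant codes, communities and sources from one record.

    Assumption: `record` follows the Extended Result examples, where `projects`,
    `context` and `collectedfrom` are lists that may be missing or empty.
    """
    projects = record.get("projects") or []
    contexts = record.get("context") or []
    sources = record.get("collectedfrom") or []

    # Funder short names, de-duplicated and sorted for a stable result.
    funders = sorted(
        {
            project["funder"]["shortName"]
            for project in projects
            if (project.get("funder") or {}).get("shortName")
        }
    )
    return {
        "funders": funders,
        "grant_codes": [p["code"] for p in projects if p.get("code")],
        "communities": [c["code"] for c in contexts if c.get("code")],
        "sources": [s["value"] for s in sources if s.get("value")],
    }
```

Returning plain lists keeps the sketch easy to feed into a counter or a data frame; a real consumer would probably also keep the `provenance` and `validated` information, which this sketch ignores.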