From 4b27dd22ae63d4606372dcece9d4d8f40d5d091d Mon Sep 17 00:00:00 2001
From: Serafeim Chatzopoulos
Date: Wed, 17 Jan 2024 11:36:15 +0200
Subject: [PATCH] Rename dump to dataset
---
 docs/apis/specification-changelog.md | 6 +++---
 docs/changelog.md | 20 +++++++++----------
 .../relationships/relationship-types.md | 2 +-
 docs/downloads/beginners-kit.md | 8 ++++----
 docs/downloads/subgraphs.md | 8 +++-----
 .../aggregation/non-compatible-sources/ebi.md | 2 +-
 .../non-compatible-sources/uniprot.md | 2 +-
 docs/publications.md | 2 +-
 .../bibtex/OpenAIRE_Research_Graph_dump.bib | 2 +-
 9 files changed, 25 insertions(+), 27 deletions(-)
diff --git a/docs/apis/specification-changelog.md b/docs/apis/specification-changelog.md
index 1b1d6f9..3ce7592 100644
--- a/docs/apis/specification-changelog.md
+++ b/docs/apis/specification-changelog.md
@@ -10,17 +10,17 @@
 | 2022-09-28T20:35:13.116653Z | updated URLs to the broker swagger UI |
 | 2022-07-28T12:02:06.271154Z | Updated list of funders supported by the API for bulk access to projects: EC Horizon Europe also included |
 | 2022-05-11T10:01:33.969973Z | New end point for researchProducts in selective access! FOS and SDG classifications available for publication requests |
-| 2022-03-29T15:03:29.583536Z | Graph dumps: add new Scholix version 4 |
+| 2022-03-29T15:03:29.583536Z | Graph dataset: add new Scholix version 4 |
 | 2021-11-12T12:04:52.900385Z | originalId parameter added |
 | 2021-10-18T15:31:18.446582Z | OAI-PMH publisher completely dismissed as announced in January 2021 |
 | 2021-10-12T07:46:48.032978Z | orcid parameter added in selective access |
 | 2021-04-08T10:28:02.371361Z | Authenticated requests to our APIs are now enabled. |
-| 2021-02-26T16:28:15.364435Z | NEWS: new dump available with research products with project funding information |
+| 2021-02-26T16:28:15.364435Z | NEWS: new dataset available with research products with project funding information |
 | 2021-02-17T07:39:46.051129Z | WIP: broker API documentation |
 | 2021-02-11T09:06:41.608115Z | Broker API documentation |
 | 2021-02-10T10:17:39.504429Z | Authentication documentation added + broker card + broker dummy page |
 | 2021-02-01T08:55:35.496938Z | OAI-PMH shutdown announced for the end of April 2021 |
-| 2021-01-15T18:56:04.748404Z | Updated documentation on OpenAIRE Research Graph dumps |
+| 2021-01-15T18:56:04.748404Z | Updated documentation on OpenAIRE Research Graph Datasets |
 | 2021-01-15T16:57:08.569766Z | Announcing the shutdown of the OAI-PMH publisher |
 | 2019-01-25T15:36:27.264313Z | Added new parameter country for research results |
 | 2018-10-17T10:39:56.570815Z | Software and Other research products are available via HTTP API. Documentation has been updated. |
diff --git a/docs/changelog.md b/docs/changelog.md
index 1608338..d0bbf59 100644
--- a/docs/changelog.md
+++ b/docs/changelog.md
@@ -20,7 +20,7 @@ This section documents all notable changes for each graph version.
 ---
 ### v7.0.0
-_Start Date: 2023-12-18 • Release Date: 2024-01-06 • Dump release: **yes**_
+_Start Date: 2023-12-18 • Release Date: 2024-01-06 • Dataset release: **yes**_
 #### Added
@@ -38,7 +38,7 @@ This graph release also introduces new indicators to identify results published
 - `result.publicly-funded (true, false)`: indicates whether or not the grants acknowledged by the publication come from public funds.
 ### v6.2.2
-_Start Date: 2023-11-07 • Release Date: 2023-11-23 • Dump release: **no**_
+_Start Date: 2023-11-07 • Release Date: 2023-11-23 • Dataset release: **no**_
 #### Added
 - Imported Opencitation's POCI dataset, containing citations among publications in PubMed
@@ -55,7 +55,7 @@ _Start Date: 2023-11-07 • Release Date: 2023-11-23 • Dump release: **n
 - Indicators regarding data source downloads and views taken by usage counts from September 2023
 ### v6.1.1
-_Start Date: 2023-09-11 • Release Date: 2023-10-15 • Dump release: **no**_
+_Start Date: 2023-09-11 • Release Date: 2023-10-15 • Dataset release: **no**_
 #### Added
 - Affiliation (result to organization) relations from Crossref
@@ -71,7 +71,7 @@ _Start Date: 2023-09-11 • Release Date: 2023-10-15 • Dump release: **n
 - OpenCitations relations from December 2022
 ### v6.0.0
-_Start Date: 2023-07-26 • Release Date: 2023-08-16 • Dump release: **yes**_
+_Start Date: 2023-07-26 • Release Date: 2023-08-16 • Dataset release: **yes**_
 #### Changed
@@ -87,7 +87,7 @@ _Start Date: 2023-07-26 • Release Date: 2023-08-16 • Dump release: **y
 ### v5.2.0
-_Start Date: 2023-07-03 • Release Date: 2023-07-17 • Dump release: **no**_
+_Start Date: 2023-07-03 • Release Date: 2023-07-17 • Dataset release: **no**_
 #### Added
 - Citations imported from Crossref & MAG
@@ -106,7 +106,7 @@ _Start Date: 2023-07-03 • Release Date: 2023-07-17 • Dump release: **n
 - Avoid duplicated organisation PIDs
 ### v5.1.3
-_Start Date: 2023-05-22 • Release Date: 2023-06-12 • Dump release: **no**_
+_Start Date: 2023-05-22 • Release Date: 2023-06-12 • Dataset release: **no**_
 #### Added
 - Datasource and project level usage counts
@@ -121,7 +121,7 @@ _Start Date: 2023-05-22 • Release Date: 2023-06-12 • Dump release: **n
 - Deduplication of the datasource
 ### v5.1.2
-_Start Date: 2023-03-20 • Release Date: 2023-04-04 • Dump release: **no**_
+_Start Date: 2023-03-20 • Release Date: 2023-04-04 • Dataset release: **no**_
 #### Changed
@@ -132,7 +132,7 @@ _Start Date: 2023-03-20 • Release Date: 2023-04-04 • Dump release: **n
 - OpenCitations relations from January 2023
 ### v5.1.1
-_Start Date: 2023-02-13 • Release Date: 2023-03-01 • Dump release: **no**_
+_Start Date: 2023-02-13 • Release Date: 2023-03-01 • Dataset release: **no**_
 #### Added
@@ -151,7 +151,7 @@ _Start Date: 2023-02-13 • Release Date: 2023-03-01 • Dump release: **n
 - OpenCitations relations from December 2022
 ### v5.1.0
-_Start Date: 2023-01-16 • Release Date: 2023-01-30 • Dump release: **no**_
+_Start Date: 2023-01-16 • Release Date: 2023-01-30 • Dataset release: **no**_
 #### Added
@@ -168,7 +168,7 @@ _Start Date: 2023-01-16 • Release Date: 2023-01-30 • Dump release: **n
 ### v5.0.0
-_Start Date: 2022-12-19 • Release Date: 2022-12-28 • Dump release: **yes**_
+_Start Date: 2022-12-19 • Release Date: 2022-12-28 • Dataset release: **yes**_
 #### Added
diff --git a/docs/data-model/relationships/relationship-types.md b/docs/data-model/relationships/relationship-types.md
index 55378b3..5fc5376 100644
--- a/docs/data-model/relationships/relationship-types.md
+++ b/docs/data-model/relationships/relationship-types.md
@@ -1,6 +1,6 @@
 # Relationship types
-The following table lists all the possible relation semantics found in the graph dump.
+The following table lists all the possible relation semantics found in the Graph Dataset.
 Note: the labels used to specify the semantic of the relationships are (for the large) inherited from the [DataCite metadata kernel](https://schema.datacite.org/meta/kernel-4.4/doc/DataCite-MetadataKernel_v4.4.pdf), which provides a description for them.
diff --git a/docs/downloads/beginners-kit.md b/docs/downloads/beginners-kit.md
index e010752..0dea0a1 100644
--- a/docs/downloads/beginners-kit.md
+++ b/docs/downloads/beginners-kit.md
@@ -4,13 +4,13 @@ sidebar_position: 2
 # Beginner's kit
-
-
 The large size of the OpenAIRE Graph is a major impediment for beginners to familiarise with the underlying data model and explore its contents. Working with the Graph in its full size typically requires access to a huge distributed computing infrastructure which cannot be easily accessible to everyone. [The OpenAIRE Beginner’s Kit](https://doi.org/10.5281/zenodo.7490191) aims to address this issue. It consists of two components:
+
+
 * A subset of the Graph composed of the research products published between 2022-06-29 and 2022-12-29, all the entities connected to them and the respective relationships.
 * A Zeppelin notebook that demonstrates how you can use PySpark to analyse the Graph and get answers to some interesting research questions. A guide to Apache Zeppelin can be found [here](https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.5/bk_zeppelin-component-guide/content/ch_overview.html).
\ No newline at end of file
diff --git a/docs/downloads/subgraphs.md b/docs/downloads/subgraphs.md
index 193e676..ed8125f 100644
--- a/docs/downloads/subgraphs.md
+++ b/docs/downloads/subgraphs.md
@@ -4,13 +4,11 @@ sidebar_position: 3
 # Sub-graph datasets
-
-
 In order to facilitate users, different datasets are available under the Zenodo community called [OpenAIRE Graph](https://zenodo.org/communities/openaire-research-graph). This page lists all alternative datasets currently available.
-
+
 ## The OpenAIRE COVID-19 dataset
diff --git a/docs/graph-production-workflow/aggregation/non-compatible-sources/ebi.md b/docs/graph-production-workflow/aggregation/non-compatible-sources/ebi.md
index f5abf7a..c80efd1 100644
--- a/docs/graph-production-workflow/aggregation/non-compatible-sources/ebi.md
+++ b/docs/graph-production-workflow/aggregation/non-compatible-sources/ebi.md
@@ -69,7 +69,7 @@ curl -s "https://www.ebi.ac.uk/europepmc/webservices/rest/MED/33024307/datalinks
 ```
 ## Mapping
-The table below describes the mapping from the EBI links records to the OpenAIRE Graph dump format.
+The table below describes the mapping from the EBI links records to the OpenAIRE Graph Dataset format.
 We filter all the target links with pid type **ena**, **pdb** or **uniprot**
 For each target we construct a Bioentity with the following mapping
diff --git a/docs/graph-production-workflow/aggregation/non-compatible-sources/uniprot.md b/docs/graph-production-workflow/aggregation/non-compatible-sources/uniprot.md
index 47fc7bc..9eeff4d 100644
--- a/docs/graph-production-workflow/aggregation/non-compatible-sources/uniprot.md
+++ b/docs/graph-production-workflow/aggregation/non-compatible-sources/uniprot.md
@@ -7,7 +7,7 @@ From this dataset, only the protein records linked to a PubMed publication are e
 ## Entity Mapping
-The table below describes the mapping from the TEXT metadata format to the OpenAIRE Graph dump format.
+The table below describes the mapping from the TEXT metadata format to the OpenAIRE Graph Dataset format.
 You can check an example of the text metadata [here](https://rest.uniprot.org/uniprotkb/A0A0C5B5G6.txt)
 | OpenAIRE Result field path | FASTA record field xpath | Notes |
diff --git a/docs/publications.md b/docs/publications.md
index c35ac13..0351065 100644
--- a/docs/publications.md
+++ b/docs/publications.md
@@ -4,7 +4,7 @@ sidebar_position: 7
 # Relevant publications
-Open Science services are open and transparent and survive thanks to your active support and to the visibility and reward they gather. If you use one of the [OpenAIRE Graph dumps](https://doi.org/10.5281/zenodo.3516917) for your research, please provide a proper citation following the recommendation that you find on the dump's Zenodo page or as provided below.
+Open Science services are open and transparent and survive thanks to your active support and to the visibility and reward they gather. If you use one of the [OpenAIRE Graph Datasets](https://doi.org/10.5281/zenodo.3516917) for your research, please provide a proper citation following the recommendation that you find on the dataset's Zenodo page or as provided below.
 :::note How to cite
diff --git a/static/bibtex/OpenAIRE_Research_Graph_dump.bib b/static/bibtex/OpenAIRE_Research_Graph_dump.bib
index 5bfc9d2..3de6fa9 100644
--- a/static/bibtex/OpenAIRE_Research_Graph_dump.bib
+++ b/static/bibtex/OpenAIRE_Research_Graph_dump.bib
@@ -21,7 +21,7 @@
 Vergoulis, Thanasis and
 Chatzopoulos, Serafeim and
 Pierrakos, Dimitris},
- title = {OpenAIRE Research Graph Dump},
+ title = {OpenAIRE Graph Dataset},
 month = dec,
 year = 2022,
 note = {{A new version of this dataset is published every 6