Update 'docs/publications.md'

Add changelog.md; add version in navbar; update readme
Static sidebar && add publications
2022-10-12 12:21:14 +02:00 · 2022-09-29 19:00:42 +03:00 · 2022-09-23 19:00:46 +03:00 · 2022-09-23 17:19:32 +03:00 · 2022-09-22 21:21:18 +03:00
25 changed files with 1464 additions and 367 deletions
--- a/README.md
+++ b/README.md
@ -2,30 +2,41 @@

 This website is built using [Docusaurus 2](https://docusaurus.io/); please check [here](https://docusaurus.io/docs/installation#requirements) the requirements to run the project.

-### Clone repository
+## Clone repository
 ```
 $ git clone https://code-repo.d4science.org/D-Net/openaire-graph-docs.git
 ```

-### Installation
+## Local installation and deployment

+To install the required packages use:
 ```
 $ npm install
 ```

-### Local Development
-
+The following command starts a local development server and opens up a browser window. Note that most changes are reflected live without having to restart the server.
 ```
 $ npm run start
 ```

-This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.
-
-### Build and deploy
+Generate the static content into the `build` directory using the command that follows. This directory can then be served using any static content hosting service.

 ```
 $ npm run build
 ```

-This command generates static content into the `build` directory and can be served using any static contents hosting service.
+## Deployment using Docker
+TODO

+## Documentation versioning 
+The versioning documentation of Docusaurus can be found [here](https://docusaurus.io/docs/versioning).
+Specifically, a new version can be created with the following command: 
+```
+npm run docusaurus docs:version <versionName>
+```
+
+When tagging a new version, the document versioning mechanism will:
+
+* Copy the full `docs/` folder contents into a new `versioned_docs/version-<versionName>/` folder.
+* Create a versioned sidebars file based on your current sidebar configuration, saved as `versioned_sidebars/version-<versionName>-sidebars.json`.
+* Append the new version number to `versions.json`.
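
The three effects listed above can be checked after tagging a version. The following is a minimal sketch, assuming the Docusaurus site root is the current directory; the helper names `versioning_artifacts` and `missing_artifacts` are hypothetical and are not part of Docusaurus.

```python
from pathlib import Path
from typing import Dict, List


def versioning_artifacts(version_name: str, site_root: Path = Path(".")) -> Dict[str, Path]:
    """Return the paths that tagging `version_name` is expected to create or update."""
    return {
        # Copy of the full docs/ folder for this version
        "docs_copy": site_root / "versioned_docs" / f"version-{version_name}",
        # Sidebars file derived from the current sidebar configuration
        "sidebars": site_root / "versioned_sidebars" / f"version-{version_name}-sidebars.json",
        # File to which the new version number is appended
        "versions_file": site_root / "versions.json",
    }


def missing_artifacts(version_name: str, site_root: Path = Path(".")) -> List[str]:
    """List the expected artifacts that do not exist on disk."""
    return [
        name
        for name, path in versioning_artifacts(version_name, site_root).items()
        if not path.exists()
    ]
```

Calling `missing_artifacts("<versionName>")` from the site root after running the command should return an empty list if all three steps took place.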
--- a/docs/changelog.md
+++ b/docs/changelog.md
@ -0,0 +1,6 @@
+---
+sidebar_position: 12
+---
+
+# Changelog
+<span className="todo">TODO</span>
--- a/docs/data-model/data-model.md
+++ b/docs/data-model/data-model.md
@ -15,12 +15,9 @@ Its main entities are described in brief below:
 * [Data Sources](entities/data-source) are the resources used to collect metadata for the graph objects
 * [Organizations](entities/organization) correspond to companies or research institutions involved in projects,
 responsible for operating data sources or consisting the affiliations of Product creators.
-<!-- * [Funders](entities/funder) (e.g. EC, Wellcome Trust) are agencies responsible for a list of Funding Streams. -->
-<!-- * [Funding Streams](entities/funding-stream) represent investments (funding actions) from Funders (e.g. FP7 or H2020). -->
 * [Projects](entities/project) are research projects funded by a Funding Stream of a Funder.
 * [Communities](entities/community) are groups of people with a common research intent.

-
 :::note Further reading

 A detailed report on the OpenAIRE Research Graph Data Model can be found on [Zenodo](https://zenodo.org/record/2643199).
--- a/docs/data-model/entities/community.md
+++ b/docs/data-model/entities/community.md
@ -2,7 +2,81 @@
 sidebar_position: 6
 ---

+# Community

-# Community (Initiative)
-<span className="todo">TODO</span>
+Communities are groups of people with a common research intent and can be of two types, research initiatives or research communities:

+* Research initiatives capture a "research impact"-oriented view of the information space, i.e. all products generated due to my research initiative;
+* Research communities capture a "research activity"-oriented view, i.e. all products that may be of interest or related to my research initiative.
+
+For example, the organizations supporting a research infrastructure fall in the first category, while the researchers involved in a discipline fall in the second.
+
+## The `Community` object 
+
+### id
+_Type: String &bull; Cardinality: ONE_
+
+The OpenAIRE id for the community/research infrastructure, created according to the [OpenAIRE entity identifier and PID mapping policy](entity-identifiers).
+
+```json
+"id": "00|context_____::5b7f9fa40bdc12072249204cedfa7808"
+```
+
+### acronym
+_Type: String &bull; Cardinality: ONE_
+
+The acronym of the community.
+
+```json
+"acronym": "covid-19"
+```
+
+### description
+_Type: String &bull; Cardinality: ONE_
+
+Description of the research community/research infrastructure.
+
+```json
+"description": "This portal provides access to publications, research data, projects and software that may be relevant to the Corona Virus Disease (COVID-19). The OpenAIRE COVID-19 Gateway aggregates COVID-19 related records, links them and provides a single access point for discovery and navigation. We tag content from the OpenAIRE Research Graph (10,000+ data sources) and additional sources. All COVID-19 related research results are linked to people, organizations and projects, providing a contextualized navigation."
+```
+
+### name
+_Type: String &bull; Cardinality: ONE_
+
+The long name of the community.
+
+```json
+"name": "Corona Virus Disease"
+```
+
+### subject
+_Type: String &bull; Cardinality: MANY_
+
+The list of the subjects associated to the research community (only applies to research communities).
+
+```json
+"subject": [
+    "COVID19",
+    "SARS-CoV",
+    "HCoV-19",
+    ...
+]
+```
+
+### type
+_Type: String &bull; Cardinality: ONE_
+
+The type of the community; one of `{ Research Community, Research infrastructure }`.
+
+```json
+"type": "Research Community"
+```
+
+### zenodo_community
+_Type: String &bull; Cardinality: ONE_
+
+The URL of the Zenodo community associated to the Research community/Research infrastructure.
+
+```json
+"zenodo_community": "https://zenodo.org/communities/covid-19"
+```
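
The field types and cardinalities described above can be checked mechanically. The following is a minimal sketch of such a check; the function name `validate_community` and the shape of the returned messages are hypothetical, and only the field names, cardinalities and the two allowed `type` values are taken from the descriptions above.

```python
from typing import Any, Dict, List

# Allowed values of `type`, as listed in the field description
ALLOWED_TYPES = {"Research Community", "Research infrastructure"}
# Fields with cardinality ONE and type String
SINGLE_VALUED = ("id", "acronym", "description", "name", "type", "zenodo_community")


def validate_community(record: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in a Community record."""
    problems = []
    for field in SINGLE_VALUED:
        if field in record and not isinstance(record[field], str):
            problems.append(f"{field}: expected a single string")
    # `subject` has cardinality MANY, so it must be a list of strings
    if "subject" in record:
        subject = record["subject"]
        if not isinstance(subject, list) or not all(isinstance(s, str) for s in subject):
            problems.append("subject: expected a list of strings")
    # Only check the vocabulary when the value is a string
    value = record.get("type")
    if isinstance(value, str) and value not in ALLOWED_TYPES:
        problems.append(f"type: unexpected value {value!r}")
    return problems


example = {
    "id": "00|context_____::5b7f9fa40bdc12072249204cedfa7808",
    "acronym": "covid-19",
    "name": "Corona Virus Disease",
    "subject": ["COVID19", "SARS-CoV", "HCoV-19"],
    "type": "Research Community",
    "zenodo_community": "https://zenodo.org/communities/covid-19",
}
print(validate_community(example))
```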
--- a/docs/data-model/entities/data-source.md
+++ b/docs/data-model/entities/data-source.md
@ -4,10 +4,9 @@ sidebar_position: 2

 # Data source

-OpenAIRE entity instances are created out of data collected from various data sources of different kinds, such as publication repositories, dataset archives, CRIS systems, funder databases, etc. Data sources export information packages (e.g., XML records, HTTP responses, RDF data, JSON) that may contain information on one or more of such entities and possibly relationships between them. For example, a metadata record about a project carries information for the creation of a Project entity and its participants (as Organization entities). It is important, once each piece of information is extracted from such packages and inserted into the OpenAIRE information space as an entity, for such pieces to keep provenance information relative to the originating data source. This is to give visibility to the data source, but also to enable the reconstruction of the very same piece of information if problems arise.
+OpenAIRE entity instances are created out of data collected from various data sources of different kinds, such as publication repositories, dataset archives, CRIS systems, funder databases, etc. Data sources export information packages (e.g., XML records, HTTP responses, RDF data, JSON) that may contain information on one or more of such entities and possibly relationships between them. 

-
-<span className="todo">Definitions for the re3data specific elements from: https://gfzpublic.gfz-potsdam.de/rest/items/item_758898_6/component/file_775891/content</span>
+For example, a metadata record about a project carries information for the creation of a Project entity and its participants (as Organization entities). It is important, once each piece of information is extracted from such packages and inserted into the OpenAIRE information space as an entity, for such pieces to keep provenance information relative to the originating data source. This is to give visibility to the data source, but also to enable the reconstruction of the very same piece of information if problems arise.

 --- 

@ -16,127 +15,280 @@ OpenAIRE entity instances are created out of data collected from various data so
 ### id
 _Type: String &bull; Cardinality: ONE_

-Main entity identifier, created according to [OpenAIRE_entity_identifier_and_PID_mapping_policy](https://support.openaire.eu/projects/docs/wiki/OpenAIRE_entity_identifier_and_PID_mapping_policy).
+The OpenAIRE id of the data source, created according to the [OpenAIRE entity identifier and PID mapping policy](entity-identifiers).
+
+```json
+"id": "10|issn___print::22c514d022b199c346e7f29ca06efc95"
+```

 ### originalId
 _Type: String &bull; Cardinality: MANY_

-The list of original ids associated to the datasource.
+The list of original identifiers associated to the datasource.
+
+```json
+"originalId": [
+    "issn___print::2451-8271",
+    ...
+]
+```

 ### pid
+
 _Type: [ControlledField](other#controlledfield) &bull; Cardinality: MANY_

 The persistent identifiers for the datasource.

+```json
+"pid": [
+    {
+        "scheme": "DOI",
+        "value": "10.5281/zenodo.4707307" 
+    },
+    ...
+]
+```
+
 ### datasourcetype
 _Type: [ControlledField](other#controlledfield) &bull; Cardinality: ONE_

-The datasource type (e.g. pubsrepository::institutional, Institutional Repository) as in the vocabulary [dnet:datasource_typologies](https://api.openaire.eu/vocabularies/dnet:datasourceCompatibilityLevel).
+The datasource type; see the vocabulary [dnet:datasource_typologies](https://api.openaire.eu/vocabularies/dnet:datasource_typologies).
+
+```json
+"datasourcetype": {
+    "scheme": "pubsrepository::journal",
+    "value": "Journal"
+}
+```

 ### openairecompatibility
 _Type: String &bull; Cardinality: ONE_

-The OpenAIRE compatibility of the ingested results, indicates which guidelines they are compliant to the vocabulary [dnet:datasourceCompatibilityLevel](https://api.openaire.eu/vocabularies/dnet:datasourceCompatibilityLevel).
+The OpenAIRE compatibility of the ingested results, indicating which guidelines they comply with, according to the vocabulary [dnet:datasourceCompatibilityLevel](https://api.openaire.eu/vocabularies/dnet:datasourceCompatibilityLevel).
+
+```json
+"openairecompatibility": "collected from a compatible aggregator"
+```

 ### officialname
 _Type: String &bull; Cardinality: ONE_

 The official name of the datasource.

+```json
+"officialname": "Recent Patents and Topics on Medical Imaging"
+```
+
 ### englishname
 _Type: String &bull; Cardinality: ONE_

 The English name of the datasource.

+```json
+"englishname": "Recent Patents and Topics on Medical Imaging"
+```
+
 ### websiteurl
 _Type: String &bull; Cardinality: ONE_

 The URL of the website of the datasource.

+```json
+"websiteurl": "http://dspace.unict.it/"
+```
+
 ### logourl
 _Type: String &bull; Cardinality: ONE_

 The URL of the logo for the datasource.

+```json
+"logourl": "https://impactum-journals.uc.pt/public/journals/26/pageHeaderLogoImage_en_US.png"
+```
+
 ### dateofvalidation
 _Type: String &bull; Cardinality: ONE_

-The date of validation against the guidelines for the datasource records.
+The date of validation against the OpenAIRE guidelines for the datasource records.
+
+```json
+"dateofvalidation": "2016-10-10"
+```

 ### description
 _Type: String &bull; Cardinality: ONE_

 The description for the datasource.

-### subjects
-_Type: String &bull; Cardinality: ONE_
+```json
+"description": "Recent Patents on Medical Imaging publishes review and research articles, and guest edited single-topic issues on recent patents in the field of medical imaging. It provides an important and reliable source of current information on developments in the field. The journal is essential reading for all researchers involved in Medical Imaging."
+```

-The subjects of the contents provided by the datasource.
+### subjects
+_Type: String &bull; Cardinality: MANY_
+
+List of subjects associated to the datasource.
+
+```json
+"subjects": [
+    "Medicine",
+    "Imaging",
+    ...
+]
+```

 ### languages
 _Type: String &bull; Cardinality: MANY_

-The languages of the contents provided by the datasource (OpenDOAR only).
+The languages present in the data source's content, as defined by OpenDOAR.
+
+```json
+"languages":[ 
+    "eng",
+    ...
+]
+```

 ### contenttypes
 _Type: String &bull; Cardinality: MANY_

-The typologies of the contents provided by the datasource (OpenDOAR only).
+Types of content in the data source, as defined by OpenDOAR.
+
+```json
+"contenttypes": [
+    "Journal articles",
+    ...
+]
+```

 ### releasestartdate
 _Type: String &bull; Cardinality: ONE_

-<span className="todo">TODO</span>
+Release date of the data source, as defined by re3data.org.
+
+```json
+"releasestartdate": "2010-07-24"
+```

 ### releaseenddate
 _Type: String &bull; Cardinality: ONE_

-<span className="todo">TODO</span>
+Date when the data source went offline or stopped ingesting new research data, as defined by re3data.org.
+
+```json
+"releaseenddate": "2016-03-28"
+```

 ### accessrights
 _Type: String &bull; Cardinality: ONE_

-Open, restricted or closed.
+Type of access to the data source, as defined by re3data.org. Possible values: `{ open, restricted, closed }`.
+
+```json
+"accessrights": "open"
+```

 ### uploadrights
 _Type: String &bull; Cardinality: ONE_

-Open, restricted or closed.
+Type of data upload, as defined by re3data.org; one of `{ open, restricted, closed }`.
+
+```json
+"uploadrights": "closed"
+```

 ### databaseaccessrestriction
 _Type: String &bull; Cardinality: ONE_

-All existing access restrictions to the research data repository. Allowed values are: feeRequired, registration, other (re3data only).
+Access restrictions to the research data repository. Allowed values are: `{ feeRequired, registration, other }`.
+
+This field only applies to re3data data sources; see the [re3data schema specification](https://gfzpublic.gfz-potsdam.de/rest/items/item_758898_6/component/file_775891/content) for more details.
+
+```json
+"databaseaccessrestriction": "registration"
+```

 ### datauploadrestriction
 _Type: String &bull; Cardinality: ONE_

-All existing restrictions to the data upload. (re3data only).
+Upload restrictions applied by the datasource, as defined by re3data.org. One of `{ feeRequired, registration, other }`.
+
+This field only applies to re3data data sources; see the [re3data schema specification](https://gfzpublic.gfz-potsdam.de/rest/items/item_758898_6/component/file_775891/content) for more details.
+
+```json
+"datauploadrestriction": "feeRequired registration"
+```

 ### versioning
 _Type: Boolean &bull; Cardinality: ONE_

-The research data repository supports versioning of research data. (re3data only).
+Whether the research data repository supports versioning:
+`true` if the data source supports versioning, `false` otherwise.
+
+This field only applies to re3data data sources; see the [re3data schema specification](https://gfzpublic.gfz-potsdam.de/rest/items/item_758898_6/component/file_775891/content) for more details.
+
+```json
+"versioning": true
+```

 ### citationguidelineurl
 _Type: String &bull; Cardinality: ONE_

-The URL of the research data repository providing information on how to cite its research data. The DataCite citation format is recommended (http://www.datacite.org/whycitedata). (re3data only)
+The URL of the data source providing information on how to cite its items. The DataCite citation format is recommended (http://www.datacite.org/whycitedata). 
+
+This field only applies to re3data data sources; see the [re3data schema specification](https://gfzpublic.gfz-potsdam.de/rest/items/item_758898_6/component/file_775891/content) for more details.
+
+```json
+"citationguidelineurl": "https://physionet.org/about/#citation"
+```

 ### pidsystems
 _Type: String &bull; Cardinality: ONE_

+The persistent identifier system that is used by the data source. As defined by re3data.org.
+
+```json
+"pidsystems": "hdl"
+```
+
 ### certificates
 _Type: String &bull; Cardinality: ONE_

-<span className="todo">TODO</span>
+The certificate, seal or standard the data source complies with. As defined by re3data.org.
+
+```json
+"certificates": "WDS"
+```

 ### policies
 _Type: String &bull; Cardinality: MANY_

-<span className="todo">TODO</span>
+Policies of the data source, as defined in OpenDOAR.

 ### journal
 _Type: [Container](other#container) &bull; Cardinality: ONE_

-<span className="todo">TODO</span>
+Information about the journal, if this data source is of type Journal.
+
+```json
+"journal": {
+    "edition": "",
+    "iss": "5",
+    "issnLinking": "",
+    "issnOnline": "1873-7625",
+    "issnPrinted":"2451-8271",
+    "name": "Recent Patents and Topics on Imaging",
+    "sp": "12",
+    "ep": "22",
+    "vol": "50"
+}
+```
+
+### missionstatementurl
+_Type: String &bull; Cardinality: ONE_
+
+The URL of a mission statement describing the designated community of the data source, as defined by re3data.org.
+
+```json
+"missionstatementurl": "https://www.sigma2.no/content/nird-research-data-archive"
+```
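
Several of the re3data-related fields above take values from small closed sets. The following is a minimal sketch of a check against those sets; the function name `invalid_values` is hypothetical, and treating a restriction field as a whitespace-separated list of values is an assumption based only on the `"feeRequired registration"` example above.

```python
from typing import Any, Dict, List

# Value sets as listed in the field descriptions
RIGHTS = {"open", "restricted", "closed"}
RESTRICTIONS = {"feeRequired", "registration", "other"}


def invalid_values(record: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each controlled field to the values that fall outside its allowed set."""
    problems: Dict[str, List[str]] = {}
    for field in ("accessrights", "uploadrights"):
        value = record.get(field)
        if value is not None and value not in RIGHTS:
            problems[field] = [value]
    for field in ("databaseaccessrestriction", "datauploadrestriction"):
        value = record.get(field)
        if value is not None:
            # Assumption: several restrictions may be packed into one string
            bad = [token for token in value.split() if token not in RESTRICTIONS]
            if bad:
                problems[field] = bad
    return problems


example = {
    "accessrights": "open",
    "uploadrights": "closed",
    "databaseaccessrestriction": "registration",
    "datauploadrestriction": "feeRequired registration",
}
print(invalid_values(example))
```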
--- a/docs/data-model/entities/entity-identifiers.md
+++ b/docs/data-model/entities/entity-identifiers.md
@ -4,5 +4,36 @@ sidebar_position: 8

 # OpenAIRE entity identifier and PID mapping policy

-https://support.openaire.eu/projects/docs/wiki/OpenAIRE_entity_identifier_and_PID_mapping_policy
-<span className="todo">TODO: include this here? it referenced by many other pages</span>
+OpenAIRE assigns internal identifiers for each object it collects.
+By default, the internal identifier is generated as `sourcePrefix::md5(localId)` where:
+
+* `sourcePrefix` is a namespace prefix of 12 chars assigned to the data source at registration time
+* `localId` is the identifier assigned to the object by the data source
+
+After years of operation, we can say that:
+
+* `localId` values are unstable
+* objects can disappear from sources
+* PIDs provided by sources that are not PID agencies (authoritative sources for a specific type of PID) are often wrong (e.g. pre-print with the DOI of the published version, DOIs with typos)
+
+Therefore, when the record is collected from an authoritative source:
+
+* the identity of the record is forged using the PID, like `pidTypePrefix::md5(lowercase(doi))`
+* the PID is added in a `pid` element of the data model
+
+When the record is collected from a source which is not authoritative for any type of PID:
+* the identity of the record is forged as usual using the local identifier
+* the PID, if available, is added as `alternateIdentifier`
+
+Currently, the following data sources are used as "PID authorities":
+
+| PID Type 	| Prefix (12 chars) 	| Authority                             	|
+|----------	|-------------------	|---------------------------------------	|
+| doi      	| `doi_________`      	| Crossref, Datacite, Zenodo            	|
+| pmc      	| `pmc_________`      	| Europe PubMed Central, PubMed Central 	|
+| pmid     	| `pmid________`      	| Europe PubMed Central, PubMed Central 	|
+| arXiv    	| `arXiv_______`      	| arXiv.org e-Print Archive             	|
+| handle   	| `handle______`      	| any repository                        	|
+
+OpenAIRE also performs duplicate identification (see the [dedicated section for details](../../data-provision/deduplication/)).
+All duplicates are **merged** together in a **representative record** which must be assigned a dedicated OpenAIRE identifier (i.e. it cannot have the identifier of one of the aggregated records).
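
The two identifier schemes described above, `sourcePrefix::md5(localId)` and `pidTypePrefix::md5(lowercase(doi))`, can be sketched as follows. This is a minimal illustration and not the OpenAIRE implementation: the function names are hypothetical, UTF-8 encoding of the hashed string is an assumption, and underscore padding to 12 characters is inferred from the prefixes shown in the table.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hex MD5 digest of the text (UTF-8 encoding is an assumption)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def pad_prefix(prefix: str, width: int = 12) -> str:
    """Right-pad a prefix with underscores to 12 chars, e.g. doi -> doi_________."""
    if len(prefix) > width:
        raise ValueError(f"prefix longer than {width} characters: {prefix!r}")
    return prefix.ljust(width, "_")


def identifier_from_local_id(source_prefix: str, local_id: str) -> str:
    """Default scheme: sourcePrefix::md5(localId)."""
    return f"{pad_prefix(source_prefix)}::{md5_hex(local_id)}"


def identifier_from_doi(doi: str) -> str:
    """Authoritative-source scheme: pidTypePrefix::md5(lowercase(doi))."""
    return f"{pad_prefix('doi')}::{md5_hex(doi.lower())}"


# The DOI is lowercased before hashing, so case variants map to the same identifier
print(identifier_from_doi("10.1016/J.RESPOL.2021.104226"))
print(identifier_from_doi("10.1016/j.respol.2021.104226"))
```

Because the PID-based identifier depends only on the normalized PID, the same object collected from different authoritative sources ends up with the same identifier, whereas the default scheme changes whenever the source or its local identifier changes.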
--- a/docs/data-model/entities/organization.md
+++ b/docs/data-model/entities/organization.md
@ -14,35 +14,81 @@ Organizations include companies, research centers or institutions involved as pr
 ### id
 _Type: String &bull; Cardinality: ONE_

-Main entity identifier, created according to [OpenAIRE_entity_identifier_and_PID_mapping_policy](https://support.openaire.eu/projects/docs/wiki/OpenAIRE_entity_identifier_and_PID_mapping_policy).
+The OpenAIRE id for the organization, created according to the [OpenAIRE entity identifier and PID mapping policy](entity-identifiers).
+
+```json
+"id": "20|openorgs____::b84450f9864182c67b8611b5593f4250"
+```

 ### legalshortname
 _Type: String &bull; Cardinality: ONE_

 The legal name in short form of the organization.

+```json
+"legalshortname": "ARC"
+```
+
 ### legalname
 _Type: String &bull; Cardinality: ONE_

 The legal name of the organization.

+```json
+"legalname": "Athena Research and Innovation Center In Information Communication & Knowledge Technologies"
+```
+
 ### alternativenames
 _Type: String &bull; Cardinality: MANY_

-The alternative names of the organization.
+Alternative names that identify the organization.
+
+```json
+"alternativenames": [
+    "Athena Research and Innovation Center In Information Communication & Knowledge Technologies",
+    "Athena RIC",
+    "ARC",
+    ...
+]
+```

 ### websiteurl
 _Type: String &bull; Cardinality: ONE_

 The URL of the organization's website.

+```json
+"websiteurl": "https://www.athena-innovation.gr/el/announce/pressreleases.html"
+```
+
 ### country
 _Type: [Country](other#country) &bull; Cardinality: ONE_

 The country where the organization is located.

+```json
+"country":{
+    "code": "GR",
+    "label": "Greece"
+}
+```
+
 ### pid
 _Type: [OrganizationPid](other#organizationpid) &bull; Cardinality: MANY_

 The list of persistent identifiers for the organization.

+```json
+"pid": [
+    {
+        "scheme": "ISNI",
+        "value": "0000 0004 0393 5688"
+    },
+    {
+        "scheme": "GRID",
+        "value": "grid.19843.37"
+    },
+    ...
+]
+```
--- a/docs/data-model/entities/other.md
+++ b/docs/data-model/entities/other.md
--- a/docs/data-model/entities/project.md
+++ b/docs/data-model/entities/project.md
@ -13,19 +13,159 @@ Of crucial interest to OpenAIRE is also the identification of the funders (e.g.
 ### id 
 _Type: String &bull; Cardinality: ONE_

-Main entity identifier, created according to [OpenAIRE_entity_identifier_and_PID_mapping_policy](https://support.openaire.eu/projects/docs/wiki/OpenAIRE_entity_identifier_and_PID_mapping_policy).
+Main entity identifier, created according to the [OpenAIRE entity identifier and PID mapping policy](entity-identifiers).
+
+```json
+"id": "40|corda__h2020::70ea22400fd890c5033cb31642c4ae68"
+```

 ### code
 _Type: String &bull; Cardinality: ONE_

 The grant agreement code of the project.

+```json
+"code": "777541"
+```
+
 ### acronym
 _Type: String &bull; Cardinality: ONE_

 Project's acronym.

+```json
+"acronym": "OpenAIRE-Advance"
+```
+
 ### title
 _Type: String &bull; Cardinality: ONE_

 Project's title.
+
+```json
+"title": "OpenAIRE Advancing Open Scholarship"
+```
+
+### callidentifier
+_Type: String &bull; Cardinality: ONE_
+
+The identifier of the research call.
+
+```json
+"callidentifier": "H2020-EINFRA-2017"
+```
+
+### funding
+_Type: [Funding](other#funding) &bull; Cardinality: MANY_
+
+Funding information for the project.
+
+```json
+"funding": [
+    {
+        "funding_stream": {
+            "description": "Horizon 2020 Framework Programme - Research and Innovation action",
+            "id": "EC::H2020::RIA"
+        },
+        "jurisdiction": "EU",
+        "name": "European Commission",
+        "shortName": "EC"
+    }
+]
+```
+### granted
+_Type: [Grant](other#grant) &bull; Cardinality: ONE_
+
+The money granted to the project.
+
+```json
+"granted": {
+    "currency": "EUR",
+    "fundedamount": 1.0E7,
+    "totalcost": 1.0E7
+}
+```
+
+### h2020programme
+_Type: [H2020Programme](other#h2020programme) &bull; Cardinality: MANY_
+
+The H2020 programme funding the project.
+
+```json
+"h2020programme":[
+    {
+        "code": "H2020-EU.1.4.1.3.",
+        "description": "Development, deployment and operation of ICT-based e-infrastructures"
+    }
+]
+```
+### keywords
+_Type: String &bull; Cardinality: ONE_
+
+```json
+"keywords": [
+    "Open Science",
+    ...
+]
+```
+
+### openaccessmandatefordataset
+_Type: Boolean &bull; Cardinality: ONE_
+
+```json
+"openaccessmandatefordataset": true
+```
+
+### openaccessmandateforpublications
+_Type: Boolean &bull; Cardinality: ONE_
+
+```json
+"openaccessmandateforpublications": true
+```
+
+### startdate
+_Type: String &bull; Cardinality: ONE_
+
+The start date of the project.
+
+```json
+"startdate": "2018-01-01"
+```
+
+### enddate
+_Type: String &bull; Cardinality: ONE_
+
+The end date of the project.
+
+```json
+"enddate": "2021-02-28"
+```
+
+### subject
+_Type: String &bull; Cardinality: MANY_
+
+The subjects of the project.
+
+```json
+"subject": [
+    "Data and Distributed Computing e-infrastructures for Open Science",
+    ...
+]
+```
+### summary
+_Type: String &bull; Cardinality: ONE_
+
+Short summary of the project.
+
+```json
+"summary": "OpenAIRE-Advance continues the mission of OpenAIRE to support the Open Access/Open Data mandates in Europe. By sustaining the current successful infrastructure, comprised of a human network and robust technical services, it consolidates its achievements while working to shift the momentum among its communities to Open Science, aiming to be a trusted e-Infrastructurewithin the realms of the European Open Science Cloud.In this next phase, OpenAIRE-Advance strives to empower its National Open Access Desks (NOADs) so they become a pivotal part within their own national data infrastructures, positioningOA and open science onto national agendas. The capacity building activities bring together experts ontopical task groups in thematic areas(open policies, RDM, legal issues, TDM), promoting a train the trainer approach, strengthening and expanding the pan-European Helpdesk with support and training toolkits, training resources and workshops.It examines key elements of scholarly communication, i.e., co-operative OA publishing and next generation repositories, to develop essential building blocks of the scholarly commons.On the technical level OpenAIRE-Advance focuses on the operation and maintenance of the OpenAIRE technical TRL8/9 services,and radically improvesthe OpenAIRE services on offer by: a) optimizing their performance and scalability, b) refining their functionality based on end-user feedback, c) repackagingthem into products, taking a professional marketing approach  with well-defined KPIs, d)consolidating the range of services/products into a common e-Infra catalogue to enable a wider uptake.OpenAIRE-Advancesteps up its outreach activities with concrete pilots with three major RIs,citizen science initiatives, and innovators via a rigorous Open Innovation programme. Finally, viaits partnership with COAR, OpenAIRE-Advance consolidatesOpenAIRE’s global roleextending its collaborations with Latin America, US, Japan, Canada, and Africa."
+```
+
+### websiteurl
+_Type: String &bull; Cardinality: ONE_
+
+The website of the project.
+
+```json
+"websiteurl": "https://www.openaire.eu/advance/"
+```
--- a/docs/data-model/entities/result.md
+++ b/docs/data-model/entities/result.md
@ -20,8 +20,11 @@ Moreover, there are the following sub-types of a `Result`, that inherit all its
 ### id
 _Type: String &bull; Cardinality: ONE_

-Main entity identifier, created according to 
-<span className="todo">[OpenAIRE entity identifier and PID mapping policy](https://support.openaire.eu/projects/docs/wiki/OpenAIRE_entity_identifier_and_PID_mapping_policy)</span>.
+Main entity identifier, created according to the [OpenAIRE entity identifier and PID mapping policy](entity-identifiers).
+
+```json
+"id": "50|doi_dedup___::80f29c8c8ba18c46c88a285b7e739dc3"
+```

 ### type
 _Type: String  &bull; Cardinality: ONE_
@ -35,102 +38,293 @@ Type of the result. Possible types:

 as declared in the terms from the [dnet:result_typologies vocabulary](https://api.openaire.eu/vocabularies/dnet:result_typologies).

+```json
+"type": "publication"
+```
+
 ### originalId
 _Type: String &bull; Cardinality: MANY_

 Identifiers of the record at the original sources.

+```json
+"originalId": [
+    "oai:pubmedcentral.nih.gov:8024784",
+    "S0048733321000305",
+    "10.1016/j.respol.2021.104226",
+    "3136742816"
+]
+```
+
 ### maintitle
 _Type: String &bull; Cardinality: ONE_

 A name or title by which a scientific result is known. May be the title of a publication, of a dataset or the name of a piece of software.

+```json
+"maintitle": "The fall of the innovation empire and its possible rise through open science"
+```
+
 ### subtitle
+
 _Type: String &bull; Cardinality: ONE_

 Explanatory or alternative name by which a scientific result is known.

+```json
+"subtitle": "An analysis of cases from 1980 - 2020"
+```
+
 ### author
 _Type: [Author](other#author) &bull; Cardinality: MANY_

 The main researchers involved in producing the data, or the authors of the publication.

+```json
+"author": [
+    {
+        "fullname": "E. Richard Gold",
+        "rank": 1, 
+        "name": "Richard",
+        "surname": "Gold",
+        "pid": {
+            "id": {
+                "scheme": "orcid",
+                "value": "0000-0002-3789-9238" 
+            },
+            "provenance": {
+                "provenance": "Harvested",
+                "trust": "0.9" 
+            }
+        }
+    }, 
+    ...
+]
+```
 ### bestaccessright
 _Type: [BestAccessRight](other#bestaccessright) &bull; Cardinality: ONE_

 The most open access right associated to the manifestations of this research result.

+```json
+"bestaccessright": {
+    "code": "c_abf2",
+    "label": "OPEN",
+    "scheme": "http://vocabularies.coar-repositories.org/documentation/access_rights/"
+}
+```
+
 ### contributor
 _Type: String &bull; Cardinality: MANY_

 The institution or person responsible for collecting, managing, distributing, or otherwise contributing to the development of the resource.

+```json
+"contributor": [
+    "University of Zurich",
+    "Wright, Aidan G C",
+    "Hallquist, Michael", 
+    ...
+]
+```
+
 ### country
 _Type: [ResultCountry](other#resultcountry) &bull; Cardinality: MANY_

 The country associated with the result, i.e. the country of the organisation that manages the institutional repository, national aggregator or CRIS system from which this record was collected.
 The countries of the authors' affiliations can be found instead in the affiliation relation.

+```json
+"country": [
+    {
+        "code": "CH",
+        "label": "Switzerland",
+        "provenance": {
+            "provenance": "Inferred by OpenAIRE",
+            "trust": "0.85"
+        }
+    }, 
+    ...
+]
+```
+
 ### coverage
 _Type: String &bull; Cardinality: MANY_
-<span className="todo">TODO</span>

 ###  dateofcollection
 _Type: String &bull; Cardinality: ONE_

 The date when OpenAIRE last collected the record.
-<span className="todo">TODO: we should indicate the used date format</span>
+
+```json
+"dateofcollection": "2021-06-09T11:37:56.248Z"
+```

 ### description
 _Type: String &bull; Cardinality: MANY_

 A brief description of the resource and the context in which the resource was created.

+```json
+"description": [
+    "Open science partnerships (OSPs) are one mechanism to reverse declining efficiency. OSPs are public-private partnerships that openly share publications, data and materials.",
+    "There is growing concern that the innovation system's ability to create wealth and attain social benefit is declining in effectiveness. This article explores the reasons for this decline and suggests a structure, the open science partnership, as one mechanism through which to slow down or reverse this decline.",
+    "The article examines the empirical literature of the last century to document the decline. This literature suggests that the cost of research and innovation is increasing exponentially, that researcher productivity is declining, and, third, that these two phenomena have led to an overall flat or declining level of innovation productivity.", 
+    ...
+]
+```
+
 ### embargoenddate
 _Type: String &bull; Cardinality: ONE_

-Date when the embargo ends and this result turns Open Access. <span className="todo">TODO: we should indicate the used date format</span>
+Date when the embargo ends and this result turns Open Access.
+
+```json
+"embargoenddate": "2017-01-01"
+```

 ### instance
 _Type: [Instance](other#instance) &bull; Cardinality: MANY_

-Specific materialization or version of the result. For example, you can have one result with three instances: one is the pre-print, one is the post-print, one is the published version
+Specific materialization or version of the result. For example, you can have one result with three instances: one is the pre-print, one is the post-print, one is the published version.
+
+```json
+"instance": [
+    {
+        "accessright": {
+            "code": "c_abf2",
+            "label": "OPEN",
+            "openAccessRoute": "gold",
+            "scheme": "http://vocabularies.coar-repositories.org/documentation/access_rights/"
+        },
+        "alternateIdentifier": [
+            {
+                "scheme": "doi",
+                "value": "10.1016/j.respol.2021.104226"
+            },
+            ...
+        ],
+        "articleprocessingcharge": {
+            "amount": "4063.93",
+            "currency": "EUR"
+        },
+        "license": "http://creativecommons.org/licenses/by-nc/4.0",
+        "measures":[
+            { 
+                "key": "influence",
+                "value": "6.45335454246e-09"
+            },
+            ...
+        ],
+        "pid": [
+            {
+                "scheme": "pmc",
+                "value": "PMC8024784"
+            },
+            ...
+        ],
+        
+        "publicationdate": "2021-01-01",
+        "refereed": "UNKNOWN",
+        "type": "Article",
+        "url": [
+            "http://europepmc.org/articles/PMC8024784"
+        ]
+    },
+    ...
+]
+```

 ### language
 _Type: [Language](other#language) &bull; Cardinality: ONE_

-The `alpha-3/ISO 639-2` code of the language. Values controlled by the [dnet:languages vocabulary](https://api.openaire.eu/vocabularies/dnet:languages)
+The alpha-3/ISO 639-2 code of the language. Values controlled by the [dnet:languages vocabulary](https://api.openaire.eu/vocabularies/dnet:languages).

+```json
+"language": {
+    "code": "eng",
+    "label": "English"
+}
+```
 ### lastupdatetimestamp
 _Type: Long &bull; Cardinality: ONE_

 Timestamp of last update of the record in OpenAIRE.

+```json
+"lastupdatetimestamp": 1652722279987
+```
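The value above has 13 digits, which suggests milliseconds since the Unix epoch. Under that assumption (the text only states that the type is Long), a minimal sketch for turning it into a readable date could look as follows; `lastUpdateToIso` is a hypothetical helper name, not part of any OpenAIRE API.

```javascript
// Sketch only: assumes lastupdatetimestamp is expressed in epoch milliseconds.
function lastUpdateToIso(lastupdatetimestamp) {
  // Date accepts epoch milliseconds; toISOString() always renders in UTC.
  return new Date(lastupdatetimestamp).toISOString();
}

console.log(lastUpdateToIso(1652722279987));
```

If the timestamps turned out to be in seconds instead, the value would have to be multiplied by 1000 before being passed to `Date`.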
+
 ### pid
 _Type: [ResultPid](other#resultpid) &bull; Cardinality: MANY_

-Persistent identifiers of the result. See also <span className="todo">[OpenAIRE entity identifier and PID mapping policy](https://support.openaire.eu/projects/docs/wiki/OpenAIRE_entity_identifier_and_PID_mapping_policy)</span> to learn more.
+Persistent identifiers of the result. See also the [OpenAIRE entity identifier and PID mapping policy](entity-identifiers) to learn more.
+
+```json
+"pid": [
+    {
+        "scheme": "pmc",
+        "value": "PMC8024784"
+    },
+    {
+        "scheme": "doi",
+        "value": "10.1016/j.respol.2021.104226"
+    },
+    ...
+]
+```

 ### publicationdate
 _Type: String &bull; Cardinality: ONE_

 Main date of the research product: typically the publication or issued date. If a research result has several versions with different dates, the most frequent well-formatted date is selected. If no such date is available, the most recent complete date among the well-formatted ones is selected. For statistics, the year is extracted and the result is counted only among the results of that year. Example: given a pre-print date of 2019-02-03, an article date of 2020-02 provided by the repository, and an article date of 2020 provided by Crossref, OpenAIRE sets the date to 2019-02-03, because it is the most recent among the complete and well-formed dates. If the repository then updates the metadata and sets a complete date (e.g. 2020-02-12), this becomes the new date for the result, because it is now the most recent complete date. However, if OpenAIRE then collects the pre-print from another repository with date 2019-02-03, this becomes the “winning date”, because it is now the most frequent well-formatted date.

+```json
+"publicationdate": "2021-03-18"
+```
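The selection rule described for `publicationdate` can be sketched roughly as below. This is an illustration under stated assumptions, not the OpenAIRE implementation: `selectPublicationDate` is a hypothetical name, only full `YYYY-MM-DD` strings are treated as complete candidates, and ties in frequency are broken by taking the most recent date.

```javascript
// Hypothetical helper, not the actual OpenAIRE code.
// Assumption: a "complete, well-formatted" date is a full YYYY-MM-DD string.
const COMPLETE_DATE = /^\d{4}-\d{2}-\d{2}$/;

function selectPublicationDate(dates) {
  // Count how often each complete date occurs across the versions.
  const counts = new Map();
  for (const d of dates) {
    if (COMPLETE_DATE.test(d)) {
      counts.set(d, (counts.get(d) || 0) + 1);
    }
  }
  if (counts.size === 0) return null; // no complete date available

  // Keep the most frequent dates; if several share the top count,
  // fall back to the most recent one (ISO dates sort lexicographically).
  const maxCount = Math.max(...counts.values());
  const candidates = [...counts.keys()].filter((d) => counts.get(d) === maxCount);
  candidates.sort();
  return candidates[candidates.length - 1];
}

// The three situations from the description above:
console.log(selectPublicationDate(["2019-02-03", "2020-02", "2020"]));
console.log(selectPublicationDate(["2019-02-03", "2020-02-12", "2020"]));
console.log(selectPublicationDate(["2019-02-03", "2020-02-12", "2020", "2019-02-03"]));
```

Partial dates such as `2020-02` are simply ignored here; whether the real pipeline counts them as well-formatted for the frequency step is not stated in the text.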
+
 ### publisher
 _Type: String &bull; Cardinality: ONE_

 The name of the entity that holds, archives, publishes prints, distributes, releases, issues, or produces the resource.

+```json
+"publisher": "Elsevier, North-Holland Pub. Co"
+```
+
 ### source
 _Type: String &bull; Cardinality: MANY_

 A related resource from which the described resource is derived. See definition of Dublin Core field [dc:source](https://www.dublincore.org/specifications/dublin-core/dcmi-terms/elements11/source).

+```json
+"source": [
+      "Research Policy",
+      "Crossref",
+      ...
+]
+```
+
 ### subjects
 _Type: [Subject](other#subject) &bull; Cardinality: MANY_

 Subject, keyword, classification code, or key phrase describing the resource.

+```json
+"subjects": [
+    {
+        "provenance": {
+            "provenance": "Harvested",
+            "trust": "0.9"
+        },
+        "subject": {
+            "scheme": "keyword",
+            "value": "Open science"
+        }
+    },
+    ...
+]
+```
 --- 

 ## Sub-types
@ -139,59 +333,127 @@ There are the following sub-types of `Result`. Each inherits all its fields and

 ### Publication

+Metadata records about research literature (includes types of publications listed [here](http://api.openaire.eu/vocabularies/dnet:result_typologies/publication)).
+
 #### container 
 _Type: [Container](other#container) &bull; Cardinality: ONE_

 Container has information about the conference or journal where the result has been presented or published.

+```json
+"container": {
+    "edition": "",
+    "iss": "5",
+    "issnLinking": "",
+    "issnOnline": "1873-7625",
+    "issnPrinted": "0048-7333",
+    "name": "Research Policy",
+    "sp": "12",
+    "ep": "22",
+    "vol": "50"
+}
+```
 ### Dataset

+Metadata records about research data (includes the subtypes listed [here](http://api.openaire.eu/vocabularies/dnet:result_typologies/dataset)).
+
 #### size
 _Type: String &bull; Cardinality: ONE_

-The size of the dataset.
+The declared size of the dataset.
+
+```json
+"size": "10129818"
+```

 #### version
 _Type: String &bull; Cardinality: ONE_

 The version of the dataset.

+```json
+"version": "v1.3"
+```
+
 #### geolocation
 _Type: [GeoLocation](other#geolocation) &bull; Cardinality: MANY_

 The list of geolocations associated with the dataset.

+```json
+"geolocation": [
+    {
+        "box": "18.569386 54.468973  18.066832 54.83707",
+        "place": "Tübingen, Baden-Württemberg, Southern Germany",
+        "point": "7.72486 50.1084"
+    },
+    ...
+]
+```
+
 ### Software

+Metadata records about research software (includes the subtypes listed [here](http://api.openaire.eu/vocabularies/dnet:result_typologies/software)).
+
 #### documentationUrl
 _Type: String &bull; Cardinality: MANY_

 The URLs to the software documentation. 

+```json
+"documentationUrl": [ 
+    "https://github.com/openaire/iis/blob/master/README.markdown",
+    ...
+]
+```
+
 #### codeRepositoryUrl
 _Type: String &bull; Cardinality: ONE_

-The URL to the repository holding the source code,
+The URL to the repository with the source code.
+
+```json
+"codeRepositoryUrl": "https://github.com/openaire/iis"
+```

 #### programmingLanguage
 _Type: String &bull; Cardinality: ONE_

 The programming language.

+```json
+"programmingLanguage": "Java"
+```
+
 ### Other research product

+Metadata records about research products that cannot be classified as research literature, data or software (includes types of products listed [here](http://api.openaire.eu/vocabularies/dnet:result_typologies/other)).
+
 #### contactperson
 _Type: String &bull; Cardinality: MANY_

-The contact person for this ORP.
+Information on the person responsible for providing further information regarding the resource.
+
+```json
+"contactperson": [
+    "Noémie Dominguez",
+    ...    
+]
+```

 #### contactgroup
 _Type: String &bull; Cardinality: MANY_

-The information for the contact group.
+Information on the group responsible for providing further information regarding the resource.
+
+```json
+"contactgroup": [
+    "Networked Multimedia Information Systems (NeMIS)",
+    ...
+]
+```

 #### tool
 _Type: String &bull; Cardinality: MANY_

 Information about tools useful for the interpretation and/or re-use of the research product.
-
--- a/docs/data-model/relationships.md
+++ b/docs/data-model/relationships.md
@ -15,31 +15,66 @@ _Type: [Node](#the-node-object) &bull; Cardinality: ONE_

 Represents the source node in the relation.

+```json
+"source": {
+    "id": "20|openorgs____::1cb75a3ad756e4c83e455e3e7347643b",
+    "type": "organization"
+}
+```
+
 ### target
 _Type: [Node](#the-node-object) &bull; Cardinality: ONE_

 Represents the target node in the relation.

+```json
+"target": {
+    "id": "10|doajarticles::022409068174087a003647ff46070f7f",
+    "type": "datasource"
+}
+```
+
 ### reltype
 _Type: [RelType](#the-reltype-object) &bull; Cardinality: ONE_

 Represents the semantics of the relation between two nodes of the graph.

+```json
+"reltype": {
+    "name": "provides",
+    "type": "provision"
+}
+```
 ### provenance
 _Type: [Provenance](entities/other#provenance-1) &bull; Cardinality: ONE_

 Indicates the process that produced (or provided) the information.

+```json
+"provenance": {
+    "provenance": "Harvested",
+    "trust": "0.900"
+}
+```
+
 ### validated
 _Type: Boolean &bull; Cardinality: ONE_

 Indicates whether or not the relation was validated.

+```json
+"validated": true
+```
+
 ### validationDate
 _Type: String &bull; Cardinality: ONE_

 Indicates the validation date of the relation - applies only when the validated flag is set to true.

+```json
+"validationDate": "2022-09-02"
+```
+
 --- 

 ## The `Node` object
@ -52,11 +87,18 @@ _Type: String &bull; Cardinality: ONE_

 OpenAIRE identifier of the node in the graph.

+```json
+"id": "10|doajarticles::022409068174087a003647ff46070f7f"
+```
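Judging only from the two identifiers shown in the examples, an OpenAIRE node identifier appears to consist of a two-digit prefix, a `|` separator, a namespace, a `::` separator, and a local part. A rough sketch for splitting it, assuming that layout holds in general (the function and field names are hypothetical):

```javascript
// Sketch only: the "<2 digits>|<namespace>::<local id>" layout is inferred
// from the example identifiers, not taken from a formal specification.
function parseNodeId(id) {
  const match = /^(\d{2})\|([^:]+)::(.+)$/.exec(id);
  if (!match) return null; // the identifier does not follow the assumed layout
  const [, typePrefix, namespace, localId] = match;
  return { typePrefix, namespace, localId };
}

console.log(parseNodeId("10|doajarticles::022409068174087a003647ff46070f7f"));
```

Returning `null` instead of throwing keeps the helper usable when scanning records whose identifiers may not match the assumed pattern.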
+    
 ### type
 _Type: String &bull; Cardinality: ONE_

 Graph node type.

+```json
+"type": "datasource"
+```

 ## The `RelType` object

@ -67,19 +109,25 @@ _Type: String &bull; Cardinality: ONE_

 Relation category, e.g. affiliation, citation, see table Relation typologies.

+```json
+"type": "provision"
+```
+
 ### name
 _Type: String &bull; Cardinality: ONE_

 Further specifies the relation semantics, indicating the relation direction, e.g. Cites, isCitedBy.

-
+```json
+"name": "provides"
+```
 --- 

 ## Relationship types

 The following table lists all the possible relation semantics found in the graph dump.

-|  # | source entity type |  target entity type |  relType.type |         relType.name        |    relType.name (inverse)    |
+|  # | Source entity type |  Target entity type |  Relation type |         Relation name        |    Inverse relation name    |
 |:--:|:------------------:|:-------------------:|:-------------:|:---------------------------:|:----------------------------:|
 | 1  | [Project](entities/project)            | [Result](entities/result)              | outcome       | produces                    | isProducedBy                 |
 | 2  | [Result](entities/result)             | [Organization](entities/organization)        | affiliation   | hasAuthorInstitution        | isAuthorInstitutionOf        |
@ -90,9 +138,9 @@ The following table lists all the possible relation semantics found in the graph
 | 7  | [Data source](entities/data-source)        | [Organization](entities/organization)        | provision     | provides                    | isProvidedBy                 |
 | 8  | [Result](entities/result)             | [Data source](entities/data-source)         | provision     | isHostedBy                  | hosts                        |
 | 9  | [Result](entities/result)             | [Data source](entities/data-source)         | provision     | isProvidedBy                | provides                     |
-| 10 | [Result](entities/result)             | [CommunityInitiative](entities/community) | relationship  | isRelatedTo                 | isRelatedTo                  |
-| 11 | [Organization](entities/organization)       | [CommunityInitiative](entities/community) | relationship  | isRelatedTo                 | isRelatedTo                  |
-| 12 | [Data source](entities/data-source)        | [CommunityInitiative](entities/community) | relationship  | isRelatedTo                 | isRelatedTo                  |
-| 13 | [Project](entities/project)            | [CommunityInitiative](entities/community) | relationship  | isRelatedTo                 | isRelatedTo                  |
+| 10 | [Result](entities/result)             | [Community](entities/community) | relationship  | isRelatedTo                 | isRelatedTo                  |
+| 11 | [Organization](entities/organization)       | [Community](entities/community) | relationship  | isRelatedTo                 | isRelatedTo                  |
+| 12 | [Data source](entities/data-source)        | [Community](entities/community) | relationship  | isRelatedTo                 | isRelatedTo                  |
+| 13 | [Project](entities/project)            | [Community](entities/community) | relationship  | isRelatedTo                 | isRelatedTo                  |
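
The name and inverse-name columns of the table can be used to derive the opposite direction of a relation record. The sketch below is illustrative: `INVERSE_NAME` and `invertRelation` are hypothetical names, and the lookup covers only the relation names visible in the rows above, so a complete mapping would need the remaining rows of the table.

```javascript
// Partial lookup built from the table rows shown above (not exhaustive).
const INVERSE_NAME = {
  produces: "isProducedBy",
  isProducedBy: "produces",
  hasAuthorInstitution: "isAuthorInstitutionOf",
  isAuthorInstitutionOf: "hasAuthorInstitution",
  provides: "isProvidedBy",
  isProvidedBy: "provides",
  isHostedBy: "hosts",
  hosts: "isHostedBy",
  isRelatedTo: "isRelatedTo", // symmetric relation
};

// Hypothetical helper: swaps source and target and replaces the relation name
// with its inverse, leaving the relation type and other fields unchanged.
function invertRelation(relation) {
  const inverseName = INVERSE_NAME[relation.reltype.name];
  if (inverseName === undefined) {
    throw new Error(`No inverse known for relation name: ${relation.reltype.name}`);
  }
  return {
    ...relation,
    source: relation.target,
    target: relation.source,
    reltype: { ...relation.reltype, name: inverseName },
  };
}

const relation = {
  source: { id: "20|openorgs____::1cb75a3ad756e4c83e455e3e7347643b", type: "organization" },
  target: { id: "10|doajarticles::022409068174087a003647ff46070f7f", type: "datasource" },
  reltype: { name: "provides", type: "provision" },
};
console.log(invertRelation(relation).reltype.name);
```

The original record is not mutated; a new object is returned so both directions can be kept side by side.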


--- a/docs/data-provision/aggregation.md
+++ b/docs/data-provision/aggregation.md
@ -4,11 +4,13 @@ sidebar_position: 1

 # Aggregation

-OpenAIRE collects metadata records from a variety of content providers as described in https://www.openaire.eu/aggregation-and-content-provision-workflows.
+OpenAIRE collects metadata records from a variety of content providers as described in the [aggregation and content provision workflows](https://www.openaire.eu/aggregation-and-content-provision-workflows).

 OpenAIRE aggregates metadata records describing objects of the research life-cycle from content providers compliant to the [OpenAIRE guidelines](https://guidelines.openaire.eu/) and from entity registries (i.e. data sources offering authoritative lists of entities, like OpenDOAR, re3data, DOAJ, and funder databases). After collection, metadata are transformed according to the OpenAIRE internal metadata model, which is used to generate the final OpenAIRE Research Graph that you can access from the OpenAIRE portal and the APIs.

 The transformation process includes the application of cleaning functions whose goal is to ensure that values are harmonised according to a common format (e.g. dates as YYYY-MM-dd) and, whenever applicable, to a common controlled vocabulary. The controlled vocabularies used for cleansing are accessible at http://api.openaire.eu/vocabularies. Each vocabulary features a set of controlled terms, each with one code, one label, and a set of synonyms. If a synonym is found as field value, the value is updated with the corresponding term. Also, the OpenAIRE Research Graph is extended with other relevant scholarly communication sources that are too big to be integrated via the “normal” aggregation mechanism: DOIBoost (which merges Crossref, ORCID, Microsoft Academic Graph, and Unpaywall), and ScholeXplorer, one of the Scholix hubs offering a large set of links between research literature and data.
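
The synonym-based cleaning step described above can be sketched as follows. This is a simplified illustration, not the OpenAIRE code: the vocabulary entries are invented for the example (only the code/label/synonyms structure comes from the text), the function names are hypothetical, and case-insensitive matching is an assumption.

```javascript
// Illustrative vocabulary: each term has one code, one label, and a set of synonyms.
// The entries below are made up for this sketch.
const languageVocabulary = [
  { code: "eng", label: "English", synonyms: ["en", "english", "anglais"] },
  { code: "ita", label: "Italian", synonyms: ["it", "italian", "italiano"] },
];

// Build a lookup from every accepted spelling (code, label, synonym) to its term.
function buildSynonymIndex(vocabulary) {
  const index = new Map();
  for (const term of vocabulary) {
    for (const key of [term.code, term.label, ...term.synonyms]) {
      index.set(key.trim().toLowerCase(), term);
    }
  }
  return index;
}

// Replace a raw field value with the controlled term, or return null if unknown.
function cleanValue(value, index) {
  const term = index.get(String(value).trim().toLowerCase());
  return term ? { code: term.code, label: term.label } : null;
}

const index = buildSynonymIndex(languageVocabulary);
console.log(cleanValue(" Anglais ", index));
console.log(cleanValue("klingon", index));
```

What happens to values that match no term is not specified in the text; returning `null` here is just one possible choice.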


-![Aggregation](./assets/aggregation.png)
+<p align="center">
+    <img loading="lazy" alt="Aggregation" src="/img/docs/aggregation.png" width="65%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
+</p>
--- a/docs/data-provision/data-provision.md
+++ b/docs/data-provision/data-provision.md
@ -1,11 +1,7 @@
 # Data provision 

-
-<span className="todo">source: https://graph.openaire.eu/about#tabs_card</span>
-
 OpenAIRE collects metadata records from more than 70K scholarly communication sources from all over the world, including Open Access institutional repositories, data archives, journals. All the metadata records (i.e. descriptions of research products) are put together in a data lake, together with records from Crossref, Unpaywall, ORCID, Grid.ac, and information about projects provided by national and international funders. Dedicated inference algorithms applied to metadata and to the full-texts of Open Access publications enrich the content of the data lake with links between research results and projects, author affiliations, subject classification, links to entries from domain-specific databases. Duplicated organisations and results are identified and merged together to obtain an open, trusted, public resource enabling explorations of the scholarly communication landscape like never before.

-![Architecture](./assets/architecture.png)
-
-<span className="todo">TODO: make this image linkable</span>
-
+<p align="center">
+    <img loading="lazy" alt="Data provision" src="/img/docs/architecture.png" width="80%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
+</p>
--- a/docs/data-provision/deduplication/clustering-functions.md
+++ b/docs/data-provision/deduplication/clustering-functions.md
@ -2,7 +2,6 @@
 sidebar_position: 3
 ---
 # Clustering functions 
-<span className="todo">TODO</span>

 ## NgramPairs
 It produces a list of concatenations of a pair of ngrams generated from different words.<br />
--- a/docs/data-provision/deduplication/deduplication.md
+++ b/docs/data-provision/deduplication/deduplication.md
@ -1,7 +1,5 @@
 # Deduplication

-<span className="todo">TODO: intro</span>
-
 ## Clustering 

 Clustering is a common heuristic used to overcome the N x N complexity required to match all pairs of objects to identify the equivalent ones. The challenge is to identify a clustering function that maximizes the chance of comparing only records that may lead to a match, while minimizing the number of equivalent records that will not be matched. Since the equivalence function is to some level tolerant to minimal errors (e.g. switching of characters in the title, or minimal difference in letters), we need this function to be not too precise (e.g. a hash of the title), but also not too flexible (e.g. random ngrams of the title). On the other hand, reality tells us that in some cases equality of two records can only be determined by their PIDs (e.g. DOI) as the metadata properties are very different across different versions and no clustering function will ever bring them into the same cluster. To match these requirements, OpenAIRE clustering for products works with two functions:
--- a/docs/data-provision/deduplication/research-products.md
+++ b/docs/data-provision/deduplication/research-products.md
@ -33,7 +33,9 @@ Cross comparison of the pid lists (in the `pid` and `alternateid` elements). If
 Otherwise, check whether the number of authors and the title version are equal. If so, compute the Levenshtein distance on titles with a higher threshold (0.99).
 The publications are matched as duplicates if the distance is higher than the threshold; in every other case they are considered distinct publications.

-![Example banner](../assets/dedup-results.png)
+<p align="center">
+    <img loading="lazy" alt="Deduplication workflow" src="/img/docs/dedup-results.png" width="80%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
+</p>

 #### Creation of representative record
 <span className="todo">TODO</span>
--- a/docs/data-provision/enrichment/enrichment.md
+++ b/docs/data-provision/enrichment/enrichment.md
@ -1,8 +1,5 @@
 # Enrichment

-
-<span className="todo">TODO: intro</span>
-
 ## Mining

 The OpenAIRE Research Graph is enriched by links mined by OpenAIRE’s full-text mining algorithms that scan the plaintexts of publications for funding information, references to datasets, software URIs, accession numbers of bioentities, and EPO patent mentions. Custom mining modules also link research objects to specific research communities, initiatives and infrastructures. In addition, other inference modules provide content-based document classification, document similarity, citation matching, and author affiliation matching.
--- a/docs/intro.md
+++ b/docs/intro.md
@ -4,7 +4,7 @@ id: intro
 sidebar_position: 1
 ---

-# Welcome! 
+# Overview 

 The OpenAIRE Research Graph is one of the largest open scholarly record collections worldwide, key in fostering Open Science and establishing its practices in the daily research activities.
 Conceived as a public and transparent good, populated out of data sources trusted by scientists, the Graph aims at bringing discovery, monitoring, and assessment of science back in the hands of the scientific community.
@ -21,6 +21,5 @@ As of today, the OpenAIRE Research Graph aggregates around 450Mi metadata record
 * Microsoft Academic Graph
 * Datacite

-After cleaning, deduplication, enrichment and full-text mining processes, the graph is analysed to produce statistics for the [OpenAIRE MONITOR](https://monitor.openaire.eu), the [Open Science Observatory](https://osobservatory.openaire.eu), made discoverable via the [OpenAIRE EXPLORE](https://explore.openaire.eu) and programmatically accessible as described at 
-<span className="todo">https://develop.openaire.eu</span>.
-Json dumps are also published on Zenodo.
+After cleaning, deduplication, enrichment and full-text mining processes, the graph is analysed to produce statistics for the [OpenAIRE MONITOR](https://monitor.openaire.eu), the [Open Science Observatory](https://osobservatory.openaire.eu), made discoverable via the [OpenAIRE EXPLORE](https://explore.openaire.eu) and programmatically accessible via [OpenAIRE Public APIs](https://develop.openaire.eu).
+Last but not least, frequently updated [JSON dumps](download) are published on Zenodo.
--- a/docs/publications.md
+++ b/docs/publications.md
@ -2,5 +2,71 @@
 sidebar_position: 7
 ---

-# Related publications
-<span className="todo">TODO</span>
+# How to cite
+
+Open Science services are open and transparent and survive thanks to your active support and to the visibility and reward they gather. If you use one of the [OpenAIRE Research Graph dumps](https://zenodo.org/record/6616871) for your research, please provide a proper citation following the recommendation that you find on the dump's Zenodo page. 
+
+## Relevant research products
+
+### Aggregation system
+
+Manghi, P., Artini, M., Atzori, C., Bardi, A., Mannocci, A., La Bruzzo, S., Candela, L., Castelli, D. and Pagano, P. (2014), “The D-NET software toolkit: A framework for the realization, maintenance, and operation of aggregative infrastructures”, Program: electronic library and information systems, Vol. 48 No. 4, pp. 322-354. [doi:10.1108/prog-08-2013-0045](http://doi.org/10.1108/prog-08-2013-0045)
+
+Atzori, C., Bardi, A., Manghi, P., & Mannocci, A. (2017, January). "The OpenAIRE workflows for data management". In Italian Research Conference on Digital Libraries (pp. 95-107). Springer, Cham. [doi:10.1007/978-3-319-68130-6_8](https://doi.org/10.1007/978-3-319-68130-6_8)
+
+*Software* Michele Artini, Claudio Atzori, Alessia Bardi, Sandro La Bruzzo, Paolo Manghi, & Andrea Mannocci. (2016, November 24). "The D-NET software toolkit: dnet-basic-aggregator (Version 1.3.0)". Zenodo. [doi:10.5281/zenodo.168356](https://doi.org/10.5281/zenodo.168356) <i className="fa-solid fa-arrow-up-right-from-square"></i>
+
+Mannocci, A., & Manghi, P. (2016, September). "DataQ: a data flow quality monitoring system for aggregative data infrastructures". In International Conference on Theory and Practice of Digital Libraries (pp. 357-369). Springer, Cham. [doi:10.1007/978-3-319-43997-6_28](https://doi.org/10.1007/978-3-319-43997-6_28)
+
+### Deduplication
+
+Vichos K., De Bonis M., Kanellos I., Chatzopoulos S., Atzori C., Manola N., Manghi P., Vergoulis T. (Feb. 2022), "A preliminary assessment of the article deduplication algorithm used for the OpenAIRE Research Graph". IRCDL 2022 - 18th Italian Research Conference on Digital Libraries, Padua, Italy. CEUR-WS Proceedings. [http://ceur-ws.org/Vol-3160](http://ceur-ws.org/Vol-3160/) 
+
+De Bonis, M., Manghi, P., & Atzori, C. (2022). "FDup: a framework for general-purpose and efficient entity deduplication of record collections". PeerJ Computer Science, 8, e1058. [https://peerj.com/articles/cs-1058](https://peerj.com/articles/cs-1058)
+
+Manghi, P., Atzori, C., De Bonis, M., & Bardi, A. (2020). "Entity deduplication in big data graphs for scholarly communication". Data Technologies and Applications. [doi:10.1108/dta-09-2019-0163](https://doi.org/10.1108/dta-09-2019-0163)
+
+
+Atzori, C., Manghi, P., & Bardi, A. (2018, December). "GDup: de-duplication of scholarly communication big graphs". In 2018 IEEE/ACM 5th International Conference on Big Data Computing Applications and Technologies (BDCAT) (pp. 142-151). IEEE. [doi:10.1109/bdcat.2018.00025](https://doi.org/10.1109/bdcat.2018.00025)
+
+*Software* Claudio Atzori, & Paolo Manghi. (2017, February 17). "GDup: a big graph entity deduplication system" (Version 4.0.5). Zenodo. [doi:10.5281/zenodo.292980](https://doi.org/10.5281/zenodo.292980)
+
+Atzori, Claudio. "GDup: an Integrated, Scalable Big Graph Deduplication System." (2016). [doi:10.5281/zenodo.1454879](https://doi.org/10.5281/zenodo.1454879)
+
+Manghi, Paolo, Marko Mikulicic, and Claudio Atzori. "De-duplication of aggregation authority files." International Journal of Metadata, Semantics and Ontologies 7.2 (2012): 114-130. [doi:10.1504/ijmso.2012.050014](https://doi.org/10.1504/ijmso.2012.050014)
+
+Manghi, P., & Mikulicic, M. (2011, October). "PACE: A general-purpose tool for authority control". In Research Conference on Metadata and Semantic Research (pp. 80-92). Springer, Berlin, Heidelberg. [doi:10.1007/978-3-642-24731-6_8](https://doi.org/10.1007/978-3-642-24731-6_8)
+
+### Mining
+
+Giannakopoulos T., Foufoulas Y., Dimitropoulos H., Manola N. (2019) “Interactive Text Analysis and Information Extraction”. In: Manghi P., Candela L., Silvello G. (eds) Digital Libraries: Supporting Open Science. IRCDL 2019. Communications in Computer and Information Science, vol 988. Springer, Cham. [doi:10.1007/978-3-030-11226-4_27](https://doi.org/10.1007/978-3-030-11226-4_27)
+
+Foufoulas Y., Stamatogiannakis L., Dimitropoulos H., Ioannidis Y. (2017) “High-Pass Text Filtering for Citation Matching”. In: Kamps J., Tsakonas G., Manolopoulos Y., Iliadis L., Karydis I. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2017. Lecture Notes in Computer Science, vol 10450. Springer, Cham. [doi:10.1007/978-3-319-67008-9_28](https://doi.org/10.1007/978-3-319-67008-9_28)
+
+Y. Chronis, Y. Foufoulas, V. Nikolopoulos, A. Papadopoulos, L. Stamatogiannakis, C. Svingos, Y. E. Ioannidis, "A Relational Approach to Complex Dataflows", in Workshop Proceedings of the EDBT/ICDT 2016 (MEDAL 2016) Joint Conference (March 15, 2016, Bordeaux, France) on CEUR-WS.org (ISSN 1613-0073) [http://ceur-ws.org/Vol-1558/paper45.pdf](http://ceur-ws.org/Vol-1558/paper45.pdf)
+
+T. Giannakopoulos, I. Foufoulas, E. Stamatogiannakis, H. Dimitropoulos, N. Manola, and Y. Ioannidis. 2015. “Visual-Based Classification of Figures from Scientific Literature”. In Proceedings of the 24th International Conference on World Wide Web (WWW '15 Companion). Association for Computing Machinery, New York, NY, USA, 1059–1060. [doi:10.1145/2740908.2742024](https://doi.org/10.1145/2740908.2742024)
+
+Giannakopoulos, T., Foufoulas, I., Stamatogiannakis, E., Dimitropoulos, H., Manola, N., & Ioannidis, Y. (2014). “Discovering and Visualizing Interdisciplinary Content Classes in Scientific Publications”. D-Lib Mag., Volume 20, Number 11/12. [doi:10.1045/november14-giannakopoulos](https://doi.org/10.1045/november14-giannakopoulos)
+
+Giannakopoulos T., Stamatogiannakis E., Foufoulas I., Dimitropoulos H., Manola N., Ioannidis Y. (2014) “Content Visualization of Scientific Corpora Using an Extensible Relational Database Implementation”. In: Bolikowski Ł., Casarosa V., Goodale P., Houssos N., Manghi P., Schirrwagen J. (eds) Theory and Practice of Digital Libraries -- TPDL 2013 Selected Workshops. TPDL 2013. Communications in Computer and Information Science, vol 416. Springer, Cham. [doi:10.1007/978-3-319-08425-1_10](https://doi.org/10.1007/978-3-319-08425-1_10) 
+
+Giannakopoulos T., Dimitropoulos H., Metaxas O., Manola N., Ioannidis Y. (2013) “Supervised Content Visualization of Scientific Publications: A Case Study on the ArXiv Dataset”. In: Kłopotek M.A., Koronacki J., Marciniak M., Mykowiecka A., Wierzchoń S.T. (eds) Language Processing and Intelligent Information Systems. IIS 2013. Lecture Notes in Computer Science, vol 7912. Springer, Berlin, Heidelberg. [doi:10.1007/978-3-642-38634-3_23](https://doi.org/10.1007/978-3-642-38634-3_23)
+ 
+Tkaczyk, D., Szostek, P., Fedoryszak, M. et al. "CERMINE: automatic extraction of structured metadata from scientific literature". IJDAR 18, 317–335 (2015). [doi:10.1007/s10032-015-0249-8](https://doi.org/10.1007/s10032-015-0249-8)
+
+M. Kobos, Ł. Bolikowski, M. Horst, P. Manghi, N. Manola, J. Schirrwagen (2014) “Information inference in scholarly communication infrastructures: the OpenAIREplus project experience”, Procedia Computer Science 38, 92-99. [doi:10.1016/j.procs.2014.10.016](https://doi.org/10.1016/j.procs.2014.10.016)
+
+### Portals
+
+Baglioni M. et al. (2019) "The OpenAIRE Research Community Dashboard: On Blending Scientific Workflows and Scientific Publishing". In: Doucet A., Isaac A., Golub K., Aalberg T., Jatowt A. (eds) Digital Libraries for Open Knowledge. TPDL 2019. Lecture Notes in Computer Science, vol 11799. Springer, Cham. [doi:10.1007/978-3-030-30760-8_5](https://doi.org/10.1007/978-3-030-30760-8_5)
+
+### Broker Service
+
+Manghi, P., Atzori, C., Bardi, A., La Bruzzo, S., & Artini, M. (2016, February). "Realizing a Scalable and History-Aware Literature Broker Service for OpenAIRE". In Italian Research Conference on Digital Libraries (pp. 92-103). Springer, Cham. [doi:10.1007/978-3-319-56300-8_9](https://doi.org/10.1007/978-3-319-56300-8_9)
+
+Artini, M., Atzori, C., Bardi, A., La Bruzzo, S., Manghi, P., & Mannocci, A. (2015). "The OpenAIRE literature broker service for institutional repositories". D-Lib Magazine, 21(11/12), 1. [doi:10.1045/november2015-artini](https://doi.org/10.1045/november2015-artini)
+
+
+
+
--- a/docusaurus.config.js
+++ b/docusaurus.config.js
@ -91,15 +91,19 @@ const config = {
            type: 'doc',
            docId: 'intro',
            position: 'left',
-            label: 'Research graph',
+            label: 'Research graph v5.0',
          },
+          // 
+          // documentation version in the navbar
          // {
-          //   type: 'doc',
-          //   docId: 'intro',
-          //   position: 'left',
-          //   label: 'docs',
+          //   type: 'docsVersionDropdown', 
+          //   position: 'right'
          // },
+          // 
+          // link to blog, the blog must be enabled first
          // {to: '/blog', label: 'Blog', position: 'left'},
+          // 
+          // link to github repo
          // {
          //   href: 'https://github.com/facebook/docusaurus',
          //   label: 'GitHub',
--- a/sidebar-utils.js
+++ b/sidebar-utils.js
@ -11,8 +11,8 @@ function filterItems(items, itemsToFilter) {
  
    // filter out items in current level
    return result.filter( item => !itemsToFilter.includes(item.id) );
-  }
+}

-  module.exports = {
+module.exports = {
    filterItems
-  };
+};
--- a/sidebars.js
+++ b/sidebars.js
@ -13,19 +13,106 @@

 /** @type {import('@docusaurus/plugin-content-docs').SidebarsConfig} */
 const sidebars = {
-  // By default, Docusaurus generates a sidebar from the docs folder structure
-  tutorialSidebar: [{type: 'autogenerated', dirName: '.'}],
-
-  // But you can create a sidebar manually
-  /*
-  tutorialSidebar: [
+  mySidebar: [
    {
-      type: 'category',
-      label: 'Tutorial',
-      items: ['hello'],
+      type: 'doc', 
+      id: 'intro'
    },
-  ],
-   */
+    {
+      type: 'category', 
+      label: "Data model",
+      link: {type: 'doc', id: 'data-model/data-model'},
+      items: [
+        {
+          type: 'category', 
+          label: "Entities",
+          link: {
+            type: 'generated-index',
+            description: 'The main entities of the OpenAIRE Research Graph are listed below.'
+          },
+          items: [
+            { type: 'doc', id: 'data-model/entities/result' },
+            { type: 'doc', id: 'data-model/entities/data-source' },
+            { type: 'doc', id: 'data-model/entities/organization' },
+            { type: 'doc', id: 'data-model/entities/project' },
+            { type: 'doc', id: 'data-model/entities/community' },
+          ]
+        }, 
+        {
+          type: 'doc', 
+          id: 'data-model/relationships'
+        }
+      ]
+    },
+    {
+      type: "link",
+      label: "Public API",
+      href: "https://graph.openaire.eu/develop/overview.html"
+    },
+    {
+      type: 'doc', 
+      id: 'download'
+    },
+    {
+      type: 'category', 
+      label: "Data provision",
+      link: {type: 'doc', id: 'data-provision/data-provision'},
+      items: [
+        { type: 'doc', id: 'data-provision/aggregation' },
+        {
+          type: 'category', 
+          label: "Deduplication",
+          link: {type: 'doc', id: 'data-provision/deduplication/deduplication'},
+          items: [
+            { type: 'doc', id: 'data-provision/deduplication/research-products' },
+            { type: 'doc', id: 'data-provision/deduplication/organizations' },
+          ]
+        }, 
+        {
+          type: 'category', 
+          label: "Enrichment",
+          link: {type: 'doc', id: 'data-provision/enrichment/enrichment'},
+          items: [
+            { type: 'doc', id: 'data-provision/enrichment/mining' },
+            { type: 'doc', id: 'data-provision/enrichment/impact-scores' },
+          ]
+        },
+        { type: 'doc', id: 'data-provision/post-cleaning' },
+        { type: 'doc', id: 'data-provision/indexing' },
+        { type: 'doc', id: 'data-provision/stats' },
+      ]
+    },
+    {
+      type: 'doc', 
+      id: 'services'
+    },
+    {
+      type: 'category', 
+      label: "Learning center",
+      link: { type: 'generated-index' },
+      items: [
+        { type: 'doc', id: 'learning-center/open-plato' },
+        { type: 'doc', id: 'learning-center/tutorials' },
+      ]
+    },
+    {
+      type: 'doc', 
+      id: 'publications',
+      label: "Relevant publications"
+    },
+    {
+      type: 'doc', 
+      id: 'faq'
+    },
+    {
+      type: 'doc', 
+      id: 'license'
+    },  
+    {
+      type: 'doc', 
+      id: 'changelog'
+    },
+  ]
 };

 module.exports = sidebars;
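
Because the sidebar is now written by hand rather than autogenerated, every `id` has to match a document under `docs/`. One way to review that is to list all referenced ids. The sketch below assumes a hypothetical helper, `collectDocIds`, that is not part of the repository, and runs it on a short excerpt of the structure above.

```javascript
// Hypothetical helper: walk a sidebar item list and collect every doc id,
// including the doc that a category links to.
function collectDocIds(items) {
  const ids = [];
  for (const item of items) {
    if (item.type === 'doc') {
      ids.push(item.id);
    } else if (item.type === 'category') {
      if (item.link && item.link.type === 'doc') {
        ids.push(item.link.id);
      }
      ids.push(...collectDocIds(item.items || []));
    }
    // other item types (e.g. 'link') reference no local document
  }
  return ids;
}

// Excerpt of the sidebar from the diff above
const excerpt = [
  { type: 'doc', id: 'intro' },
  {
    type: 'category',
    label: 'Deduplication',
    link: { type: 'doc', id: 'data-provision/deduplication/deduplication' },
    items: [
      { type: 'doc', id: 'data-provision/deduplication/research-products' },
      { type: 'doc', id: 'data-provision/deduplication/organizations' },
    ],
  },
  { type: 'link', label: 'Public API', href: 'https://graph.openaire.eu/develop/overview.html' },
];

const docIds = collectDocIds(excerpt);
```

The resulting list could then be compared against the files in `docs/` to spot ids that were mistyped or left behind after a rename.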
--- a/docs/data-provision/assets/aggregation.png
+++ b/docs/data-provision/assets/aggregation.png
--- a/docs/data-provision/assets/architecture.png
+++ b/docs/data-provision/assets/architecture.png
--- a/docs/data-provision/assets/dedup-results.png
+++ b/docs/data-provision/assets/dedup-results.png
Author	SHA1	Message	Date
Paolo Manghi	ceb8a070b5	Update 'docs/publications.md'	2022-10-12 12:21:14 +02:00
Serafeim Chatzopoulos	3410ca7dc0	Add changelog.md; add version in navbar; update readme	2022-09-29 19:00:42 +03:00
Serafeim Chatzopoulos	cb0e7a921a	Static sidebar && add publications	2022-09-23 19:00:46 +03:00
Serafeim Chatzopoulos	2d8c3ad241	Add example to all fields in entities and relationshipds	2022-09-23 17:19:32 +03:00
Serafeim Chatzopoulos	87f3892372	Add examples to result properties	2022-09-22 21:21:18 +03:00