Compare commits

5 Commits

Author SHA1 Message Date
Serafeim Chatzopoulos c3419f9a71 Prepare publication page for full screen 2024-01-17 10:27:31 +02:00
Serafeim Chatzopoulos a69f06b618 Add css for publication page embedding && add json code highlighting 2024-01-15 21:34:58 +02:00
Serafeim Chatzopoulos 503a8db513 Adjust MDX changes 2024-01-15 20:43:53 +02:00
Serafeim Chatzopoulos f7b18a6524 Adjust MDX changes 2024-01-15 20:37:30 +02:00
Serafeim Chatzopoulos 6bce4961d8 Upgrade package.json && docusaurus.config.js 2024-01-15 20:11:26 +02:00
632 changed files with 7980 additions and 44834 deletions

@ -98,9 +98,7 @@ The Advanced Authentication method allows the OpenAIRE AAI server to verify that
To have access to the following functionalities you need to login to OpenAIRE. In case you are not already a member you will need to register first and provide your [Personal information](https://develop.openaire.eu/personal-info).
:::info New!
The registration process has been updated! In order to visit the Personal Token and Registered Services functionalities you need to fill in the Personal Information form available [here](https://develop.openaire.eu/personal-info). This update will not affect the operation of your existing services. However, if you want to register a new service or access/modify an existing one, you will need to provide your personal information first.
:::
New! The registration process has been updated! In order to visit the Personal Token and Registered Services functionalities you need to fill in the Personal Information form available [here](https://develop.openaire.eu/personal-info). This update will not affect the operation of your existing services. However, if you want to register a new service or access/modify an existing one, you will need to provide your personal information first.
For the **Basic Authentication** method the OpenAIRE AAI server generates a pair of _Client ID_ and _Client Secret_ for your service upon its registration. The service uses the client id and client secret to obtain the access token for the OpenAIRE APIs. The OpenAIRE AAI server checks whether the client id and client secret sent are valid.
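
As a rough illustration of the Basic Authentication flow described above, the following Python sketch builds (without sending) the token request that the `curl -u` call corresponds to. The helper name `build_token_request` is hypothetical, and the token URL is an assumption borrowed from the JWT-based example elsewhere in this file; check it against your service registration before relying on it.

```python
import base64
import urllib.parse
import urllib.request

# Assumption: the Basic flow uses the same token endpoint that the
# JWT-based example in this file posts to.
TOKEN_URL = "https://aai.openaire.eu/oidc/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request.

    Hypothetical helper: it mirrors `curl -u CLIENT_ID:CLIENT_SECRET
    -d 'grant_type=client_credentials'` using HTTP Basic authentication.
    """
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Authorization": f"Basic {basic}"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call; the sketch stops before that so it can be inspected without network access.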
@ -126,7 +124,7 @@ curl -u {CLIENT_ID}:{CLIENT_SECRET} \
-d 'grant_type=client_credentials'
```
where **{CLIENT_ID}** and **{CLIENT_SECRET}** are the _Client ID_ and _Client Secret_ assigned to your service upon registration.
where **\{CLIENT_ID\}** and **\{CLIENT_SECRET\}** are the _Client ID_ and _Client Secret_ assigned to your service upon registration.
The response is:
```json
@ -283,9 +281,9 @@ To make an access token request use the _signed JWT_ that you created in **Step
curl -k -X POST "https://aai.openaire.eu/oidc/token" \
-d "grant_type=client_credentials" \
-d "client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer" \
-d "client_assertion={signedJWT}"
-d "client_assertion=\{signedJWT\}"
```
where **{signedJWT}** is the signed JWT created in **Step 1**.
where **\{signedJWT\}** is the signed JWT created in **Step 1**.
The response is:
```json

@ -47,4 +47,4 @@ APIs are free-to-use (no sign-up needed) by any third-party service
**Quality of Service**: all API services are running in production 24/7 within the OpenAIRE infrastructure premises deployed at the [data center](http://icm.edu.pl/en/centre-of-technology/) facilities of the [Interdisciplinary Centre for Mathematical and Computational Modelling](http://icm.edu.pl/en/) (ICM).
**APIs rate limits**: please check [here](./authentication).
**APIs rate limits**: please check [here](./authenticated-requests).

@ -58,4 +58,4 @@ APIs are free-to-use (no sign-up needed) by any third-party service.
**Quality of Service**: all API services are running in production 24/7 within the OpenAIRE infrastructure premises deployed at the [data center](http://icm.edu.pl/en/centre-of-technology/) facilities of the [Interdisciplinary Centre for Mathematical and Computational Modelling](http://icm.edu.pl/en/) (ICM).
**APIs rate limits**: please check [here](./authentication).
**APIs rate limits**: please check [here](./authenticated-requests).

@ -1,9 +1,9 @@
# Public APIs
The OpenAIRE Graph data are accessible through various public APIs. More specifically, the following APIs are currently provided:
* [Search API](./search-api/search-api.md) (an API to search for research products and projects)
* [ScholeXplorer API](https://api.scholexplorer.openaire.eu/swagger-ui/index.html?urls.primaryName=Scholexplorer%20API%20V2.0) (an API offering dataset-publication & dataset-dataset links)
* [DSpace & EPrints API](./dspace-eprints-api.md) (an API to offer custom access to metadata for projects funded by a selection of international funders for DSpace and EPrints platforms)
* [Broker API](./broker-api.md) (an API to enrich metadata for repositories, publishers, and aggregators)
* Search API (an API to search for research results and projects)
* ScholeXplorer API (an API offering dataset-publication & dataset-dataset links)
* DSpace & EPrints API (an API to offer custom access to metadata for projects funded by a selection of international funders for DSpace and EPrints platforms)
* Broker API (an API to enrich metadata for repositories, publishers, and aggregators)
It is also worth mentioning that a LOD API was provided between 2015 and 2023, but the respective service has since been discontinued. Old LOD datasets can be found on Zenodo [here](https://zenodo.org/records/4587369).

@ -1,4 +1,4 @@
# Searching for research products
# Searching for research results
## Endpoints
@ -21,7 +21,7 @@ Endpoint: https://api.openaire.eu/search/researchProducts
| size | integer | Number of results per page. |
| format | json \| xml \| csv \| tsv | The format of the response. The default is xml. |
| model | openaire \| sygma | The data model of the response. Default is openaire. Model sygma is a simplified version of the openaire model. For sygma, only the xml format is available. The relative XML schema is available [here](https://www.openaire.eu/schema/sygma/oaf_sygma_v2.1.xsd). |
| sortBy | `sortBy=field,[ascending\|descending]` <br/>**'field'** can be one of: <ul> <li>`dateofcollection`</li><li>`resultstoragedate`</li> <li>`resultembargoenddate`</li><li>`resultembargoendyear`</li><li>`resultdateofacceptance`</li> <li>`resultacceptanceyear`</li><li>`influence`</li><li>`popularity`</li> <li>`citationCount`</li><li>`impulse`</li> </ul>Multiple sorting is supported by repeating the `sortBy` parameter. | The sorting order of the specified field. |
| sortBy | `sortBy=field,[ascending\|descending]`; **'field'** is one of: `dateofcollection`, `resultstoragedate`, `resultembargoenddate`, `resultembargoendyear`, `resultdateofacceptance`, `resultacceptanceyear`, `influence`, `popularity`, `citationCount`, `impulse` <br/>Multiple sorting is supported by repeating the `sortBy` parameter. | The sorting order of the specified field. |
| hasECFunding | true \| false | If hasECFunding is true gets the entities funded by the EC. If hasECFunding is false gets the entities related to projects not funded by the EC. |
| hasWTFunding | true \| false | If hasWTFunding is true gets the entities funded by Wellcome Trust. The results are the same as those obtained with `funder=wt`. If hasWTFunding is false gets the entities related to projects not funded by Wellcome Trust. |
| funder | WT \| EC \| ARC \| ANDS \| NSF \| FCT \| NHMRC | Search for entities by funder. |

@ -1,7 +1,3 @@
# Search API
The Search API allows developers to access metadata records of the OpenAIRE Graph by performing queries over research products (i.e., publications, data, software, other research products), and projects.
The API is intended for metadata discovery and exploration only, hence it does not provide access to the whole information space: the number of total results returned by one query is limited to 10,000.
For accessing the whole graph, developers are encouraged to use the [OpenAIRE full Graph dataset](../../downloads/full-graph).
The Search API allows developers to access metadata records of the OpenAIRE Graph by performing queries over research results (i.e., publications, data, software, other research products), and projects. The API is intended for metadata discovery and exploration only, hence it does not provide access to the whole information space: the number of total results returned by one query is limited to 10,000. For accessing the whole graph, developers are encouraged to use the [OpenAIRE full Graph dataset](../../downloads/full-graph).
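
A query against the Search API can be sketched as follows. This is a minimal example under stated assumptions, not an official client: the helper names `build_search_url` and `reachable_pages` are hypothetical, while the endpoint, the `size`, `format` and `sortBy` parameters and the 10,000-result cap are taken from the documentation pages touched in this compare.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.openaire.eu/search/researchProducts"
MAX_RESULTS_PER_QUERY = 10_000  # cap on the total results returned by one query

FORMATS = {"json", "xml", "csv", "tsv"}
SORT_FIELDS = {
    "dateofcollection", "resultstoragedate", "resultembargoenddate",
    "resultembargoendyear", "resultdateofacceptance", "resultacceptanceyear",
    "influence", "popularity", "citationCount", "impulse",
}


def build_search_url(size=10, fmt="json", sort_by=()):
    """Return a query URL; `sort_by` is a sequence of (field, order) pairs."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    params = [("size", size), ("format", fmt)]
    for field, order in sort_by:
        if field not in SORT_FIELDS or order not in ("ascending", "descending"):
            raise ValueError(f"unsupported sorting: {field},{order}")
        # Multiple sorting is expressed by repeating the sortBy parameter.
        params.append(("sortBy", f"{field},{order}"))
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def reachable_pages(size):
    """Number of pages of `size` results that fit under the per-query cap."""
    return -(-MAX_RESULTS_PER_QUERY // size)  # ceiling division
```

For example, `build_search_url(size=50, sort_by=[("popularity", "descending")])` yields a URL whose result set spans at most `reachable_pages(50)` pages; anything beyond that requires the full Graph dataset.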

@ -10,19 +10,19 @@
| 2022-09-28T20:35:13.116653Z | updated URLs to the broker swagger UI |
| 2022-07-28T12:02:06.271154Z | Updated list of funders supported by the API for bulk access to projects: EC Horizon Europe also included |
| 2022-05-11T10:01:33.969973Z | New end point for researchProducts in selective access! FOS and SDG classifications available for publication requests |
| 2022-03-29T15:03:29.583536Z | Graph dataset: add new Scholix version 4 |
| 2022-03-29T15:03:29.583536Z | Graph dumps: add new Scholix version 4 |
| 2021-11-12T12:04:52.900385Z | originalId parameter added |
| 2021-10-18T15:31:18.446582Z | OAI-PMH publisher completely dismissed as announced in January 2021 |
| 2021-10-12T07:46:48.032978Z | orcid parameter added in selective access |
| 2021-04-08T10:28:02.371361Z | Authenticated requests to our APIs are now enabled. |
| 2021-02-26T16:28:15.364435Z | NEWS: new dataset available with research products with project funding information |
| 2021-02-26T16:28:15.364435Z | NEWS: new dump available with research products with project funding information |
| 2021-02-17T07:39:46.051129Z | WIP: broker API documentation |
| 2021-02-11T09:06:41.608115Z | Broker API documentation |
| 2021-02-10T10:17:39.504429Z | Authentication documentation added + broker card + broker dummy page |
| 2021-02-01T08:55:35.496938Z | OAI-PMH shutdown announced for the end of April 2021 |
| 2021-01-15T18:56:04.748404Z | Updated documentation on OpenAIRE Research Graph Datasets |
| 2021-01-15T18:56:04.748404Z | Updated documentation on OpenAIRE Research Graph dumps |
| 2021-01-15T16:57:08.569766Z | Announcing the shutdown of the OAI-PMH publisher |
| 2019-01-25T15:36:27.264313Z | Added new parameter country for research products |
| 2019-01-25T15:36:27.264313Z | Added new parameter country for research results |
| 2018-10-17T10:39:56.570815Z | Software and Other research products are available via HTTP API. Documentation has been updated. |
| 2018-04-09T09:20:24.763966Z | Added section on terms of services and SLA in the specific API pages |
| 2018-04-09T08:26:18.897089Z | Added section for terms of use and SLA in the home page |
@ -80,7 +80,7 @@
| 2014-04-30T10:41:14.539090Z | Added and commented property to generate output in chunks |
| 2014-04-30T10:40:30.012256Z | mvn generates output with no chunks in a single file: api-doc.html |
| 2014-04-30T10:39:37.875730Z | Main docbook file renamed from book.xml to api-doc.xml |
| 2014-04-30T10:34:16.576722Z | updated OAI-PMH sets: now delivering only research products and no other entities. |
| 2014-04-30T10:34:16.576722Z | updated OAI-PMH sets: now delivering only results and no other entities. |
| 2014-04-15T09:53:22.158487Z | copied dnet-api-http-doc to new dnet40 codebase |
| 2014-04-10T09:55:41.690052Z | ignore |
| 2014-04-10T09:53:59.192401Z | removed target/*classes from svn |

@ -4,7 +4,7 @@
The OpenAIRE APIs are free-to-use by any third-party service and can be accessed over HTTPS both by authenticated and unauthenticated requests. The rate limit for the former type of requests is up to 7200 requests per hour, while the latter is up to 60 requests per hour.
To make an authenticated request, you must first [register](https://services.openaire.eu/uoa-user-management/register.jsp). Then, you can go to the [personal access token page](https://develop.openaire.eu/user-info?errorCode=1&redirectUrl=%2Fpersonal-token) in your account, copy your token and use it for up to one hour, [find out more](./authentication).
To make an authenticated request, you must first [register](https://services.openaire.eu/uoa-user-management/register.jsp). Then, you can go to the [personal access token page](https://develop.openaire.eu/user-info?errorCode=1&redirectUrl=%2Fpersonal-token) in your account, copy your token and use it for up to one hour, [find out more](./authenticated-requests).
Our OAuth 2.0 implementation conforms to the OpenID Connect specification and is [OpenID Certified](https://openid.net/certification/). OpenID Connect is a simple identity layer on top of the OAuth 2.0 protocol. For more information about OAuth 2.0 please visit the [OAuth2.0 official site](https://oauth.net/2/). For more information about OpenID Connect please visit the [OpenID Connect official site](https://openid.net/connect/). Also, check [here](http://www.openaire.eu/privacy-policy) for more information on our Privacy Policy.
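
To make the rate limits above concrete, here is a small Python sketch. It assumes the personal access token is sent as a standard OAuth 2.0 bearer token in the `Authorization` header, which this page does not spell out; the helper names are hypothetical and the request is only built, not sent.

```python
import urllib.request

AUTHENTICATED_LIMIT_PER_HOUR = 7200
UNAUTHENTICATED_LIMIT_PER_HOUR = 60


def min_seconds_between_requests(limit_per_hour: int) -> float:
    """Average spacing between requests that stays within an hourly limit."""
    return 3600 / limit_per_hour


def build_api_request(url: str, access_token: str = "") -> urllib.request.Request:
    """Build (but do not send) a GET request, authenticated if a token is given."""
    headers = {}
    if access_token:
        # Assumption: bearer-token usage as in plain OAuth 2.0.
        headers["Authorization"] = f"Bearer {access_token}"
    return urllib.request.Request(url, headers=headers)
```

With these limits, an authenticated client can average one request every 0.5 seconds, while an unauthenticated one is limited to one request per minute. Since a token is usable for up to one hour, a long-running client would need to obtain a fresh one periodically.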

Binary file not shown. Before: 666 KiB.

Binary file not shown. Before: 203 KiB; after: 68 KiB.

Binary file not shown. Before: 221 KiB; after: 74 KiB.

Binary file not shown. Before: 118 KiB; after: 32 KiB.

@ -19,87 +19,8 @@ This section documents all notable changes for each graph version.
---
### v7.1.3
_Start Date: 2024-04-10 &bull; Release Date: 2024-04-22 &bull; Dataset release: **no**_
#### Added
- Introduced new Field of Science classifications, reaching a total of ~73Mi publications classified
- General increase of the funded scientific outputs, thanks to the full-text mining scanning new OpenAccess publications, some examples:
- European Commission - EC +7% (from 1.52Mi to 1.62Mi)
- Irish Research Council - IRC +7% (from 12.7K to 13.5K)
- French National Research Agency - ANR +5.8% (from 91.5K to 96.8K)
- National Institute of Health - NIH +5% (from 594K to 626K)
- UK Research and Innovation - UKRI +3.7% (from 434K to 450K)
- General increase of the scientific products with author affiliation information +2% (from 83.12Mi to 84.88Mi)
#### Changed
- Updated Crossref publications to include contents until March 2023
- Updated Datacite contents until March 2024
- Updated ORCID contents until March 2024
### v7.1.2
_Start Date: 2024-03-15 &bull; Release Date: 2024-03-27 &bull; Dataset release: **no**_
#### Added
- General increase of the funded scientific outputs, thanks to the full-text mining scanning new OpenAccess publications
#### Changed
- Updated Crossref publications to include contents until February 2023
- Updated Datacite contents until February 2024
- Updated ORCID contents until February 2024
### v7.1.1
_Start Date: 2024-02-23 &bull; Release Date: 2024-03-06 &bull; Dataset release: **no**_
#### Added
- Updated the content import criteria applied to Datacite, resulting in +13Mi Other Research Products (+167%)
- Introduced project PIDs; DOI currently available for grants funded by FCT and TWCF
#### Changed
- Scientific products typed as "Collection" categorized under "Research Data" instead of "Other Research Product".
- Updated Crossref publications to include contents until January 2023
- Updated Datacite contents until January 2024
### v7.1.0
_Start Date: 2024-01-30 &bull; Release Date: 2024-02-20 &bull; Dataset release: **no**_
#### Added
- The scientific products aggregated increased by ~5Mi records (+1.6%)
#### Changed
- A refined version of the deduplication strategy allowed to catch more duplicates among the scientific products, implying
a decrease of their total number of ~3.2Mi (-1.35%). More details about the deduplication algorithm are available [here](graph-production-workflow/deduplication/research-products).
- Updated Crossref publications to include contents until November 2023
- Updated Datacite contents until December 2023
### v7.0.0
_Start Date: 2023-12-18 &bull; Release Date: 2024-01-06 &bull; Dataset release: **yes**_
#### Added
- the scientific products increased by ~3Mi records (+1.26%)
- the number of relations increased by 28.6Mi (+1%)
- the funded contents increased by 5%, from 3.6Mi to 3.8Mi. Funders that recorded the highest increase include, for example, EC with +120K linked research products, and SFI with +1K products.
#### Changed
This graph release also introduces new fields to identify research products published using specific open access models, in diamond journals, and those that received public funding. These fields will also be added to the graph dataset in Zenodo. In detail:
- `ResearchProduct.isGreen (true, false)`: indicates whether or not the research product was published following the green open access model;
- `ResearchProduct.openAccesColor (bronze, gold, hybrid)`: indicates the specific open access model used for the publication;
- `ResearchProduct.isInDiamondJournal (true, false)`: indicates whether or not the research product was published in a diamond journal;
- `ResearchProduct.publicly-funded (true, false)`: indicates whether or not the grants acknowledged by the publication come from public funds.
### v6.2.2
_Start Date: 2023-11-07 &bull; Release Date: 2023-11-23 &bull; Dataset release: **no**_
_Start Date: 2023-11-07 &bull; Release Date: 2023-11-23 &bull; Dump release: **no**_
#### Added
- Imported Opencitation's POCI dataset, containing citations among publications in PubMed
@ -116,10 +37,10 @@ _Start Date: 2023-11-07 &bull; Release Date: 2023-11-23 &bull; Dataset release:
- Indicators regarding data source downloads and views taken by usage counts from September 2023
### v6.1.1
_Start Date: 2023-09-11 &bull; Release Date: 2023-10-15 &bull; Dataset release: **no**_
_Start Date: 2023-09-11 &bull; Release Date: 2023-10-15 &bull; Dump release: **no**_
#### Added
- Affiliation (research product to organization) relations from Crossref
- Affiliation (result to organization) relations from Crossref
- Links to the full text of research products
- Cleaning for author and publisher names (get rid of tabs, CR characters, \n(s), escape double quotes)
@ -132,12 +53,12 @@ _Start Date: 2023-09-11 &bull; Release Date: 2023-10-15 &bull; Dataset release:
- OpenCitations relations from December 2022
### v6.0.0
_Start Date: 2023-07-26 &bull; Release Date: 2023-08-16 &bull; Dataset release: **yes**_
_Start Date: 2023-07-26 &bull; Release Date: 2023-08-16 &bull; Dump release: **yes**_
#### Changed
- [Relationship data model](./data-model/relationships/relationship-object): flattened properties source, sourceType, target, targetType
- BIP! indicators are now serialised as an array; see the updated model [here](./data-model/entities/other#bipindicators)
- [Relationship data model](/data-model/relationships/relationship-object): flattened properties source, sourceType, target, targetType
- BIP! indicators are now serialised as an array; see the updated model [here](/data-model/entities/other#bipindicators)
- Crossref dump from June 2023
- ORCID works without a DOI from June 2023
- Usage counts from June 2023
@ -148,7 +69,7 @@ _Start Date: 2023-07-26 &bull; Release Date: 2023-08-16 &bull; Dataset release:
### v5.2.0
_Start Date: 2023-07-03 &bull; Release Date: 2023-07-17 &bull; Dataset release: **no**_
_Start Date: 2023-07-03 &bull; Release Date: 2023-07-17 &bull; Dump release: **no**_
#### Added
- Citations imported from Crossref & MAG
@ -167,7 +88,7 @@ _Start Date: 2023-07-03 &bull; Release Date: 2023-07-17 &bull; Dataset release:
- Avoid duplicated organisation PIDs
### v5.1.3
_Start Date: 2023-05-22 &bull; Release Date: 2023-06-12 &bull; Dataset release: **no**_
_Start Date: 2023-05-22 &bull; Release Date: 2023-06-12 &bull; Dump release: **no**_
#### Added
- Datasource and project level usage counts
@ -182,7 +103,7 @@ _Start Date: 2023-05-22 &bull; Release Date: 2023-06-12 &bull; Dataset release:
- Deduplication of the datasource
### v5.1.2
_Start Date: 2023-03-20 &bull; Release Date: 2023-04-04 &bull; Dataset release: **no**_
_Start Date: 2023-03-20 &bull; Release Date: 2023-04-04 &bull; Dump release: **no**_
#### Changed
@ -193,15 +114,15 @@ _Start Date: 2023-03-20 &bull; Release Date: 2023-04-04 &bull; Dataset release:
- OpenCitations relations from January 2023
### v5.1.1
_Start Date: 2023-02-13 &bull; Release Date: 2023-03-01 &bull; Dataset release: **no**_
_Start Date: 2023-02-13 &bull; Release Date: 2023-03-01 &bull; Dump release: **no**_
#### Added
- Revised SDG classification: improved coverage (+600K classified DOIs)
- General increase of the funded scientific outputs, thanks to the full text mining scanning new OpenAccess publications
- Integrated contents from
- [EMBL-EBIs Protein Data Bank in Europe](./graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](./graph-production-workflow/aggregation/non-compatible-sources/uniprot)
- [EMBL-EBIs Protein Data Bank in Europe](/graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](/graph-production-workflow//aggregation/non-compatible-sources/uniprot)
#### Changed
@ -212,7 +133,7 @@ _Start Date: 2023-02-13 &bull; Release Date: 2023-03-01 &bull; Dataset release:
- OpenCitations relations from December 2022
### v5.1.0
_Start Date: 2023-01-16 &bull; Release Date: 2023-01-30 &bull; Dataset release: **no**_
_Start Date: 2023-01-16 &bull; Release Date: 2023-01-30 &bull; Dump release: **no**_
#### Added
@ -229,18 +150,18 @@ _Start Date: 2023-01-16 &bull; Release Date: 2023-01-30 &bull; Dataset release:
### v5.0.0
_Start Date: 2022-12-19 &bull; Release Date: 2022-12-28 &bull; Dataset release: **yes**_
_Start Date: 2022-12-19 &bull; Release Date: 2022-12-28 &bull; Dump release: **yes**_
#### Added
- [Impact & Usage indicators](./data-model/entities/research-product.md#indicators) at the level of the research product
- [Beginner's kit](./downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](./data-model/relationships/relationship-types)
- [Impact & Usage indicators](/data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](/downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](/data-model/relationships/relationship-types)
#### Changed
- FOS and SDGs were removed from the [ResearchProduct.subjects](./data-model/entities/research-product#subjects)
- Measures were removed from the [ResearchProduct.instance](./data-model/entities/research-product#instance)
- FOS and SDGs were removed from the [result subjects](/data-model/entities/result#subjects)
- Measures were removed from the [result instance](/data-model/entities/result#instance)
- Updated DOIBoost to include publications from Crossref and the works from ORCID with a DOI until November 2022
- Added ORCID works without a DOI from November 2022

@ -5,19 +5,18 @@ The OpenAIRE Graph comprises several types of [entities](../category/entities) a
The latest version of the JSON schema can be found on the [Downloads](../downloads/full-graph) section.
<p align="center">
<img loading="lazy" alt="Data model" src={require('../assets/img/data-model-3.png').default} width="80%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
<img loading="lazy" alt="Data model" src={require('../assets/img/data-model-2.png').default} width="80%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
</p>
The figure above presents the graph's data model.
Its main entities are described in brief below:
* [Research products](./entities/research-product) represent the outcomes (or products) of research activities.
* [Data sources](./entities/data-source) are the sources from which the metadata of graph objects are collected.
* [Organizations](./entities/organization) correspond to companies or research institutions involved in projects,
* [Results](/data-model/entities/result) represent the outcomes (or products) of research activities.
* [Data Sources](/data-model/entities/data-source) are the sources from which the metadata of graph objects are collected.
* [Organizations](/data-model/entities/organization) correspond to companies or research institutions involved in projects,
responsible for operating data sources or constituting the affiliations of Product creators.
* [Projects](./entities/project) are research project grants funded by a Funding Stream of a Funder.
* [Communities](./entities/community) are groups of people with a common research intent (e.g. research infrastructures, university alliances).
* Persons correspond to individual researchers who are involved in the design, creation or maintenance of research products. Currently, this is a non-materialized entity type in the Graph, which means that the respective metadata (and relationships) are encapsulated in the author field of the respective research products.
* [Projects](/data-model/entities/project) are research project grants funded by a Funding Stream of a Funder.
* [Communities](/data-model/entities/community) are groups of people with a common research intent (e.g. research infrastructures, university alliances).
:::note Further reading

@ -64,7 +64,7 @@ The datasource type; see the vocabulary [dnet:datasource_typologies](https://api
### openairecompatibility
_Type: String &bull; Cardinality: ONE_
The OpenAIRE compatibility of the ingested research products; it indicates which guidelines they comply with, according to the vocabulary [dnet:datasourceCompatibilityLevel](https://api.openaire.eu/vocabularies/dnet:datasourceCompatibilityLevel).
The OpenAIRE compatibility of the ingested results; it indicates which guidelines they comply with, according to the vocabulary [dnet:datasourceCompatibilityLevel](https://api.openaire.eu/vocabularies/dnet:datasourceCompatibilityLevel).
```json
"openairecompatibility": "collected from a compatible aggregator"

@ -20,7 +20,7 @@ Indicates the OpenAccess status. Values are set according to the [Unpaywall meth
```
## AlternateIdentifier
Type used to represent the information associated to persistent identifiers associated to the research product that have not been forged by an authority for that pid type. For example we collect metadata from an institutional repository that provides as identifier for the research product also the DOI.
Type used to represent the information associated to persistent identifiers associated to the result that have not been forged by an authority for that pid type. For example we collect metadata from an institutional repository that provides as identifier for the result also the doi.
### scheme
_Type: String &bull; Cardinality: ONE_
@ -63,7 +63,7 @@ The quantity of money.
## Author
Represents the research product author.
Represents the result author.
### fullname
_Type: String &bull; Cardinality: ONE_
@ -95,7 +95,7 @@ Author's family name.
### rank
_Type: String &bull; Cardinality: ONE_
Author's order in the list of authors for the given research product.
Author's order in the list of authors for the given result.
```json
"rank": 1
@ -167,7 +167,7 @@ The author's pid value in that scheme.
```
## BestAccessRight
Indicates the most open access rights \*available among the research product instances.
Indicates the most open access rights \*available among the result Instances.
\* where the openness is defined by the ordering of the access right terms in the following.
```
@ -203,17 +203,17 @@ Scheme of reference for access right code. Currently, always set to COAR access
## BipIndicator
The different citation-based impact indicators as computed by [BIP!](https://bip.imsi.athenarc.gr/).
The different impact indicators as computed by [BIP!](https://bip.imsi.athenarc.gr/).
### indicator
_Type: String &bull; Cardinality: ONE_
The name of indicator; it can be either one of:
* `influence`: it reflects the overall/total (citation-based) impact of an article in the research community at large, based on the underlying citation network (diachronically).
* `influence_alt`: it is an alternative to the "Influence" indicator, which also reflects the overall/total (citation-based) impact of an article in the research community at large, based on the underlying citation network (diachronically).
* `popularity`: it reflects the "current" (citation-based) impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
* `popularity_alt`: it is an alternative to the "Popularity" indicator, which also reflects the "current" (citation-based) impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
* `influence`: it reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
* `influence_alt`: it is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
* `popularity`: it reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
* `popularity_alt`: it is an alternative to the "Popularity" indicator, which also reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
* `impulse`: it reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
For more details on how these indicators are calculated, please refer [here](/graph-production-workflow/indicators-ingestion/impact-indicators).
@ -251,7 +251,7 @@ The actual indicator score.
```
## Container
This field has information about the conference or journal where the research product has been presented or published.
This field has information about the conference or journal where the result has been presented or published.
### name
_Type: String &bull; Cardinality: ONE_
@ -533,7 +533,7 @@ The description of the programme.
```
## Instance
An instance is one specific materialization or version of the research product. For example, you can have one research product with three instances due to deduplication:
An instance is one specific materialization or version of the result. For example, you can have one result with three instances as result of deduplication:
* one is the pre-print
* one is the post-print
@ -558,7 +558,7 @@ Maps [dc:rights](https://www.dublincore.org/specifications/dublin-core/dcmi-term
### alternateIdentifier
_Type: [AlternateIdentifier](#alternateidentifier) &bull; Cardinality: MANY_
All the identifiers associated to the research product other than the authoritative ones.
All the identifiers associated to the result other than the authoritative ones.
```json
"alternateIdentifier": [
@ -655,14 +655,14 @@ URLs to the instance. They may link to the actual full-text or to the landing pa
## Indicator
These are indicators computed for a specific OpenAIRE research product.
These are indicators computed for a specific OpenAIRE result.
Each Indicator object is composed of the following properties:
### bipIndicators
_Type: [BipIndicator](#bipindicator) &bull; Cardinality: MANY_
These indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), estimate the citation-based impact of a research product.
These impact-based indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), estimate the impact of a result.
For details about their calculation, please refer [here](/graph-production-workflow/indicators-ingestion/impact-indicators).
@ -710,7 +710,7 @@ Please refer [here](/graph-production-workflow/indicators-ingestion/usage-counts
}
```
## Language
Represents information for the language of the research product.
Represents information for the language of the result.
### code
_Type: String &bull; Cardinality: ONE_
@ -775,13 +775,13 @@ Trust, expressed as a number in the range [0-1].
```
## ResultCountry
This is the country associated to the research product.
This is the country associated to the result.
It is a subclass of [Country](#country) and extends it with provenance information.
### provenance
_Type: [Provenance](#provenance-2) &bull; Cardinality: ONE_
Indicates the reason why this country is associated to this research product.
Indicates the reason why this country is associated to this result.
```json
"provenance": {
@ -791,14 +791,14 @@ Indicates the reason why this country is associated to this research product.
```
## ResultPid
Type used to represent the information associated to persistent identifiers for the research product that have been forged by an authority for that pid type.
Type used to represent the information associated to persistent identifiers for the result that have been forged by an authority for that pid type.
<!-- <span className="todo">Seems to be similar to the AlternateIdentifier. What is the difference?</span> -->
### scheme
_Type: String &bull; Cardinality: ONE_
The scheme of the persistent identifier for the research product (e.g. doi). If the pid is here, it means the information for the pid has been collected from an authority for that pid type (e.g. Crossref/Datacite for doi). The set of authoritative pids is: `doi` when collected from Crossref or Datacite, `pmid` when collected from Europe PubMed Central, `arxiv` when collected from arXiv, `handle` from the repositories.
The scheme of the persistent identifier for the result (e.g. doi). If the pid is here, it means the information for the pid has been collected from an authority for that pid type (e.g. Crossref/Datacite for doi). The set of authoritative pids is: `doi` when collected from Crossref or Datacite, `pmid` when collected from Europe PubMed Central, `arxiv` when collected from arXiv, `handle` from the repositories.
```json
"scheme": "doi"
@ -814,7 +814,7 @@ The value expressed in the scheme (i.e. 10.1000/182).
```
## Subject
Represents keywords associated to the research product.
Represents keywords associated to the result.
### subject
_Type: [SubjectSchemeValue](#subjectschemevalue) &bull; Cardinality: ONE_
@ -863,12 +863,12 @@ The value for the subject in the selected scheme. When the scheme is 'keyword',
## UsageCounts
The usage counts indicator computed for this research product.
The usage counts indicator computed for this result.
### views
_Type: String &bull; Cardinality: ONE_
The number of views for this research product.
The number of views for this result.
```json
"views": "10"
@ -877,7 +877,7 @@ The number of views for this research product.
### downloads
_Type: String &bull; Cardinality: ONE_
The number of downloads for this research product.
The number of downloads for this result.
```json
"downloads": "5"

@ -4,7 +4,7 @@ sidebar_position: 4
# Projects
Of crucial interest to OpenAIRE is also the identification of the funders (e.g. European Commission, Wellcome Trust, FCT Portugal, NWO The Netherlands) that co-funded the projects that have led to a given research product. Projects are characterized by a list of funding streams (e.g. FP7, H2020 for the EC), which identify the strands of funding. Funding streams can be nested to form a tree of sub-funding streams.
Of crucial interest to OpenAIRE is also the identification of the funders (e.g. European Commission, Wellcome Trust, FCT Portugal, NWO The Netherlands) that co-funded the projects that have led to a given result. Projects are characterized by a list of funding streams (e.g. FP7, H2020 for the EC), which identify the strands of funding. Funding streams can be nested to form a tree of sub-funding streams.
---

@ -1,520 +0,0 @@
---
sidebar_position: 1
---
# Research products
Research products are intended as digital objects, described by metadata, resulting from a scientific process.
On this page, we describe the properties of the `ResearchProduct` object.
Moreover, there are the following sub-types of a `ResearchProduct`, that inherit all its properties and further extend it:
* [Publication](#publication)
* [Dataset](#dataset)
* [Software](#software)
* [Other research product](#other-research-product)
---
## The `ResearchProduct` object
### id
_Type: String &bull; Cardinality: ONE_
Main entity identifier, created according to the [OpenAIRE entity identifier and PID mapping policy](../pids-and-identifiers).
```json
"id": "doi_dedup___::80f29c8c8ba18c46c88a285b7e739dc3"
```
### type
_Type: String &bull; Cardinality: ONE_
Type of the research product. Possible types:
* `publication`
* `dataset`
* `software`
* `other`
as declared in the terms from the [dnet:result_typologies vocabulary](https://api.openaire.eu/vocabularies/dnet:result_typologies).
```json
"type": "publication"
```
### originalId
_Type: String &bull; Cardinality: MANY_
Identifiers of the record at the original sources.
```json
"originalId": [
"oai:pubmedcentral.nih.gov:8024784",
"S0048733321000305",
"10.1016/j.respol.2021.104226",
"3136742816"
]
```
### maintitle
_Type: String &bull; Cardinality: ONE_
A name or title by which a research product is known. May be the title of a publication, of a dataset or the name of a piece of software.
```json
"maintitle": "The fall of the innovation empire and its possible rise through open science"
```
### subtitle
_Type: String &bull; Cardinality: ONE_
Explanatory or alternative name by which a research product is known.
```json
"subtitle": "An analysis of cases from 1980 - 2020"
```
### author
_Type: [Author](other#author) &bull; Cardinality: MANY_
The main researchers involved in producing the data, or the authors of the publication.
```json
"author": [
{
"fullname": "E. Richard Gold",
"rank": 1,
"name": "Richard",
"surname": "Gold",
"pid": {
"id": {
"scheme": "orcid",
"value": "0000-0002-3789-9238"
},
"provenance": {
"provenance": "Harvested",
"trust": "0.9"
}
}
},
...
]
```
### bestaccessright
_Type: [BestAccessRight](other#bestaccessright) &bull; Cardinality: ONE_
The most open access right associated to the manifestations of this research product.
```json
"bestaccessright": {
"code": "c_abf2",
"label": "OPEN",
"scheme": "http://vocabularies.coar-repositories.org/documentation/access_rights/"
}
```
### contributor
_Type: String &bull; Cardinality: MANY_
The institution or person responsible for collecting, managing, distributing, or otherwise contributing to the development of the resource.
```json
"contributor": [
"University of Zurich",
"Wright, Aidan G C",
"Hallquist, Michael",
...
]
```
### country
_Type: [ResultCountry](other#resultcountry) &bull; Cardinality: MANY_
Country associated with the research product: it is the country of the organisation that manages the institutional repository or national aggregator or CRIS system from which this record was collected.
Country of affiliations of authors can be found instead in the affiliation relation.
```json
"country": [
{
"code": "CH",
"label": "Switzerland",
"provenance": {
"provenance": "Inferred by OpenAIRE",
"trust": "0.85"
}
},
...
]
```
### coverage
_Type: String &bull; Cardinality: MANY_
### dateofcollection
_Type: String &bull; Cardinality: ONE_
When OpenAIRE collected the record the last time.
```json
"dateofcollection": "2021-06-09T11:37:56.248Z"
```
### description
_Type: String &bull; Cardinality: MANY_
A brief description of the resource and the context in which the resource was created.
```json
"description": [
"Open science partnerships (OSPs) are one mechanism to reverse declining efficiency. OSPs are public-private partnerships that openly share publications, data and materials.",
"There is growing concern that the innovation system's ability to create wealth and attain social benefit is declining in effectiveness. This article explores the reasons for this decline and suggests a structure, the open science partnership, as one mechanism through which to slow down or reverse this decline.",
"The article examines the empirical literature of the last century to document the decline. This literature suggests that the cost of research and innovation is increasing exponentially, that researcher productivity is declining, and, third, that these two phenomena have led to an overall flat or declining level of innovation productivity.",
...
]
```
### embargoenddate
_Type: String &bull; Cardinality: ONE_
Date when the embargo ends and this research product turns Open Access.
```json
"embargoenddate": "2017-01-01"
```
### indicators
_Type: [Indicator](other#indicator-1) &bull; Cardinality: ONE_
The indicators computed for this research product;
currently, the following types of indicators are supported:
* [Citation-based impact indicators by BIP!](other#bipindicators)
* [Usage Statistics indicators](other#usagecounts)
```json
"indicators": {
"bipIndicators": [
{
"indicator": "influence",
"score": "123",
"class": "C2"
},
{
"indicator": "influence_alt",
"score": "456",
"class": "C3"
},
{
"indicator": "popularity",
"score": "234",
"class": "C1"
},
{
"indicator": "popularity_alt",
"score": "345",
"class": "C5"
},
{
"indicator": "impulse",
"score": "987",
"class": "C3"
}
],
"usageCounts": {
"downloads": "10",
"views": "20"
}
}
```
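
As a hedged illustration of reading this field, the sketch below indexes the `bipIndicators` list by indicator name and converts the string-valued usage counts to integers. The helper name `summarize_indicators` is hypothetical and not part of any OpenAIRE library; the input is a shortened copy of the example above.

```python
import json


def summarize_indicators(indicators: dict) -> dict:
    """Hypothetical helper: flatten an `indicators` object for easier lookup."""
    # Index the BIP! indicators by name, keeping score and class as given.
    bip = {
        item["indicator"]: {"score": item["score"], "class": item["class"]}
        for item in indicators.get("bipIndicators", [])
    }
    # Usage counts are serialized as strings in the examples, so convert them.
    usage = {
        name: int(value)
        for name, value in indicators.get("usageCounts", {}).items()
    }
    return {"bip": bip, "usage": usage}


record = json.loads("""
{
  "bipIndicators": [
    {"indicator": "influence", "score": "123", "class": "C2"},
    {"indicator": "popularity", "score": "234", "class": "C1"}
  ],
  "usageCounts": {"downloads": "10", "views": "20"}
}
""")

summary = summarize_indicators(record)
print(summary["bip"]["popularity"]["class"])
print(summary["usage"]["views"])
```

Keeping the BIP! scores as strings avoids guessing their numeric type; only the usage counts, which the examples show as whole numbers, are converted.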
### instance
_Type: [Instance](other#instance) &bull; Cardinality: MANY_
Specific materialization or version of the research product. For example, you can have one research product with three instances: one is the pre-print, one is the post-print, one is the published version.
```json
"instance": [
{
"accessright": {
"code": "c_abf2",
"label": "OPEN",
"openAccessRoute": "gold",
"scheme": "http://vocabularies.coar-repositories.org/documentation/access_rights/"
},
"alternateIdentifier": [
{
"scheme": "doi",
"value": "10.1016/j.respol.2021.104226"
},
...
],
"articleprocessingcharge": {
"amount": "4063.93",
"currency": "EUR"
},
"license": "http://creativecommons.org/licenses/by-nc/4.0",
"pid": [
{
"scheme": "pmc",
"value": "PMC8024784"
},
...
],
"publicationdate": "2021-01-01",
"refereed": "UNKNOWN",
"type": "Article",
"url": [
"http://europepmc.org/articles/PMC8024784"
]
},
...
]
```
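
A minimal sketch of how a consumer might walk the `instance` list, assuming the access right label `OPEN` shown in the example above marks openly accessible copies. The function name `open_access_urls`, the `CLOSED` label and the `example.org` URL are illustrative assumptions only.

```python
def open_access_urls(instances: list) -> list:
    """Hypothetical helper: collect the URLs of instances labelled OPEN."""
    urls = []
    for instance in instances:
        # `accessright` may be missing on some instances, so default to {}.
        accessright = instance.get("accessright") or {}
        if accessright.get("label") == "OPEN":
            urls.extend(instance.get("url", []))
    return urls


instances = [
    {
        "accessright": {"code": "c_abf2", "label": "OPEN"},
        "type": "Article",
        "url": ["http://europepmc.org/articles/PMC8024784"],
    },
    {
        # Assumed label for a non-open copy, used here only for contrast.
        "accessright": {"label": "CLOSED"},
        "type": "Article",
        "url": ["https://example.org/closed-copy"],
    },
]

print(open_access_urls(instances))
```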
### language
_Type: [Language](other#language) &bull; Cardinality: ONE_
The alpha-3/ISO 639-2 code of the language. Values controlled by the [dnet:languages vocabulary](https://api.openaire.eu/vocabularies/dnet:languages).
```json
"language": {
"code": "eng",
"label": "English"
}
```
### lastupdatetimestamp
_Type: Long &bull; Cardinality: ONE_
Timestamp of last update of the record in OpenAIRE.
```json
"lastupdatetimestamp": 1652722279987
```
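
The value above has the shape of a Unix timestamp in milliseconds. Under that assumption (the field description does not state the unit), a sketch of the conversion to a UTC datetime could look as follows; `to_utc_datetime` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def to_utc_datetime(lastupdatetimestamp: int) -> datetime:
    """Interpret the value as milliseconds since the Unix epoch (assumption)."""
    return datetime.fromtimestamp(lastupdatetimestamp / 1000, tz=timezone.utc)


updated = to_utc_datetime(1652722279987)
print(updated.date().isoformat())
```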
### pid
_Type: [ResultPid](other#resultpid) &bull; Cardinality: MANY_
Persistent identifiers of the research product. See also the [OpenAIRE entity identifier and PID mapping policy](../pids-and-identifiers) to learn more.
```json
"pid": [
{
"scheme": "pmc",
"value": "PMC8024784"
},
{
"scheme": "doi",
"value": "10.1016/j.respol.2021.104226"
},
...
]
```
### publicationdate
_Type: String &bull; Cardinality: ONE_
Main date of the research product: typically the publication or issued date. In case of a research product with different versions with different dates, the date of the research product is selected as the most frequent well-formatted date. If not available, then the most recent and complete date among those that are well-formatted. For statistics, the year is extracted and the research product is counted only among the research products of that year. Example: Pre-print date: 2019-02-03, Article date provided by repository: 2020-02, Article date provided by Crossref: 2020, OpenAIRE will set as date 2019-02-03, because it's the most recent among the complete and well-formed dates. If then the repository updates the metadata and sets a complete date (e.g. 2020-02-12), then this will be the new date for the research product because it becomes the most recent most complete date. However, if OpenAIRE then collects the pre-print from another repository with date 2019-02-03, then this will be the “winning date” because it becomes the most frequent well-formatted date.
```json
"publicationdate": "2021-03-18"
```
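
The selection rule described above can be sketched as follows. This is not the OpenAIRE implementation: it assumes that "complete and well-formatted" means a valid `YYYY-MM-DD` string, and that ties in frequency are broken by taking the most recent date. The function name `pick_publication_date` is hypothetical.

```python
import re
from collections import Counter
from datetime import date
from typing import Iterable, Optional

# Assumption: a complete, well-formatted date is exactly YYYY-MM-DD.
COMPLETE_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def pick_publication_date(candidates: Iterable[str]) -> Optional[str]:
    """Most frequent complete date; the most recent one wins a tie."""
    complete = []
    for raw in candidates:
        if not COMPLETE_DATE.fullmatch(raw):
            continue  # partial dates such as "2020-02" or "2020" are skipped
        try:
            complete.append(date.fromisoformat(raw))
        except ValueError:
            continue  # right shape but not a real calendar date
    if not complete:
        return None  # behaviour without any complete date is not covered here
    counts = Counter(complete)
    # Compare by (frequency, date): highest count first, then latest date.
    best = max(counts, key=lambda d: (counts[d], d))
    return best.isoformat()


# The three situations from the description above.
first = pick_publication_date(["2019-02-03", "2020-02", "2020"])
second = pick_publication_date(["2019-02-03", "2020-02-12", "2020"])
third = pick_publication_date(["2019-02-03", "2020-02-12", "2020", "2019-02-03"])
print(first, second, third)
```

The three calls reproduce the scenario in the text: a single complete date, a repository update that adds a more recent complete date, and a second copy of the pre-print date that makes it the most frequent one.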
### publisher
_Type: String &bull; Cardinality: ONE_
The name of the entity that holds, archives, publishes prints, distributes, releases, issues, or produces the resource.
```json
"publisher": "Elsevier, North-Holland Pub. Co"
```
### source
_Type: String &bull; Cardinality: MANY_
A related resource from which the described resource is derived. See definition of Dublin Core field [dc:source](https://www.dublincore.org/specifications/dublin-core/dcmi-terms/elements11/source).
```json
"source": [
"Research Policy",
"Crossref",
...
]
```
### subjects
_Type: [Subject](other#subject) &bull; Cardinality: MANY_
Subject, keyword, classification code, or key phrase describing the resource.
```json
"subjects": [
{
"provenance": {
"provenance": "Harvested",
"trust": "0.9"
},
"subject": {
"scheme": "keyword",
"value": "Open science"
}
},
...
]
```
### isGreen
_Type: Boolean &bull; Cardinality: ONE_
Indicates whether or not the scientific result was published following the green open access model.
### openAccessColor
_Type: String &bull; Cardinality: ONE_
Indicates the specific open access model used for the publication; possible value is one of `bronze, gold, hybrid`.
### isInDiamondJournal
_Type: Boolean &bull; Cardinality: ONE_
Indicates whether or not the publication was published in a diamond journal.
### publiclyFunded
_Type: String &bull; Cardinality: ONE_
Discloses whether the publication acknowledges grants from public sources.
---
## Sub-types
There are the following sub-types of `ResearchProduct`. Each inherits all its fields and extends them with the following.
### Publication
Metadata records about research literature (includes types of publications listed [here](http://api.openaire.eu/vocabularies/dnet:result_typologies/publication)).
#### container
_Type: [Container](other#container) &bull; Cardinality: ONE_
Container has information about the conference or journal where the research product has been presented or published.
```json
"container": {
"edition": "",
"iss": "5",
"issnLinking": "",
"issnOnline": "1873-7625",
"issnPrinted": "0048-7333",
"name": "Research Policy",
"sp": "12",
"ep": "22",
"vol": "50"
}
```
### Dataset
Metadata records about research data (includes the subtypes listed [here](http://api.openaire.eu/vocabularies/dnet:result_typologies/dataset)).
#### size
_Type: String &bull; Cardinality: ONE_
The declared size of the dataset.
```json
"size": "10129818"
```
#### version
_Type: String &bull; Cardinality: ONE_
The version of the dataset.
```json
"version": "v1.3"
```
#### geolocation
_Type: [GeoLocation](other#geolocation) &bull; Cardinality: MANY_
The list of geolocations associated with the dataset.
```json
"geolocation": [
{
"box": "18.569386 54.468973 18.066832 54.83707",
"place": "Tübingen, Baden-Württemberg, Southern Germany",
"point": "7.72486 50.1084"
},
...
]
```
### Software
Metadata records about research software (includes the subtypes listed [here](http://api.openaire.eu/vocabularies/dnet:result_typologies/software)).
#### documentationUrl
_Type: String &bull; Cardinality: MANY_
The URLs to the software documentation.
```json
"documentationUrl": [
"https://github.com/openaire/iis/blob/master/README.markdown",
...
]
```
#### codeRepositoryUrl
_Type: String &bull; Cardinality: ONE_
The URL to the repository with the source code.
```json
"codeRepositoryUrl": "https://github.com/openaire/iis"
```
#### programmingLanguage
_Type: String &bull; Cardinality: ONE_
The programming language.
```json
"programmingLanguage": "Java"
```
### Other research product
Metadata records about research products that cannot be classified as research literature, data or software (includes types of products listed [here](http://api.openaire.eu/vocabularies/dnet:result_typologies/other)).
#### contactperson
_Type: String &bull; Cardinality: MANY_
Information on the person responsible for providing further information regarding the resource.
```json
"contactperson": [
"Noémie Dominguez",
...
]
```
#### contactgroup
_Type: String &bull; Cardinality: MANY_
Information on the group responsible for providing further information regarding the resource.
```json
"contactgroup": [
"Networked Multimedia Information Systems (NeMIS)",
...
]
```
#### tool
_Type: String &bull; Cardinality: MANY_
Information about tools useful for the interpretation and/or re-use of the research product.

@ -2,12 +2,12 @@
sidebar_position: 1
---
# Research products
# Results
Research products are intended as digital objects, described by metadata, resulting from a scientific process.
On this page, we describe the properties of the `ResearchProduct` object.
Results are intended as digital objects, described by metadata, resulting from a scientific process.
On this page, we describe the properties of the `Result` object.
Moreover, there are the following sub-types of a `ResearchProduct`, that inherit all its properties and further extend it:
Moreover, there are the following sub-types of a `Result`, that inherit all its properties and further extend it:
* [Publication](#publication)
* [Dataset](#dataset)
* [Software](#software)
@ -15,7 +15,7 @@ Moreover, there are the following sub-types of a `ResearchProduct`, that inherit
---
## The `ResearchProduct` object
## The `Result` object
### id
_Type: String &bull; Cardinality: ONE_
@ -29,7 +29,7 @@ Main entity identifier, created according to the [OpenAIRE entity identifier and
### type
_Type: String &bull; Cardinality: ONE_
Type of the research product. Possible types:
Type of the result. Possible types:
* `publication`
* `dataset`
@ -59,7 +59,7 @@ Identifiers of the record at the original sources.
### maintitle
_Type: String &bull; Cardinality: ONE_
A name or title by which a research product is known. May be the title of a publication, of a dataset or the name of a piece of software.
A name or title by which a scientific result is known. May be the title of a publication, of a dataset or the name of a piece of software.
```json
"maintitle": "The fall of the innovation empire and its possible rise through open science"
@ -69,7 +69,7 @@ A name or title by which a research product is known. May be the title of a publ
_Type: String &bull; Cardinality: ONE_
Explanatory or alternative name by which a research product is known.
Explanatory or alternative name by which a scientific result is known.
```json
"subtitle": "An analysis of cases from 1980 - 2020"
@ -104,7 +104,7 @@ The main researchers involved in producing the data, or the authors of the publi
### bestaccessright
_Type: [BestAccessRight](other#bestaccessright) &bull; Cardinality: ONE_
The most open access right associated to the manifestations of this research product.
The most open access right associated to the manifestations of this result.
```json
"bestaccessright": {
@ -131,8 +131,8 @@ The institution or person responsible for collecting, managing, distributing, or
### country
_Type: [ResultCountry](other#resultcountry) &bull; Cardinality: MANY_
Country associated with the research product: it is the country of the organisation that manages the institutional repository or national aggregator or CRIS system from which this record was collected.
Country of affiliations of authors can be found instead in the affiliation relation.
Country associated with the result: it is the country of the organisation that manages the institutional repository or national aggregator or CRIS system from which this record was collected.
Country of affiliations of authors can be found instead in the affiliation relation.
```json
"country": [
@ -177,7 +177,7 @@ A brief description of the resource and the context in which the resource was cr
### embargoenddate
_Type: String &bull; Cardinality: ONE_
Date when the embargo ends and this research product turns Open Access.
Date when the embargo ends and this result turns Open Access.
```json
"embargoenddate": "2017-01-01"
@ -186,7 +186,7 @@ Date when the embargo ends and this research product turns Open Access.
### indicators
_Type: [Indicator](other#indicator-1) &bull; Cardinality: ONE_
The indicators computed for this research product;
The indicators computed for this result;
currently, the following types of indicators are supported:
* [Impact indicators by BIP!](other#bipindicators)
@ -231,7 +231,7 @@ currently, the following types of indicators are supported:
### instance
_Type: [Instance](other#instance) &bull; Cardinality: MANY_
Specific materialization or version of the research product. For example, you can have one research product with three instances: one is the pre-print, one is the post-print, one is the published version.
Specific materialization or version of the result. For example, you can have one result with three instances: one is the pre-print, one is the post-print, one is the published version.
```json
"instance": [
@ -296,7 +296,7 @@ Timestamp of last update of the record in OpenAIRE.
### pid
_Type: [ResultPid](other#resultpid) &bull; Cardinality: MANY_
Persistent identifiers of the research product. See also the [OpenAIRE entity identifier and PID mapping policy](../pids-and-identifiers) to learn more.
Persistent identifiers of the result. See also the [OpenAIRE entity identifier and PID mapping policy](../pids-and-identifiers) to learn more.
```json
"pid": [
@ -315,7 +315,7 @@ Persistent identifiers of the research product. See also the [OpenAIRE entity id
### publicationdate
_Type: String &bull; Cardinality: ONE_
Main date of the research product: typically the publication or issued date. In case of a research product with different versions with different dates, the date of the research product is selected as the most frequent well-formatted date. If not available, then the most recent and complete date among those that are well-formatted. For statistics, the year is extracted and the research product is counted only among the research products of that year. Example: Pre-print date: 2019-02-03, Article date provided by repository: 2020-02, Article date provided by Crossref: 2020, OpenAIRE will set as date 2019-02-03, because it's the most recent among the complete and well-formed dates. If then the repository updates the metadata and sets a complete date (e.g. 2020-02-12), then this will be the new date for the research product because it becomes the most recent most complete date. However, if OpenAIRE then collects the pre-print from another repository with date 2019-02-03, then this will be the “winning date” because it becomes the most frequent well-formatted date.
Main date of the research product: typically the publication or issued date. In case of a research result with different versions with different dates, the date of the result is selected as the most frequent well-formatted date. If not available, then the most recent and complete date among those that are well-formatted. For statistics, the year is extracted and the result is counted only among the results of that year. Example: Pre-print date: 2019-02-03, Article date provided by repository: 2020-02, Article date provided by Crossref: 2020, OpenAIRE will set as date 2019-02-03, because it's the most recent among the complete and well-formed dates. If then the repository updates the metadata and sets a complete date (e.g. 2020-02-12), then this will be the new date for the result because it becomes the most recent most complete date. However, if OpenAIRE then collects the pre-print from another repository with date 2019-02-03, then this will be the “winning date” because it becomes the most frequent well-formatted date.
```json
"publicationdate": "2021-03-18"
@ -363,28 +363,6 @@ Subject, keyword, classification code, or key phrase describing the resource.
...
]
```
### isGreen
_Type: Boolean &bull; Cardinality: ONE_
Indicates whether or not the scientific result was published following the green open access model.
### openAccessColor
_Type: String &bull; Cardinality: ONE_
Indicates the specific open access model used for the publication; possible value is one of `bronze, gold, hybrid`.
### isInDiamondJournal
_Type: Boolean &bull; Cardinality: ONE_
Indicates whether or not the publication was published in a diamond journal.
### publiclyFunded
_Type: String &bull; Cardinality: ONE_
Discloses whether the publication acknowledges grants from public sources.
---
## Sub-types
@ -398,7 +376,7 @@ Metadata records about research literature (includes types of publications liste
#### container
_Type: [Container](other#container) &bull; Cardinality: ONE_
Container has information about the conference or journal where the research product has been presented or published.
Container has information about the conference or journal where the result has been presented or published.
```json
"container": {
@ -517,4 +495,3 @@ Information on the group responsible for providing further information regarding
_Type: String &bull; Cardinality: MANY_
Information about tools useful for the interpretation and/or re-use of the research product.

@ -35,10 +35,10 @@ assigns PIDs to their scientific products from a given PID minter.
This "selection" can be performed when the entities in the graph sharing the same identifier are grouped together. The list of the delegated authorities currently includes
| Datasource delegated | Datasource delegating | Pid Type |
|--------------------------------------|----------------------------------|----------|
| [Zenodo](https://zenodo.org) | [Datacite](https://datacite.org) | doi |
| [RoHub](https://reliance.rohub.org/) | [W3ID](https://w3id.org/) | w3id |
| Datasource delegated | Datasource delegating | Pid Type |
|--------------------------------------|----------------------------------|-----------|
| [Zenodo](https://zenodo.org) | [Datacite](https://datacite.org) | doi |
| [RoHub](https://reliance.rohub.org/) | [W3ID](https://w3id.org/) | w3id |
## Identifiers in the Graph
@ -66,16 +66,16 @@ When the record is collected from a source which is not authoritative for any ty
Currently, the following data sources are used as "PID authorities":
| PID Type | Prefix (12 chars) | Authority |
|----------|-----------------------|-----------------------------------------|
| doi | `doi_________` | Crossref, Datacite, Zenodo |
| pmc | `pmc_________` | Europe PubMed Central, PubMed Central |
| pmid | `pmid________` | Europe PubMed Central, PubMed Central |
| arXiv | `arXiv_______` | arXiv.org e-Print Archive |
| handle | `handle______` | any repository |
| ena | `ena_________` | EMBL-EBI |
| pdb | `pdb_________` | EMBL-EBI |
| uniprot | `uniprot_____` | EMBL-EBI |
| PID Type | Prefix (12 chars) | Authority |
|-----------|------------------------|-------------------------------------------|
| doi | `doi_________` | Crossref, Datacite, Zenodo |
| pmc | `pmc_________` | Europe PubMed Central, PubMed Central |
| pmid | `pmid________` | Europe PubMed Central, PubMed Central |
| arXiv | `arXiv_______` | arXiv.org e-Print Archive |
| handle | `handle______` | any repository |
| ena | `ena_________` | EMBL-EBI |
| pdb | `pdb_________` | EMBL-EBI |
| uniprot | `uniprot_____` | EMBL-EBI |
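
To make the prefix convention concrete, here is a small sketch. It assumes, based only on the "Prefix (12 chars)" column above, that a prefix is the PID type right-padded with underscores to 12 characters, and that `::` separates the prefix from the rest of the identifier, as in the example `doi_dedup___::80f29c8c8ba18c46c88a285b7e739dc3` shown for the `id` field. The function names are hypothetical.

```python
PREFIX_LENGTH = 12  # taken from the "Prefix (12 chars)" column


def pid_prefix(pid_type: str) -> str:
    """Pad a PID type with underscores up to the prefix length (assumption)."""
    return pid_type.ljust(PREFIX_LENGTH, "_")


def split_identifier(identifier: str) -> tuple:
    """Split an identifier into its prefix and the part after '::'."""
    prefix, _, local_part = identifier.partition("::")
    return prefix, local_part


print(pid_prefix("doi"))
print(split_identifier("doi_dedup___::80f29c8c8ba18c46c88a285b7e739dc3"))
```
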
OpenAIRE also performs duplicate identification (see the [dedicated section for details](/graph-production-workflow/deduplication)).
All duplicates are **merged** together in a **representative record** which must be assigned a [dedicated OpenAIRE identifier](/graph-production-workflow/deduplication/research-products#openaire-identifier-of-the-representative-record) (i.e. it cannot have the identifier of one of the aggregated records).
All duplicates are **merged** together in a **representative record** which must be assigned a dedicated OpenAIRE identifier (i.e. it cannot have the identifier of one of the aggregated records).

@ -1,36 +1,36 @@
# Relationship types
The following table lists all the possible relation semantics found in the Graph Dataset.
The following table lists all the possible relation semantics found in the graph dump.
Note: the labels used to specify the semantics of the relationships are (for the most part) inherited from the [DataCite metadata kernel](https://schema.datacite.org/meta/kernel-4.4/doc/DataCite-MetadataKernel_v4.4.pdf), which provides a description for them.
| # | Source entity type | Target entity type | Relation name / inverse | Provenance |
|:--:|:--------------------------------------:|:--------------------------------------:|:----------------------------------------------------------:|:-----------------------------------------------:|
| 1 | [Project](/data-model/entities/project) | [ResearchProduct](../../data-model/entities/research-product) | produces / isProducedBy | Harvested, Inferred by OpenAIRE, Linked by user |
| 1 | [Project](/data-model/entities/project) | [Result](/data-model/entities/result) | produces / isProducedBy | Harvested, Inferred by OpenAIRE, Linked by user |
| 2 | [Project](/data-model/entities/project) | [Organization](/data-model/entities/organization) | hasParticipant / isParticipant | Harvested |
| 3 | [Project](/data-model/entities/project) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 4 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsAmongTopNSimilarDocuments / HasAmongTopNSimilarDocuments | Inferred by OpenAIRE |
| 5 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsSupplementTo / IsSupplementedBy | Harvested |
| 6 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 7 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsPartOf / HasPart | Harvested |
| 8 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsDocumentedBy / Documents | Harvested |
| 9 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsObsoletedBy / Obsoletes | Harvested |
| 10 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsSourceOf / IsDerivedFrom | Harvested |
| 11 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsCompiledBy / Compiles | Harvested |
| 12 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsRequiredBy / Requires | Harvested |
| 13 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsCitedBy / Cites | Harvested, Inferred by OpenAIRE |
| 14 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsReferencedBy / References | Harvested |
| 15 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsReviewedBy / Reviews | Harvested |
| 16 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsOriginalFormOf / IsVariantFormOf | Harvested |
| 17 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsVersionOf / HasVersion | Harvested |
| 18 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsIdenticalTo / IsIdenticalTo | Harvested |
| 19 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsPreviousVersionOf / IsNewVersionOf | Harvested |
| 20 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsContinuedBy / Continues | Harvested |
| 21 | [ResearchProduct](../../data-model/entities/research-product) | [ResearchProduct](../../data-model/entities/research-product) | IsDescribedBy / Describes | Harvested |
| 22 | [ResearchProduct](../../data-model/entities/research-product) | [Organization](/data-model/entities/organization) | hasAuthorInstitution / isAuthorInstitutionOf | Harvested, Inferred by OpenAIRE |
| 23 | [ResearchProduct](../../data-model/entities/research-product) | [Data source](/data-model/entities/data-source) | isHostedBy / hosts | Harvested, Inferred by OpenAIRE |
| 24 | [ResearchProduct](../../data-model/entities/research-product) | [Data source](/data-model/entities/data-source) | isProvidedBy / provides | Harvested |
| 25 | [ResearchProduct](../../data-model/entities/research-product) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 4 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsAmongTopNSimilarDocuments / HasAmongTopNSimilarDocuments | Inferred by OpenAIRE |
| 5 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsSupplementTo / IsSupplementedBy | Harvested |
| 6 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 7 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsPartOf / HasPart | Harvested |
| 8 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsDocumentedBy / Documents | Harvested |
| 9 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsObsoletedBy / Obsoletes | Harvested |
| 10 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsSourceOf / IsDerivedFrom | Harvested |
| 11 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsCompiledBy / Compiles | Harvested |
| 12 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsRequiredBy / Requires | Harvested |
| 13 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsCitedBy / Cites | Harvested, Inferred by OpenAIRE |
| 14 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsReferencedBy / References | Harvested |
| 15 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsReviewedBy / Reviews | Harvested |
| 16 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsOriginalFormOf / IsVariantFormOf | Harvested |
| 17 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsVersionOf / HasVersion | Harvested |
| 18 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsIdenticalTo / IsIdenticalTo | Harvested |
| 19 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsPreviousVersionOf / IsNewVersionOf | Harvested |
| 20 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsContinuedBy / Continues | Harvested |
| 21 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsDescribedBy / Describes | Harvested |
| 22 | [Result](/data-model/entities/result) | [Organization](/data-model/entities/organization) | hasAuthorInstitution / isAuthorInstitutionOf | Harvested, Inferred by OpenAIRE |
| 23 | [Result](/data-model/entities/result) | [Data source](/data-model/entities/data-source) | isHostedBy / hosts | Harvested, Inferred by OpenAIRE |
| 24 | [Result](/data-model/entities/result) | [Data source](/data-model/entities/data-source) | isProvidedBy / provides | Harvested |
| 25 | [Result](/data-model/entities/result) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 26 | [Organization](/data-model/entities/organization) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 27 | [Organization](/data-model/entities/organization) | [Organization](/data-model/entities/organization) | IsChildOf / IsParentOf | Linked by user |
| 28 | [Data source](/data-model/entities/data-source) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |

View File

@ -6,7 +6,7 @@ sidebar_position: 1
# CommunityInstance
It is a subclass of [Instance](../../data-model/entities/research-product#instance) extended with information regarding the collection and hosting source for this materialization of the research product.
It is a subclass of [Instance](../../data-model/entities/result#instance) extended with information regarding the collection and hosting source for this materialization of the result.
### hostedby
_Type: [CfHbKeyValue](./cfhb) &bull; Cardinality: ONE_

View File

@ -6,7 +6,7 @@ sidebar_position: 1
# Context
Information related to research initiative/community (RI/RC) related to the research product.
Information related to research initiative/community (RI/RC) related to the result.
### code
_Type: String &bull; Cardinality: ONE_
@ -31,7 +31,7 @@ Label of the RI/RC.
### provenance
_Type: [Provenance](/data-model/entities/other#provenance-2) &bull; Cardinality: MANY_
Why this research product is associated to the RI/RC.
Why this result is associated to the RI/RC.
```json

View File

@ -1,140 +0,0 @@
---
sidebar_position: 1
---
# Extended Research Product
It is a subclass of [ResearchProduct](../../data-model/entities/research-product) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.
### projects
_Type: [Project](project.md) &bull; Cardinality: MANY_
List of projects (i.e. grants) that (co-)funded the production of the research products.
```json
"projects": [
{
"id": "corda__h2020::94c4a066401e22002c4811a301bb4655",
"code": "727929",
"acronym": "TomRes",
"title": "A NOVEL AND INTEGRATED APPROACH TO INCREASE MULTIPLE AND COMBINED STRESS TOLERANCE IN PLANTS USING TOMATO AS A MODEL",
"funder": {
"shortName": "EC",
"name": "European Commission",
"jurisdiction": "EU",
"fundingStream": "H2020"
},
"provenance": {
"provenance": "Harvested",
"trust": "0.900000000000000022"
},
"validated": {
"validationDate": "2021-0101",
"validatedByFunder": true
}
},
...
]
```
### context
_Type: [Context](./context) &bull; Cardinality: MANY_
Reference to the relevant research infrastructures, initiatives or communities (RI/RC) among those collaborating with OpenAIRE. The publicly visible ones are listed at https://connect.openaire.eu.
```json
"context":[
{
"code":"sdsn-gr",
"label":"SDSN - Greece",
"provenance":[
{
"provenance":"Inferred by OpenAIRE",
"trust":"0.9"
}
]
},
...
]
```
### collectedfrom
_Type: [CfHbKeyValue](./cfhb) &bull; Cardinality: MANY_
Information about the sources from which the record has been collected.
```json
"collectedfrom":[
{
"key":"openaire____::081b82f96300b6a6e3d282bad31cb6e2",
"value":"Crossref"
},
...
]
```
### instance
_Type: [CommunityInstance](./communityInstance) &bull; Cardinality: MANY_
Information about the source from which the instance can be viewed or downloaded.
```json
"instance": [
{
"license": "http://doi.wiley.com/10.1002/tdm_license_1.1",
"accessright": {
"code": "c_16ec",
"label": "RESTRICTED",
"scheme": "http://vocabularies.coar-repositories.org/documentation/access_rights/",
"openAccessRoute": null
},
"type": "Article",
"url": [
"https://api.wiley.com/onlinelibrary/tdm/v1/articles/10.1111%2Fnph.15014",
"http://onlinelibrary.wiley.com/wol1/doi/10.1111/nph.15014/fullpdf",
"http://dx.doi.org/10.1111/nph.15014"
],
"publicationdate": "2018-02-09",
"refereed": "UNKNOWN",
"hostedby": {
"key": "issn___print::35ee75a5ad42581d604be113a8f56427",
"value": "New Phytologist"
},
"collectedfrom": {
"key": "openaire____::081b82f96300b6a6e3d282bad31cb6e2",
"value": "Crossref"
}
},
...
]
```
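The fields documented above can be read together from a single record. The following is a minimal sketch, assuming a record is available as one JSON object; the `summarise` helper and the trimmed `SAMPLE` record are illustrative only (the values are copied from the field examples above, and real records carry many more fields).

```python
import json

# Trimmed record assembled from the field examples above (illustrative only).
SAMPLE = json.loads("""
{
  "projects": [
    {"code": "727929", "acronym": "TomRes",
     "funder": {"shortName": "EC", "fundingStream": "H2020"}}
  ],
  "context": [{"code": "sdsn-gr", "label": "SDSN - Greece"}],
  "collectedfrom": [
    {"key": "openaire____::081b82f96300b6a6e3d282bad31cb6e2", "value": "Crossref"}
  ],
  "instance": [
    {"type": "Article",
     "url": ["http://dx.doi.org/10.1111/nph.15014"],
     "hostedby": {"key": "issn___print::35ee75a5ad42581d604be113a8f56427",
                  "value": "New Phytologist"}}
  ]
}
""")


def summarise(record):
    """Collect a few human-readable facts from one record (hypothetical helper)."""
    return {
        # funder short name and grant code of every linked project
        "grants": [f'{p["funder"]["shortName"]}/{p["code"]}'
                   for p in record.get("projects", [])],
        # labels of the associated RI/RC
        "communities": [c["label"] for c in record.get("context", [])],
        # names of the sources the record was collected from
        "collected_from": [s["value"] for s in record.get("collectedfrom", [])],
        # names of the sources hosting each instance
        "hosted_by": [i["hostedby"]["value"]
                      for i in record.get("instance", []) if "hostedby" in i],
    }


summary = summarise(SAMPLE)
print(summary)
```

All four fields have cardinality MANY, so the helper treats each of them as a list and falls back to an empty list when a field is absent.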

View File

@ -5,10 +5,11 @@ sidebar_position: 1
---
# Extended Research Product
# Extended Result
It is a subclass of [ResearchProduct](../../data-model/entities/research-product) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.
It is a subclass of [Result](/data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.
### projects
@ -16,7 +17,7 @@ It is a subclass of [ResearchProduct](../../data-model/entities/research-product
_Type: [Project](project.md) &bull; Cardinality: MANY_
List of projects (i.e. grants) that (co-)funded the production of the research products.
List of projects (i.e. grants) that (co-)funded the production of the research results.
```json

View File

@ -9,7 +9,7 @@ sidebar_position: 1
# Project
The information about the projects related to a research product.
The information about the projects related to the result.
### id
@ -99,7 +99,7 @@ Information about the funder funding the project.
_Type [Provenance](../../data-model/entities/other#provenance-2) &bull; Cardinality: ONE_
The reason why the project is associated to the research product.
The reason why the project is associated to the result.
```json
@ -119,7 +119,7 @@ The reason why the project is associated to the research product.
_Type [Validated](validated.md) &bull; Cardinality: ONE_
Specifies whether the association between the project and the research product was validated.
Specifies whether the association between the project and the result was validated.
```json

View File

@ -7,7 +7,7 @@ sidebar_position: 1
# Validated
Information about the validation of the association between the research product and the funding information.
Information about the validation of the association between the result and the funding information.
### validationDate
@ -15,7 +15,7 @@ Information about the validtion of the association between the research product
_Type: String &bull; Cardinality: ONE_
When OpenAIRE collected the association between the funding and the research product from an authoritative source (i.e. Sygma).
When OpenAIRE collected the association between the funding and the result from an authoritative source (i.e. Sygma).
```json

View File

@ -4,13 +4,13 @@ sidebar_position: 2
# Beginner's kit
:::caution
This version is not accompanied by public dump files, hence the files in this section are based on [v6.0.0](/docs/6.0.0/) of the Graph. The current data are only exposed via the [OpenAIRE Graph API](https://graph.openaire.eu/develop/) and added-value services that are built on top of this version of the Graph (e.g., [OpenAIRE Explore](https://explore.openaire.eu/)). If you are interested in getting bulk access to our latest data, please contact us via our [helpdesk](https://graph.openaire.eu/support).
:::
The large size of the OpenAIRE Graph is a major impediment for beginners who want to familiarise themselves with the underlying data model and explore its contents.
Working with the Graph in its full size typically requires access to a huge distributed computing infrastructure, which is not easily accessible to everyone.
[The OpenAIRE Beginners Kit](https://doi.org/10.5281/zenodo.7490191) aims to address this issue. It consists of two components:
<!-- :::caution
This version is not accompanied by public dataset files, hence the files in this section are based on [v6.0.0](/docs/6.0.0/) of the Graph. The current data are only exposed via the [OpenAIRE Graph API](https://graph.openaire.eu/develop/) and added-value services that are built on top of this version of the Graph (e.g., [OpenAIRE Explore](https://explore.openaire.eu/)). If you are interested in getting bulk access to our latest data, please contact us via our [helpdesk](https://graph.openaire.eu/support).
::: -->
* A subset of the Graph composed of the research products published between 2022-06-29 and 2022-12-29, all the entities connected to them and the respective relationships.
* A Zeppelin notebook that demonstrates how you can use PySpark to analyse the Graph and get answers to some interesting research questions. A guide to Apache Zeppelin can be found [here](https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.5/bk_zeppelin-component-guide/content/ch_overview.html).

View File

@ -4,10 +4,12 @@ sidebar_position: 1
# Full graph dataset
You can download the full OpenAIRE Graph Dataset as well as its schema from the following links:
<!-- :::caution
:::caution
This version is not accompanied by public dump files, hence the files in this section are based on [v6.0.0](/docs/6.0.0/) of the Graph. The current data are only exposed via the [OpenAIRE Graph API](https://graph.openaire.eu/develop/) and added-value services that are built on top of this version of the Graph (e.g., [OpenAIRE Explore](https://explore.openaire.eu/)). If you are interested in getting bulk access to our latest data, please contact us via our [helpdesk](https://graph.openaire.eu/support).
::: -->
:::
You can download the full OpenAIRE Graph Dataset as well as its schema from the following links:
Dataset: https://doi.org/10.5281/zenodo.3516917
Schema: https://doi.org/10.5281/zenodo.4238938

View File

@ -19,16 +19,12 @@ The dataset contains the GZ-compressed dataset of the Scholix links exposed by t
## The OpenAIRE LOD dataset
:::caution
The OpenAIRE LOD dataset has been discontinued. The SPARQL Endpoint is no longer supported but old LOD datasets can be found in the link below.
:::
Dataset (RDF): https://doi.org/10.5281/zenodo.609943
<!-- LOD Ontology: http://lod.openaire.eu/vocab
LOD Ontology: http://lod.openaire.eu/vocab
SPARQL Endpoint: http://lod.openaire.eu/sparql -->
SPARQL Endpoint: http://lod.openaire.eu/sparql
The OpenAIRE Linked Open Data (LOD) Services and their integration with the OpenAIRE information space have been released as a beta version. The LOD exporting process started with a specification of the OpenAIRE data model as an RDF vocabulary, followed by a mapping of the OpenAIRE data to the graph-based RDF data model. To interlink the OpenAIRE data with related data on the Web, we have identified a list of potential datasets to interlink with, including the DBpedia dataset extracted from Wikipedia and the publication databases DBLP and CiteSeer.
<!-- Please refer [here](http://lod.openaire.eu/documentation) for more details on the LOD documentation. -->
Please refer [here](http://lod.openaire.eu/documentation) for more details on the LOD documentation.

View File

@ -4,11 +4,13 @@ sidebar_position: 3
# Sub-graph datasets
:::caution
This version is not accompanied by public dump files, hence the files in this section are based on [v6.0.0](/docs/6.0.0/) of the Graph. The current data are only exposed via the [OpenAIRE Graph API](https://graph.openaire.eu/develop/) and added-value services that are built on top of this version of the Graph (e.g., [OpenAIRE Explore](https://explore.openaire.eu/)). If you are interested in getting bulk access to our latest data, please contact us via our [helpdesk](https://graph.openaire.eu/support).
:::
To help users, different datasets are available under the Zenodo community called [OpenAIRE Graph](https://zenodo.org/communities/openaire-research-graph).
This page lists all alternative datasets currently available.
<!-- :::caution
This version is not accompanied by public dataset files, hence the files in this section are based on [v6.0.0](/docs/6.0.0/) of the Graph. The current data are only exposed via the [OpenAIRE Graph API](https://graph.openaire.eu/develop/) and added-value services that are built on top of this version of the Graph (e.g., [OpenAIRE Explore](https://explore.openaire.eu/)). If you are interested in getting bulk access to our latest data, please contact us via our [helpdesk](https://graph.openaire.eu/support).
::: -->
## The OpenAIRE COVID-19 dataset
@ -61,10 +63,10 @@ Please refer [here](#alternative-sub-graph-data-model) for details on the data m
It should be noted that the datasets for research communities, infrastructures, and products related to projects do not strictly follow the main data model of the OpenAIRE Graph. In particular, they differ in the following:
* only research products are included (no relations or other entities)
* the research products are extended with information that can be inferred from the whole dataset, namely:
* only research products are included (no relations and no entities other than results)
* the results are extended with information that can be inferred from the whole dataset, namely:
* funding information if present
* associated research community/infrastructure
* associated data sources
So they have just one entity type, that is the [Extended Research Product](./alternative-model/extended-research-product.md).
So they have just one entity type, that is the [Extended Result](alternative-model/extendedresult.md).

View File

@ -38,7 +38,7 @@ Objects and relationships in the OpenAIRE Graph are extracted from information p
- *Hybrid repositories/archives*: information systems where scientists deposit metadata and files of any kind of scientific products, including scientific literature, research data and research software (e.g. Zenodo)
- *Aggregator services*: Information systems that collect descriptive metadata about publications or datasets from multiple sources in order to enable cross-data source discovery of given research products. Examples are DataCite, BASE, DOAJ;
- *Entity Registries*: Information systems created with the intent of maintaining authoritative registries of given entities in the scholarly communication, such as OpenDOAR for the institutional repositories, re3data for the data repositories, CORDA and other funder databases for projects and funding information;
- *CRIS*: Information systems adopted by research and academic organizations to keep track of their research administration records and related research products; examples of CRIS content are articles or datasets funded by projects, their principal investigators, facilities acquired thanks to funding, etc.
- *CRIS*: Information systems adopted by research and academic organizations to keep track of their research administration records and related results; examples of CRIS content are articles or datasets funded by projects, their principal investigators, facilities acquired thanks to funding, etc.
- *Research Graphs*: services that maintain an information space of (possibly interlinked) scholarly communication objects. Examples are CrossRef, ScholeXplorer and OpenAIRE itself.
## How does OpenAIRE collect metadata records?

View File

@ -35,7 +35,7 @@ The metadata collection process identifies the most recent record date available
The table below describes the mapping from the XML baseline records to the OpenAIRE Graph dump format.
| OpenAIRE Research Product field path | Datacite record JSON path | # Notes |
| OpenAIRE Result field path | Datacite record JSON path | # Notes |
|--------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `id` | `\attributes\doi` | id in the form `doi_________::md5(doi)` |
| <ul><li>`instance`</li> <li>`instance.type`</li></ul> | <ul><li>`\attributes\types\resourceType`</li> <li> `\attributes\types\resourceTypeGeneral` </li> <li>`attributes\types\schemaOrg`</li></ul> | Use the vocabulary **_dnet:publication_resource_** to find a synonym to one of these terms and get the `instance.type`. |
@ -69,9 +69,9 @@ The table below describes the mapping from the XML baseline records to the OpenA
| OpenAIRE Relation Semantic and inverse | Datacite record JSON path | Source/Target type | #Notes |
|----------------------------------------|---------------------------------------|---------------------|------------------------------------------------------------------------------------------------------------|
| `isProducedBy/produces` | `attributes\fundingReferences` | `ResearchProduct/Project` | only when the fundingReferences matches the pattern `(info:eu-repo/grantagreement/ec/h2020/)(\d{6})(.*)` |
| `IsProvidedBy/provides` | | `ResearchProduct/Datasource` | Datasource is always set to `Datacite` |
| `isHostedBy/host` | `\attributes\relationships\client\id` | `ResearchProduct/Datasource` | we defined a curated clientId/Datasource map; if a match is found, we create a _hostedBy Relation_ |
| `isRelatedTo` | `\attribute\relatedIdentifiers` | `ResearchProduct/ResearchProduct` | we create relationships whenever the pid of the target is resolved on the Research Graph |
| `isProducedBy/produces` | `attributes\fundingReferences` | `result/project` | only when the fundingReferences matches the pattern `(info:eu-repo/grantagreement/ec/h2020/)(\d{6})(.*)` |
| `IsProvidedBy/provides` | | `result/datasource` | Datasource is always set to `Datacite` |
| `isHostedBy/host` | `\attributes\relationships\client\id` | `result/datasource` | we defined a curated clientId/Datasource map; if a match is found, we create a _hostedBy Relation_ |
| `isRelatedTo` | `\attribute\relatedIdentifiers` | `result/result` | we create relationships whenever the pid of the target is resolved on the Research Graph |
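
The funding-reference condition in the first row of the table can be illustrated with a short sketch. The regular expression is quoted from the table; the `h2020_grant_code` helper is hypothetical, and anchoring the match at the start of the string is an assumption, since the table only says the reference "matches the pattern".

```python
import re

# Pattern quoted from the table above; it is case-sensitive as written.
FUNDING_REF = re.compile(r"(info:eu-repo/grantagreement/ec/h2020/)(\d{6})(.*)")


def h2020_grant_code(funding_reference):
    """Return the six-digit grant code, or None when the pattern does not match."""
    match = FUNDING_REF.match(funding_reference)
    return match.group(2) if match else None


print(h2020_grant_code("info:eu-repo/grantagreement/ec/h2020/727929/EU"))
print(h2020_grant_code("info:eu-repo/grantagreement/ec/fp7/123456"))
```

Only references for which the helper returns a code would give rise to an `isProducedBy/produces` relation. Whether the production mapping normalises the case of the reference before matching is not stated here.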

View File

@ -10,7 +10,7 @@ Each Crossref record is enriched with:
* the following information from MAG:
* abstracts
* MAG identifiers of authors
* affiliation (research product - organization) relationships
* affiliation (result - organization) relationships
* subjects (MAG FieldsOfStudy)
* conference or journal information
@ -66,13 +66,13 @@ Records in Crossref are ruled out according to the following criteria
* `10.7554/elife.21052.049`
* `10.1371/journal.pcbi.1005379.s006`
Records with `type=dataset` are mapped into OpenAIRE research products of type dataset. All others are mapped as OpenAIRE research products of type publication.
Records with `type=dataset` are mapped into OpenAIRE results of type dataset. All others are mapped as OpenAIRE results of type publication.
### Mapping Crossref properties into the OpenAIRE Graph
Properties in OpenAIRE research products are set based on the logic described in the following table:
Properties in OpenAIRE results are set based on the logic described in the following table:
| OpenAIRE Research Product field path | Crossref path(s) | Notes |
| OpenAIRE Result field path | Crossref path(s) | Notes |
|----------------------------------------|--------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `id` | `doi` | id in the form `doi_________::md5(doi)` |
| `dateofcollection` | `indexed.datetime` | |
@ -119,7 +119,7 @@ Properties in OpenAIRE research products are set based on the logic described in
| `instance.pid.value` | `doi` | The doi is normalised and lower-cased |
| `instance.publicationdate` | `issued.datetime` or, if not available, `created.datetime` | |
| `instance.refereed` | | set to `peerReviewed` only if `relation.has-review.id` is not empty, `UNKNOWN` otherwise. |
| `instance.type` | `subtype` | mapped using the [OpenAIRE vocabulary for research products typologies](https://api.openaire.eu/vocabularies/dnet:result_typologies) |
| `instance.type` | `subtype` | mapped using the [OpenAIRE vocabulary for result typologies](https://api.openaire.eu/vocabularies/dnet:result_typologies) |
| `instance.url` | `doi` | Full URL of the DOI |
All other fields of the Json schema not mentioned in the table contain empty values.
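
Three of the rules in the table can be sketched in a few lines. This is a minimal illustration, not the actual mapping code: the helper names are hypothetical, the input is assumed to be a plain dictionary laid out along the Crossref paths named in the table, and hashing the lower-cased DOI for the identifier is an assumption (the table states lower-casing only for `instance.pid.value`).

```python
import hashlib

ID_PREFIX = "doi_________::"  # prefix copied from the `id` row of the table


def openaire_id(doi):
    """Build an identifier of the form doi_________::md5(doi)."""
    normalised = doi.strip().lower()  # assumption: hash the lower-cased DOI
    return ID_PREFIX + hashlib.md5(normalised.encode("utf-8")).hexdigest()


def publication_date(record):
    """Use issued.datetime or, if not available, created.datetime."""
    issued = record.get("issued", {}).get("datetime")
    return issued or record.get("created", {}).get("datetime")


def refereed(record):
    """peerReviewed only if relation.has-review.id is not empty, UNKNOWN otherwise."""
    reviews = record.get("relation", {}).get("has-review", [])
    return "peerReviewed" if any(r.get("id") for r in reviews) else "UNKNOWN"


# Hypothetical input record used only to exercise the helpers.
record = {"doi": "10.1111/NPH.15014", "created": {"datetime": "2018-02-09"}}
print(openaire_id(record["doi"]), publication_date(record), refereed(record))
```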
@ -133,7 +133,7 @@ Possible improvements:
### Map Crossref links to projects/funders
Links to funding available in Crossref are mapped as funding relationships (`ResearchProduct -- isProducedBy -- Project`) applying the following mapping:
Links to funding available in Crossref are mapped as funding relationships (`result -- isProducedBy -- project`) applying the following mapping:
| Funder | Grant code | Link to |
|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------|
@ -166,9 +166,9 @@ The fields we consider from UnpayWall are:
* `best_oa_location`
* `oa_status`
The records of Crossref that intersect by DOI with UnpayWall records are enriched with one additional `instance` with the following properties:
The results of Crossref that intersect by DOI with UnpayWall records are enriched with one additional `instance` with the following properties:
| OpenAIRE Research Product field path | Unpaywall field path | Notes |
| OpenAIRE Result field path | Unpaywall field path | Notes |
|----------------------------------------|----------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `instance` | | created only if `is_oa` and a `best_oa_location` is available |
| `instance.accessright` | | default value `Open Access`: we do not add instances if UnpayWall says there is no open version |

View File

@ -69,12 +69,12 @@ curl -s "https://www.ebi.ac.uk/europepmc/webservices/rest/MED/33024307/datalinks
```
## Mapping
The table below describes the mapping from the EBI links records to the OpenAIRE Graph Dataset format.
The table below describes the mapping from the EBI links records to the OpenAIRE Graph dump format.
We filter all the target links with pid type **ena**, **pdb** or **uniprot**.
For each target we construct a Bioentity with the following mapping:
| OpenAIRE Research Product field path | EBI record field xpath | Notes |
| OpenAIRE Result field path | EBI record field xpath | Notes |
|-----------------------------|----------------------------------------------------------|---------------------------------------------------------------|
| `id` | `target/identifier/ID` and `target/identifier/IDScheme` | id in the form `SCHEMA_________::md5(pid)` |
| `pid` | `target/identifier/ID` and `target/identifier/IDScheme` | `classid = classname = schema` |
@ -91,4 +91,4 @@ For each target we construct a Bioentity with the following mapping
### Relation Mapping
| OpenAIRE Relation Semantic and inverse | Source/Target type | Notes |
|----------------------------------------|---------------------|--------------------------------------------------------------------------|
| `IsRelatedTo` | `ResearchProduct/ResearchProduct` | we create relationships between the BioEntity and the pubmed publication |
| `IsRelatedTo` | `result/result` | we create relationships between the BioEntity and the pubmed publication |

View File

@ -14,7 +14,7 @@ Pubmed exposes an entry point FTP with all the updates for each one. [ftp baseli
The table below describes the mapping from the XML baseline records to the OpenAIRE Graph dump format.
| OpenAIRE Research Product field path | PubMed record field xpath | Notes |
| OpenAIRE Result field path | PubMed record field xpath | Notes |
|--------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Publication Mapping** | | |
| `id` | `//PMID` | id in the form `pmid_________::md5(pmid)` |

View File

@ -7,10 +7,10 @@ From this dataset, only the protein records linked to a PubMed publication are e
## Entity Mapping
The table below describes the mapping from the TEXT metadata format to the OpenAIRE Graph Dataset format.
The table below describes the mapping from the TEXT metadata format to the OpenAIRE Graph dump format.
You can check an example of the text metadata [here](https://rest.uniprot.org/uniprotkb/A0A0C5B5G6.txt)
| OpenAIRE Research Product field path | FASTA record field xpath | Notes |
| OpenAIRE Result field path | FASTA record field xpath | Notes |
|------------------------------|--------------------------------------------------------------------------|------------------------------------------------------------------------------------------|
| **BIOEntity Mapping** | | |
| `id` | `LINE Starts with AC` | id in the form `uniprot_____::md5(id)` |

View File

@ -27,7 +27,7 @@ A vocabulary is a data structure that defines a list of terms, and for each term
[...]
```
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [research product's instance typology](../data-model/entities/research-product#instance).
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](/data-model/entities/result#instance).
The content of the vocabularies can be accessed on [api.openaire.eu/vocabularies](https://api.openaire.eu/vocabularies/).

View File

@ -2,24 +2,24 @@
The Deduction process (also known as “bulk tagging”) enriches each record with new information that can be derived from the existing property values.
This process is used to associate research products to community/research initiatives that are part of OpenAIRE.
This process is used to associate results to community/research initiatives that are part of OpenAIRE.
As of November 2022, three procedures are in place to relate a research product to a research initiative, infrastructure (RI) or community (RC) based on:
* subjects: it is possible to specify a list of subjects that are relevant for the RC/RI. Every time one of the subjects is found among the subjects of a research product, the research product is linked to the RC/RI.
* subjects: it is possible to specify a list of subjects that are relevant for the RC/RI. Every time one of the subjects is found among the subjects of a result, the result is linked to the RC/RI.
<p align="center">
<img loading="lazy" alt="Bulktagging Subject" src={require('../../assets/img/enrichment/bulktagging_subject.png').default} width="50%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
</p>
* data sources: it is possible to list a set of data sources relevant for the RC/RI. All research products collected from these data sources will be linked to the RC/RI
* data sources: it is possible to list a set of data sources relevant for the RC/RI. All the results collected from these data sources will be linked to the RC/RI
<p align="center">
<img loading="lazy" alt="Bulktagging Data source" src={require('../../assets/img/enrichment/bulktagging_datasource.png').default} width="50%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
</p>
When only some research products collected from a datasource are relevant for the RC/RI, it is possible to specify a set of selection constraints (SC) that have to be verified before linking the research product to the
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the research product. The set of fields that can be specified is <strong>F={title, author, contributor, description, orcid}</strong>,
while the set of condition can be among <strong>V={contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase}</strong>, and the value is free text.
When only some results collected from a datasource are relevant for the RC/RI, it is possible to specify a set of selection constraints (SC) that have to be verified before linking the result to the
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F=\{title, author, contributor, description, orcid\}</strong>,
while the set of condition can be among <strong>V=\{contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase\}</strong>, and the value is free text.
A possible selection criterion is: “All the products whose contributor contains DARIAH”
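
The selection constraint described above is a disjunction of conjunctions, which can be sketched as follows. This is only an illustration, not the actual bulk tagging code: the record layout (a dictionary mapping each field to a list of strings), the function names and the handling of multi-valued fields are all assumptions.

```python
from typing import Callable, Dict, List, Tuple

# Verbs from the set V; each one compares a field value against the free-text target.
VERBS: Dict[str, Callable[[str, str], bool]] = {
    "contains": lambda v, t: t in v,
    "equals": lambda v, t: v == t,
    "not_contains": lambda v, t: t not in v,
    "not_equals": lambda v, t: v != t,
    "contains_ignorecase": lambda v, t: t.lower() in v.lower(),
    "equals_ignorecase": lambda v, t: v.lower() == t.lower(),
    "not_contains_ignorecase": lambda v, t: t.lower() not in v.lower(),
    "not_equal_ignorecase": lambda v, t: v.lower() != t.lower(),
}

Condition = Tuple[str, str, str]  # (field, verb, value): one s_ij
Record = Dict[str, List[str]]     # hypothetical record layout


def condition_holds(record: Record, condition: Condition) -> bool:
    field, verb, value = condition
    values = record.get(field, [])
    check = VERBS[verb]
    # Assumption: negated verbs must hold for every value of the field,
    # positive verbs for at least one value.
    if verb.startswith("not_"):
        return all(check(v, value) for v in values)
    return any(check(v, value) for v in values)


def matches(record: Record, constraint: List[List[Condition]]) -> bool:
    # SC = S1 or S2 or ... or Sn, where each Si is an "and" of conditions.
    return any(all(condition_holds(record, c) for c in si) for si in constraint)


# The example criterion: all the products whose contributor contains DARIAH.
dariah = [[("contributor", "contains", "DARIAH")]]
print(matches({"contributor": ["DARIAH-EU"]}, dariah))
```
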
<p align="center">


@ -5,30 +5,30 @@ relationships and values between the involved entities
As of November 2022, the following procedures are in place:
* Country propagation: updates the property “country” of a research product. This happens when the research product is collected from an institutional datasource or when the datasource hosting the research product is inserted in a whitelist. For all the research products whose hosting datasource verifies one of the conditions above, the country of the organization providing the datasource is added to the country of the research product: e.g. a publication collected from an institutional repository maintained by an Italian university will be enriched with the property “country = IT”.
* Country propagation: updates the property “country” of a result. This happens when the result is collected from an institutional datasource or when the datasource hosting the result is inserted in a whitelist. For all the results whose hosting datasource verifies one of the conditions above, the country of the organization providing the datasource is added to the country of the result: e.g. a publication collected from an institutional repository maintained by an Italian university will be enriched with the property “country = IT”.
<p align="center">
<img loading="lazy" alt="Country Propagation" src={require('../../assets/img/enrichment/propagation_country.png').default} width="50%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
</p>
* Project propagation: adds a "isProducedBy" relationship (and its inverse) between a Project P and research product R1, if R1 has a strong semantic relationship with another research product R2 and P produces R2: e.g. publication linked to project P “is supplemented by” a dataset D. Dataset D will get the link to project P. The relationships considered for this procedure are “isSupplementedBy” and “isSupplementTo”.
* Project propagation: adds a "isProducedBy" relationship (and its inverse) between a Project P and Result R1, if R1 has a strong semantic relationship with another Result R2 and P produces R2: e.g. publication linked to project P “is supplemented by” a dataset D. Dataset D will get the link to project P. The relationships considered for this procedure are “isSupplementedBy” and “isSupplementTo”.
<p align="center">
<img loading="lazy" alt="Project Propagation" src={require('../../assets/img/enrichment/propagation_resulttoproject.png').default} width="40%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
</p>
* Research product to RC/RI through organization propagation. The manager of the RC/RI can specify a set of organizations whose products are relevant for the
* Result to RC/RI through organization propagation. The manager of the RC/RI can specify a set of organizations whose products are relevant for the
community.
Each research product having such a relation of affiliation with at least one organization relevant for the RC/RI will be linked to it.
Each result having such a relation of affiliation with at least one organization relevant for the RC/RI will be linked to it.
<p align="center">
<img loading="lazy" alt="Research product to community through organization propagation" src={require('../../assets/img/enrichment/propagation_resulttocommunitythroughorganization.png').default}
<img loading="lazy" alt="Result to community through organization propagation" src={require('../../assets/img/enrichment/propagation_resulttocommunitythroughorganization.png').default}
width="50%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
</p>
* Research product to RC/RI through semantic relation: extends the set of products linked to a RC/RI by exploiting strong semantic relationships between the research products;
e.g. if a research product R1 is associated to the community C and is supplemented by a research product R2 then R2 will be linked to the community. The relationships considered for this procedure are “isSupplementedBy” and “supplements”.
* Result to RC/RI through semantic relation: extends the set of products linked to a RC/RI by exploiting strong semantic relationships between the results;
e.g. if a result R1 is associated to the community C and is supplemented by a result R2 then the result R2 will be linked to the community. The relationships considered for this procedure are “isSupplementedBy” and “supplements”.
<p align="center">
<img loading="lazy" alt="Research product to community through semantic relation propagation" src={require('../../assets/img/enrichment/propagation_resulttocommunitythroughsemrel.png').default} width="40%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
<img loading="lazy" alt="Result to community through semantic relation propagation" src={require('../../assets/img/enrichment/propagation_resulttocommunitythroughsemrel.png').default} width="40%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
</p>
* ORCID identifiers to research product through semantic relation. This propagation enriches the research products by adding ORCID identifiers to authors. The added ORCID will be marked as "potential" since they have been inserted through propagation.
The process considers the set of overlapping authors between research products (R1 and R2) linked with a strong semantic relationship (IsSupplementedBy, IsSupplementTo).
* ORCID identifiers to result through semantic relation. This propagation enriches the results by adding ORCID identifiers to authors. The added ORCID will be marked as "potential" since they have been inserted through propagation.
The process considers the set of overlapping authors between results (R1 and R2) linked with a strong semantic relationship (IsSupplementedBy, IsSupplementTo).
For each author A in the overlapping set, if R1 provides the ORCID value for A and R2 does not, then the author A in R2 will be enriched with the information of the ORCID found in R1.
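
The ORCID propagation rule can be sketched in a few lines. This is a simplified illustration, not the production code: authors are modelled as dictionaries, overlapping authors are detected by exact name equality, and the `orcid_status` field is a hypothetical way of marking the value as "potential".

```python
from typing import Dict, List

Author = Dict[str, str]  # hypothetical layout: {"name": ..., "orcid": ...}


def propagate_orcid(r1_authors: List[Author], r2_authors: List[Author]) -> List[Author]:
    """Return the authors of R2, enriched with ORCIDs known only in R1."""
    # ORCIDs that R1 provides, indexed by author name.
    known = {a["name"]: a["orcid"] for a in r1_authors if a.get("orcid")}
    enriched = []
    for author in r2_authors:
        if not author.get("orcid") and author["name"] in known:
            # The added ORCID is flagged because it comes from propagation.
            author = {**author, "orcid": known[author["name"]], "orcid_status": "potential"}
        enriched.append(author)
    return enriched
```

Since the rule is symmetric, a caller would apply the function in both directions (R1 to R2 and R2 to R1).
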
<p align="center">
@ -36,14 +36,14 @@ For each author A in the overlapping set, if R1 provides the ORCID value for A a
</p>
* affiliation to organization through institutional repository. This propagation adds one "hasAuthorInstitution" relationship (and its inverse)
between a research product R and Organization O,
between a Result R and Organization O,
if R was collected from a datasource D with type institutional repository, and D was provided by O.
<p align="center">
<img loading="lazy" alt="Affiliation propagation through institutional repository" src={require('../../assets/img/enrichment/propagation_affiliationistrepo.png').default} width="40%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
</p>
* affiliation to organization through semantic relation. This propagation adds one "hasAuthorInstitution" relationship (and its inverse) between a
research product R and an Organization O,
Result R and an Organization O,
if R has an affiliation relation with an organization O1 that is in relation "isChildOf" with O.
<p align="center">


@ -2,9 +2,9 @@
The OpenAIRE Graph is populated by aggregating metadata records from distinct data sources whose content typically overlaps: for example, article metadata records are collected both from publishers' archives (e.g. Frontiers, Elsevier, Copernicus) and from pre-print platforms (e.g. ArXiv.org, UKPubMed, BioarXiv.org). In order to support monitoring of science, the OpenAIRE Graph implements record deduplication and merge strategies, so that the scientific production can be represented consistently in statistics. Such strategies reflect the following intuition behind OpenAIRE monitoring: "Two metadata records are equivalent when they describe the same research product, hence they feature compatible resource types, have the same title, the same authors, or, alternatively, the same PID". Finally, groups of duplicates can be whitelisted or blacklisted, in order to manually refine the quality of this strategy.
It should be noticed that publication dates do not make a difference, as different versions of the same product can be published at different times; e.g. the pre-print and a published version of a scientific article, which should be counted as one object; abstracts, subjects, and other possible related fields, are not used to strengthen similarity, due to their heterogeneity or absence across different data sources. Moreover, even when two products are indicated as one a new version of the other, the presence of different authors will not bring them into the same group, to avoid unfair distribution of scientific reward.
It should be noticed that publication dates do not make a difference, as different versions of the same product can be published at different times; e.g. the pre-print and a published version of a scientific article, which should be counted as one object; abstracts, subjects, and other possible related fields, are not used to strenghten similarity, due to their heterogeneity or absence across different data sources. Moreover, even when two products are indicated as one a new version of the other, the presence of different authors will not bring them into the same group, to avoid unfair distribution of scientific reward.
Groups of duplicates are finally merged into a new "representative record", having its own id, embedding properties of the merged records and carrying provenance information about the data sources and the relative "instances", i.e. manifestations of the products, together with their resource type, access rights, and publishing date.
Groups of duplicates are finally merged into a new "dedup" record that embeds all properties of the merged records and carries provenance information about the data sources and the relative "instances", i.e. manifestations of the products, together with their resource type, access rights, and publishing date.
## Methodology overview
@ -37,7 +37,7 @@ To further limit the number of comparisons, a sliding window mechanism is used:
### Duplicates grouping (transitive closure)
Once the similarity relations between pairs of records are drawn, the groups of equivalent records are obtained (transitive closure, i.e. “mesh”). From such sets a new **representative record** is obtained, which inherits properties from the merged records and keeps track of their provenance.
Once the similarity relations between pairs of records are drawn, the groups of equivalent records are obtained (transitive closure, i.e. “mesh”). From such sets a new representative object is obtained, which inherits all properties from the merged records and keeps track of their provenance.
### Relation redistribution


@ -2,104 +2,36 @@
sidebar_position: 1
---
# Research products
# Research results
Duplicates among research products are identified among results of the same
type (publications, datasets, software, other research products). If two
duplicate research products are aggregated one as a dataset and one as a
software, for example, they will never be compared and they will never be
identified as duplicates.
OpenAIRE supports different deduplication strategies based on the type of
results.
Duplicates among research results are identified among results of the same type (publications, datasets, software, other research products). If two duplicate results are aggregated one as a dataset and one as a software, for example, they will never be compared and they will never be identified as duplicates.
OpenAIRE supports different deduplication strategies based on the type of results.
The next sections describe how each stage of the deduplication workflow is faced
for research products.
The next sections describe how each stage of the deduplication workflow is faced for research results.
### Candidate identification (clustering)
To match the requirements of limiting the number of comparisons, OpenAIRE
clustering for research products works with two different strategies based on
entity types:
To match the requirements of limiting the number of comparisons, OpenAIRE clustering for research products works with two functions:
* *DOI-based function*: the function generates the DOI when this is provided as part of the record properties;
* *Title-based function*: the function generates a key that depends on (i) the number of significant words in the title (normalized, stemmed, etc.), (ii) modulo 10 of the number of characters of such words, and (iii) a string obtained as an alternation of the function prefix(3) and suffix(3) (and vice versa) on the first 3 words (2 words if the title only has 2). For example, the title ``Search for the Standard Model Higgs Boson`` becomes ``search standard model higgs boson``, which yields the two keys ``5-3-seaardmod`` and ``5-3-rchstadel``.
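
The title-based key can be reproduced with the sketch below. It is an approximation written only to match the example given above, under a few assumptions: the stop word list is a small made-up sample, no stemming is applied, and the character count is taken over the normalized title including the separating spaces (which is what yields the ``3`` in the example keys).

```python
import re
from typing import List

# Assumption: a tiny illustrative stop word list, not the one used in production.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def title_keys(title: str, n: int = 3) -> List[str]:
    """Return the two clustering keys for a title (prefix/suffix alternation)."""
    words = [w for w in re.findall(r"[a-z0-9]+", title.lower()) if w not in STOPWORDS]
    normalized = " ".join(words)
    head = words[:3]  # first 3 significant words (fewer if the title is shorter)

    # Alternate prefix(n) and suffix(n), then the same with the roles swapped.
    first = "".join(w[:n] if i % 2 == 0 else w[-n:] for i, w in enumerate(head))
    second = "".join(w[-n:] if i % 2 == 0 else w[:n] for i, w in enumerate(head))

    stats = f"{len(words)}-{len(normalized) % 10}"
    return [f"{stats}-{first}", f"{stats}-{second}"]


print(title_keys("Search for the Standard Model Higgs Boson"))
# ['5-3-seaardmod', '5-3-rchstadel']
```
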
#### Software
* *Title extraction functions*:
two clustering functions are applied to the title (normalized, stemming, etc.)
* *stats and suffix prefix of words*: the function generates a key that
depends on (i) number of significant words in the title, (ii) module 10 of
the number of characters of such words, and (iii) a
string
obtained as an alternation of the function prefix(3) and suffix(3) (and
vice-versa) on the first 3 words (2 words if the title only has 2). For
example, the title ``Search for the Standard Model Higgs Boson``
becomes the two keys ``5-3-seaardmod`` and ``5-3-rchstadel``
* *n-grams*: the function generates ngrams from the
title. For example, the
title ``Search for the Standard Model Higgs Boson``
becomes the keys ``tan``, ``sta``, ``ode``, ``mod``, ``ear``, ``hig``,
``igg``, ``sea``
* *DOI extraction function*: the function generates the DOI when this is
provided as part of the record properties
* *URL extraction function*: the function generates the hostname part provided
by the URL of the software, if any
#### Publication, Dataset and Other Research Product
* *PID extraction function*: the function generates the PIDs when at least one
is provided as part of the ``pid`` record properties
* *Author and Title extraction function*: the function generates a key that
depends on (i) the number of authors of the product, with a cap of 21
authors (ii) number of significant words in the title (normalized, stemming,
etc.), divided by 10, and (iii) a string obtained as an alternation of the
function prefix(3) and suffix(3) (and vice versa) on the first 3 words (2
words if the title only has 2).
<br />
For example, a product composed by 197 authors and
titled ``Search for the Standard Model Higgs Boson``
becomes the two keys ``21-0-seaardmod`` and ``21-0-rchstadel``
To give an idea, this configuration generates around 77 million blocks, which we limited to 200 records each (only 15K blocks are affected by the cut), and entails 260 billion matches.
### Duplicates identification (pair-wise comparisons)
Comparisons in a block are performed using a *sliding window* set to 50 records.
The records are sorted lexicographically on the normalized version of their
titles. The 1st record is compared against all the 50 following ones using the
decision tree, then the second, etc.
Local information about matching records is kept and possibly used to prune
unneeded comparisons, for example once it is known that A equals to both B and
C, B will not be compared against C because the A,B,C group will be anyway
discovered by the global transitive closure step later.
<br />
A different decision tree is adopted depending on the type of the entity being
processed.
Similarity relations drawn in this stage will be consequently used to perform
the duplicates grouping.
Comparisons in a block are performed using a *sliding window* set to 50 records. The records are sorted lexicographically on a normalized version of their titles. The 1st record is compared against all the 50 following ones using the decision tree, then the second, etc. for an NlogN complexity.
A different decision tree is adopted depending on the type of the entity being processed.
Similarity relations drawn in this stage will be consequently used to perform the duplicates grouping.
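
The sliding window can be sketched as follows. This is a minimal illustration under assumptions: records are dictionaries with hypothetical `id` and `title` keys, `are_similar` stands in for the decision tree, and the pruning shown here (skipping pairs already known to belong to the same local group) is just one simple way to avoid unneeded comparisons.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def normalize(title: str) -> str:
    # Assumption: a very light normalization (lowercase, collapsed whitespace).
    return " ".join(title.lower().split())


def sliding_window_relations(
    records: List[Record],
    are_similar: Callable[[Record, Record], bool],
    window: int = 50,
) -> List[Tuple[str, str]]:
    """Compare each record of a block with the `window` records that follow it."""
    ordered = sorted(records, key=lambda r: normalize(r["title"]))
    group: Dict[str, str] = {}  # record id -> id of the first record of its local group
    relations: List[Tuple[str, str]] = []
    for i, left in enumerate(ordered):
        for right in ordered[i + 1 : i + 1 + window]:
            left_group = group.get(left["id"])
            # Prune: both records already matched the same earlier record.
            if left_group is not None and left_group == group.get(right["id"]):
                continue
            if are_similar(left, right):
                representative = left_group if left_group is not None else left["id"]
                group[left["id"]] = representative
                group[right["id"]] = representative
                relations.append((left["id"], right["id"]))
    return relations
```

Each record is compared with at most `window` others, so the number of comparisons in a block grows linearly with the block size once the records are sorted.
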
#### Publications
For each pair of publications in a cluster the following strategy (depicted in
the figure below) is applied.
For each pair of publications in a cluster the following strategy (depicted in the figure below) is applied.
The comparison goes through different stages:
1. *trusted pids check*: comparison of the trusted pid lists (in the `pid` field
of the record). If at least 1 pid is equivalent, records match and the
similarity relation is drawn.
2. *instance type check*: comparison of the instance types (indicating the
subtype of the record, e.g. presentation, conference object, etc.). If the
instance types are not compatible then the records do not match. Otherwise,
the comparison proceeds to the next stage.
3. *untrusted pids check*: comparison of all the available pids (in the `pid`
and the `alternateid` fields of the record). In every case, no similarity
relation is drawn in this stage. If at least one pid is equivalent, the next
stage will be a *soft check*, otherwise the next stage is a *strong check*.
4. *soft check*: comparison of the record titles with the Levenshtein distance.
If the distance measure is above 0.9 then the similarity relation is drawn.
5. *strong check*: comparison composed by three substages involving the (i)
comparison of the author list sizes and the version of the record to
determine if they are coherent, (ii) comparison of the record titles with the
Levenshtein distance to determine if it is higher than 0.95, (iii) "smart"
comparison of the author lists to check if common authors are more than 60%
in case of titles whose length is greater than 30 chars or more than 90%
otherwise.
1. *trusted pids check*: comparison of the trusted pid lists (in the `pid` field of the record). If at least 1 pid is equivalent, records match and the similarity relation is drawn.
2. *instance type check*: comparison of the instance types (indicating the subtype of the record, e.g. presentation, conference object, etc.). If the instance types are not compatible then the records do not match. Otherwise, the comparison proceeds to the next stage.
3. *untrusted pids check*: comparison of all the available pids (in the `pid` and the `alternateid` fields of the record). In every case, no similarity relation is drawn in this stage. If at least one pid is equivalent, the next stage will be a *soft check*, otherwise the next stage is a *strong check*.
4. *soft check*: comparison of the record titles with the Levenshtein distance. If the distance measure is above 0.9 then the similarity relation is drawn.
5. *strong check*: comparison composed by three substages involving the (i) comparison of the author list sizes and the version of the record to determine if they are coherent, (ii) comparison of the record titles with the Levenshtein distance to determine if it is higher than 0.99, (iii) "smart" comparison of the author lists to check if common authors are more than 60%.
<p align="center">
<img loading="lazy" alt="Publications Decision Tree" src={require('../../assets/img/decisiontree-publication.png').default} width="100%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
@ -107,39 +39,12 @@ The comparison goes through different stages:
[//]: # (Link to the image: https://docs.google.com/drawings/d/19SIilTp1vukw6STMZuPMdc0pv0ODYCiOxP7OU3iPWK8/edit?usp=sharing)
#### Datasets and Other types of research products
For each pair of datasets or other types of research products in a cluster the
strategy depicted in the figure below is applied.
The decision tree is almost identical to the publication decision tree, with the
only exception of the *instance type check* stage. Since such type of record
does not have a relatable instance type, the check is not performed and the
decision tree node is skipped.
<p align="center">
<img loading="lazy" alt="Dataset and Other types of research products Decision Tree" src={require('../../assets/img/decisiontree-dataset-orp.png').default} width="90%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
</p>
[//]: # (Link to the image: https://docs.google.com/drawings/d/1uBa7Bw2KwBRDUYIfyRr_Keol7UOeyvMNN7MPXYLg4qw/edit?usp=sharing)
#### Software
For each pair of software in a cluster the following strategy (depicted in the
figure below) is applied.
For each pair of software in a cluster the following strategy (depicted in the figure below) is applied.
The comparison goes through different stages:
1. *DOI pids and URLs check*: comparison of the pids of type DOI and URLs in the
records. If at least 1 DOI is equivalent or 1 URL is equivalent, then records
match and the similarity relation is drawn
2. *title check*: comparison of the record titles with Levenshtein distance,
excluding versioning information.
If the distance is below 0.95 then the records do not match. Otherwise, the
comparison proceeds to the next stage
3. *untrusted DOI check*: comparison of all the available DOIs (in the `pid` and
the `alternateid` fields of the record). If at least 1 DOI is equivalent,
records match and the similarity relation is drawn
4. *authors check*: "smart" comparison of the author lists to check if the two
products share all authors
1. *pids check*: comparison of the pids in the records. No similarity relation is drawn in this stage, it is only used to establish the final threshold to be used to compare record titles. If there is at least one common pid, then the next stage is a *soft check*. Otherwise, the next stage is a *strong check*
2. *soft check*: comparison of the record titles with Levenshtein distance. If the measure is above 0.9, then the similarity relation is drawn
3. *strong check*: comparison of the record titles with Levenshtein distance. If the measure is above 0.99, then the similarity relation is drawn
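
The three stages above can be sketched as a single function. This is an illustration only: it assumes that the "Levenshtein distance" measure is a similarity normalized to the range 0..1 (one minus the edit distance divided by the length of the longer title), and that records are dictionaries with hypothetical `pids` and `title` keys.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[str, List[str]]]


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def title_similarity(a: str, b: str) -> float:
    # Assumption: similarity = 1 - distance / length of the longer title.
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def software_match(r1: Record, r2: Record) -> bool:
    # pids check: it only selects the threshold, no relation is drawn here.
    has_common_pid = bool(set(r1["pids"]) & set(r2["pids"]))
    threshold = 0.9 if has_common_pid else 0.99  # soft check vs strong check
    return title_similarity(r1["title"].lower(), r2["title"].lower()) > threshold
```
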
<p align="center">
<img loading="lazy" alt="Software Decision Tree" src={require('../../assets/img/decisiontree-software.png').default} width="85%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
@ -147,87 +52,18 @@ The comparison goes through different stages:
[//]: # (Link to the image: https://docs.google.com/drawings/d/19gd1-GTOEEo6awMObGRkYFhpAlO_38mfbDFFX0HAkuo/edit?usp=sharing)
### Duplicates grouping
#### Datasets and Other types of research products
For each pair of datasets or other types of research products in a cluster the strategy depicted in the figure below is applied.
The decision tree is almost identical to the publication decision tree, with the only exception of the *instance type check* stage. Since such type of record does not have a relatable instance type, the check is not performed and the decision tree node is skipped.
The aim of the final stage is the creation of records that group all the
equivalent entities discovered pairwise by the previous step. This is done in
multiple phases.
<p align="center">
<img loading="lazy" alt="Dataset and Other types of research products Decision Tree" src={require('../../assets/img/decisiontree-dataset-orp.png').default} width="90%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>
</p>
#### Transitive closure
[//]: # (Link to the image: https://docs.google.com/drawings/d/1uBa7Bw2KwBRDUYIfyRr_Keol7UOeyvMNN7MPXYLg4qw/edit?usp=sharing)
As the concluding step of duplicate identification, a transitive closure is
performed against similarity relations to identify complete groups of duplicated
records (cliques). If a group exceeds 200 elements, only the first 200 elements
are included in the group, while the remaining elements are kept ungrouped.
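
A transitive closure over pairwise similarity relations amounts to computing connected components, for instance with a union-find structure as in the sketch below. The function name is hypothetical, and so is the ordering used to decide which elements are "the first 200": here the members of a group are simply sorted by identifier.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_duplicates(relations: Iterable[Tuple[str, str]], max_size: int = 200) -> List[List[str]]:
    """Group record ids connected, directly or indirectly, by similarity relations."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in relations:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    members = defaultdict(list)
    for x in list(parent):
        members[find(x)].append(x)

    # Assumption: "the first 200 elements" are the first ones in id order;
    # the elements beyond the cap are left out of the group.
    return sorted(sorted(ids)[:max_size] for ids in members.values())


print(group_duplicates([("a", "b"), ("b", "c"), ("x", "y")]))
# [['a', 'b', 'c'], ['x', 'y']]
```
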
### Duplicates grouping (transitive closure)
#### Selection of the pivot record
The general concept is that the field coming from the record with higher "trust" value is used as reference for the field of the representative record.
Each group of duplicate records needs to be identified in the final graph with
an OpenAIRE identifier, derived from a record of the group known as the _pivot
record_. It is determined after sorting the group of duplicate records by the
following criteria:
1. Records previously chosen as pivot records in the graph's previous
generations.
2. Records with identifiers from a [PID authority](/data-model/pids-and-identifiers#pid-authorities).
3. Publications from CrossRef or datasets from DataCite.
4. Records with an earlier date of acceptance.
5. Records with smaller IDs in lexicographical order.
The first sorting criterion is possible because a state table, called "pivot
history", is maintained across graph generations. It keeps track of which
records were used as pivot records in what graph, guaranteed to retain data for
the last 12 months.
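
The five sorting criteria translate naturally into a composite sort key. The sketch below is illustrative only: the record fields (`from_pid_authority`, `from_crossref_or_datacite`, `date_of_acceptance`) are hypothetical names, dates are assumed to be ISO 8601 strings that are always present, and the pivot history is reduced to a plain set of identifiers.

```python
from typing import Dict, List, Set, Union

Record = Dict[str, Union[str, bool]]


def pick_pivot(group: List[Record], pivot_history: Set[str]) -> Record:
    """Return the pivot record of a group of duplicates."""

    def sort_key(record: Record):
        # False sorts before True, so each "not" puts preferred records first.
        return (
            record["id"] not in pivot_history,        # 1. former pivot records
            not record["from_pid_authority"],         # 2. id from a PID authority
            not record["from_crossref_or_datacite"],  # 3. CrossRef / DataCite
            record["date_of_acceptance"],             # 4. earlier date of acceptance
            record["id"],                             # 5. smaller id
        )

    return min(group, key=sort_key)
```

Because the history check is the first component of the key, a record that was a pivot in a previous graph generation wins over any other criterion, which keeps identifiers stable across generations.
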
#### Creation of representative records
The representative record, also known as the "dedup record", replaces the group
of deduplicated records in the graph.
##### OpenAIRE identifier of the representative record
The OpenAIRE identifier of the representative record is generated based on the
identifier of the record chosen as the pivot of the group:
- if the pivot record comes from a "PID authority", the identifier of the
representative record is the same, but the "PID Type Prefix" part of the
identifier is modified to append ``_dedup``.<br/>
For example ```doi_________::d5021b53204e4fdeab6ff5d5bc468032``` will
become ```doi_dedup___::d5021b53204e4fdeab6ff5d5bc468032```
- otherwise the "PID Type Prefix" part will be set to the fixed value
``dedup_wf_002``, and the following hash will be calculated as the MD5 hash of
the entire raw id of the pivot record.<br/>
For example ``DansKnawCris::0829b5191605bdbea36d6502b8c1ce1g`` will
become ``dedup_wf_002::345e5d1b80537b0d0e0a49241ae9e516``
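
The two rules can be sketched as follows. The function name and the `from_pid_authority` flag are hypothetical, and the exact string fed to MD5 in the second case is an assumption (here, the whole identifier of the pivot record), so this sketch is not guaranteed to reproduce the hash shown in the example above.

```python
import hashlib


def dedup_id(pivot_id: str, from_pid_authority: bool) -> str:
    """Derive the identifier of the representative record from its pivot record."""
    prefix, _, suffix = pivot_id.partition("::")
    if from_pid_authority:
        # Append "_dedup" to the PID type and pad back to the original prefix width.
        new_prefix = (prefix.rstrip("_") + "_dedup").ljust(len(prefix), "_")
        return f"{new_prefix}::{suffix}"
    # Assumption: the MD5 is computed over the full identifier string.
    digest = hashlib.md5(pivot_id.encode("utf-8")).hexdigest()
    return f"dedup_wf_002::{digest}"


print(dedup_id("doi_________::d5021b53204e4fdeab6ff5d5bc468032", True))
# doi_dedup___::d5021b53204e4fdeab6ff5d5bc468032
```
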
##### Content of the representative record
The representative record inherits properties from the records it merges
and tracks their provenance. Whenever possible, it preserves all data from the
merged records, such as the ``instance`` field. In cases where a specific value
must be chosen, the most representative one is selected. For example, for the
"dateofacceptance" field, the earliest value is chosen.
##### Merged and singleton representative record
Changes in metadata content or graph construction may lead to cases where
representative records disappear from the graph:
1. When two or more representative records are merged into one representative
record. In other words, this happens when a group of duplicated records
contains multiple records formerly used as pivot records.
2. When a record chosen as a pivot record leaves its group and remains alone.
3. When a record chosen as a pivot record is no longer published by its data
source (deletion of the metadata record).
To address these cases, the pivot history table ensures the visibility of
disappearing representative records for the first two cases. Specifically:
1. In the case of merged representative records, the new representative record
and the ones that would be lost are generated and linked as part of the new
representative record.
2. In the case of a record no longer serving as a pivot, a representative record
is generated and linked only with that record.
This approach ensures that users can access representative records that would
otherwise be lost.
The IDs of the representative records are obtained by appending the prefix ``dedup_`` to the MD5 of the first ID (given their lexicographical ordering). If the group of merged records contains a trusted ID (i.e. the DOI), also the ``doi`` keyword is added to the prefix.


@ -4,14 +4,9 @@ sidebar_position: 1
# Affiliation matching
***Short description:*** The goal of the affiliation matching module is to match affiliation strings (identified in full-text PDFs or in scholarly databases, such as Crossref) with persistent organization identifiers (e.g., ROR identifiers).
Depending on the data source, we currently employ two distinct methodologies:
***Short description:*** The goal of the affiliation matching module is to match affiliations extracted from the pdf and xml documents with organizations from the OpenAIRE organization database.
- The [first](#algorithmic-details-of-the-first-method) method revolves around affiliations extracted from PDF and XML documents, which are subsequently matched with organizations within the OpenAIRE database.
- The [second](#algorithmic-details-of-the-second-method) concerns affiliations retrieved from platforms such as Crossref, PubMed, and Datacite, and are matched to organizations of the ROR database.
## Algorithmic details of the first method
***Algorithmic details:***
*The buckets concept*
@ -44,13 +39,13 @@ The total match strength is calculated in such a way that each consecutive voter
***Parameters:***
* input
* input_document_metadata: [ExtractedDocumentMetadata](https://github.com/openaire/iis/blob/master/iis-schemas/src/main/avro/eu/dnetlib/iis/metadataextraction/ExtractedDocumentMetadata.avdl) avro datastore location. Document metadata is the source of affiliations.
* input_organizations: [Organization](https://github.com/openaire/iis/blob/master/iis-schemas/src/main/avro/eu/dnetlib/iis/importer/Organization.avdl) avro datastore location.
* input_document_to_project: [DocumentToProject](https://github.com/openaire/iis/blob/master/iis-schemas/src/main/avro/eu/dnetlib/iis/importer/DocumentToProject.avdl) avro datastore location with **imported** document-to-project relations. These relations (alongside with inferred document-project and project-organization relations) are used to generate document-organization pairs which are used as a hint for matching affiliations.
* input_inferred_document_to_project: [DocumentToProject](https://github.com/openaire/iis/blob/master/iis-schemas/src/main/avro/eu/dnetlib/iis/referenceextraction/project/DocumentToProject.avdl) avro datastore location with **inferred** document-to-project relations.
* input_project_to_organization: [ProjectToOrganization](https://github.com/openaire/iis/blob/master/iis-schemas/src/main/avro/eu/dnetlib/iis/importer/ProjectToOrganization.avdl) avro datastore location. These relations (along with the imported and inferred document-project relations) are used to generate document-organization pairs, which are used as a hint for matching affiliations.
* input_document_metadata: [ExtractedDocumentMetadata](https://github.com/openaire/iis/blob/master/iis-schemas/src/main/avro/eu/dnetlib/iis/metadataextraction/ExtractedDocumentMetadata.avdl) avro datastore location. Document metadata is the source of affiliations.
* input_organizations: [Organization](https://github.com/openaire/iis/blob/master/iis-schemas/src/main/avro/eu/dnetlib/iis/importer/Organization.avdl) avro datastore location.
* input_document_to_project: [DocumentToProject](https://github.com/openaire/iis/blob/master/iis-schemas/src/main/avro/eu/dnetlib/iis/importer/DocumentToProject.avdl) avro datastore location with **imported** document-to-project relations. These relations (along with inferred document-project and project-organization relations) are used to generate document-organization pairs, which are used as a hint for matching affiliations.
* input_inferred_document_to_project: [DocumentToProject](https://github.com/openaire/iis/blob/master/iis-schemas/src/main/avro/eu/dnetlib/iis/referenceextraction/project/DocumentToProject.avdl) avro datastore location with **inferred** document-to-project relations.
* input_project_to_organization: [ProjectToOrganization](https://github.com/openaire/iis/blob/master/iis-schemas/src/main/avro/eu/dnetlib/iis/importer/ProjectToOrganization.avdl) avro datastore location. These relations (along with the imported and inferred document-project relations) are used to generate document-organization pairs, which are used as a hint for matching affiliations.
* output
* [MatchedOrganization](https://github.com/openaire/iis/blob/master/iis-wf/iis-wf-affmatching/src/main/resources/eu/dnetlib/iis/wf/affmatching/model/MatchedOrganization.avdl) avro datastore location with publications matched to organizations.
* [MatchedOrganization](https://github.com/openaire/iis/blob/master/iis-wf/iis-wf-affmatching/src/main/resources/eu/dnetlib/iis/wf/affmatching/model/MatchedOrganization.avdl) avro datastore location with publications matched to organizations.
***Limitations:*** -
@ -60,48 +55,3 @@ Java, Spark
***References:*** -
***Authority:*** ICM &bull; ***License:*** AGPL-3.0 &bull; ***Code:*** [CoAnSys/affiliation-organization-matching](https://github.com/CeON/CoAnSys/tree/master/affiliation-organization-matching)
## Algorithmic details of the second method
*Categorization*
The affiliation strings are imported, cleaned, tokenized, and stripped of stopwords. As with the “buckets concept” of the first method, the goal is to split both the affiliation strings and the ROR organizations into coherent groups. To this end, ROR's data has been preprocessed in advance: the frequency of words ('keywords') in the legal names of ROR's organizations was analysed to define specific categories. These categories are universities and institutes, laboratories, hospitals, companies, museums, governments, foundations, and other organizations. ROR's organizations have then been assigned to these categories based on their legal names. The algorithm categorizes affiliations into the same groups in a similar way.
*String Shortening*
The objective is to extract pertinent details from each affiliation string. The algorithm divides the string whenever a comma (,) or semicolon (;) is detected. It then applies specific 'rules' to these segments and retains only those containing relevant keywords. Additionally, it trims down the segments by preserving words in proximity to particular keywords like "university," "institute," "laboratory," or "hospital." As a result, the average string length is reduced from 90 to 35 characters.
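The splitting and filtering step can be sketched as follows. This is a minimal illustration rather than AffRo's actual code: the `KEYWORDS` tuple and the `shorten_affiliation` function are hypothetical names, and the real rules (including the trimming of words around each keyword) are more elaborate.

```python
import re

# Hypothetical keyword list for illustration; AffRo's real rules may differ.
KEYWORDS = ("university", "institute", "laboratory", "hospital")


def shorten_affiliation(affiliation: str) -> list[str]:
    """Split on commas/semicolons and keep only segments containing a keyword."""
    segments = [s.strip() for s in re.split(r"[,;]", affiliation)]
    return [s for s in segments if any(k in s.lower() for k in KEYWORDS)]


example = "Department of Physics, University of Athens; 15784 Athens, Greece"
print(shorten_affiliation(example))
```

Here only the segment mentioning a university survives; the department, postal code, and country segments are dropped.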
*Matching with ROR's Database*
The algorithm checks whether a substring containing a keyword matches a legal name or an alternative name of an organization listed in the ROR database. To identify the most accurate match, the algorithm employs cosine similarity. Although alternative measures of string similarity, such as Levenshtein distance or Jaro-Winkler distance, were considered, cosine similarity was judged the most appropriate choice for this application.
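As an illustration of the similarity check, the sketch below computes cosine similarity over word-count vectors and keeps the best candidate at or above a threshold. It assumes word-level tokens (AffRo may vectorize strings differently), and `best_match`, the `names` mapping, and the `ror-A`/`ror-B` identifiers are hypothetical placeholders.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the bag-of-words vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def best_match(substring: str, names: dict[str, str], threshold: float):
    """Return (id, score) of the best candidate at or above threshold, else None."""
    scored = [(rid, cosine_similarity(substring, name)) for rid, name in names.items()]
    scored = [pair for pair in scored if pair[1] >= threshold]
    return max(scored, key=lambda pair: pair[1], default=None)


# Placeholder identifiers and names, not real ROR records.
names = {"ror-A": "university of athens", "ror-B": "university of patras"}
match = best_match("University of Athens", names, threshold=0.64)
```

With a threshold of 0.64 both candidates pass (the second shares two of three words), and the exact name wins with the highest score.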
*Refinement*
If multiple matches are found above the desired similarity thresholds, the algorithm performs another check. It applies cosine similarity between the candidate organizations from the ROR database and the original affiliation string. This comparison takes into account additional information present in the original affiliation, such as addresses or city names, in order to identify the best fit among the potential matches. The case where two or more different organizations share the same name is also handled.
***Parameters:***
* input
* source of affiliations: JSON Crossref or XML Pubmed or Parquet DataCite files.
* organizations: [dix_acad.pkl](https://github.com/openaire/affro/blob/main/dictionaries/dix_acad.pkl), [dix_mult](https://github.com/openaire/affro/blob/main/dictionaries/dix_mult.pkl), [dix_city](https://github.com/openaire/affro/blob/main/dictionaries/dix_city.pkl), [dix_country](https://github.com/openaire/affro/blob/main/dictionaries/dix_country.pkl) (four pickled dictionaries with keys legalnames and alternativenames of organizations in the ROR database.)
* similarity thresholds: simU for universities, simG for other organizations (default values are simU = 0.64, simG = 0.87).
* output
* JSON file with ROR ids of organizations and corresponding similarity scores for each DOI.
***Limitations:*** -
***Environment:***
Python
***References:*** -
***Authority:*** OpenAIRE &bull; ***License:*** AGPL-3.0 &bull; ***Code:*** [AffRo](https://github.com/openaire/affro)

View File

@ -7,12 +7,12 @@ The output of this final step is the final version of the OpenAIRE Graph.
## Filtering
Bibliographic records that do not meet minimal requirements for being part of the OpenAIRE Graph are eliminated during this phase.
Currently, the only criterion applied horizontally to the entire graph aims at excluding research products whose title is not meaningful for citation purposes.
Currently, the only criterion applied horizontally to the entire graph aims at excluding scientific results whose title is not meaningful for citation purposes.
Then, different criteria are applied in the pre-processing of specific sub-collections:
* [Crossref filtering](/graph-production-workflow/aggregation/non-compatible-sources/doiboost#crossref-filtering)
## Country cleaning
This phase is responsible for removing the country information from research products that match specific criteria. It is needed because some datasources, although national in scope, contain material that is not always related to the given country.
This phase is responsible for removing the country information from result records that match specific criteria. It is needed because some datasources, although national in scope, contain material that is not always related to the given country.

View File

@ -1,6 +1,6 @@
# Graph production workflow
OpenAIRE collects metadata records from more than 70K scholarly communication sources from all over the world, including Open Access institutional repositories, data archives, journals. All the metadata records (i.e. descriptions of research products) are put together in a data lake, together with records from Crossref, Unpaywall, ORCID, ROR, and information about projects provided by national and international funders. Dedicated inference algorithms applied to metadata and to the full-texts of Open Access publications enrich the content of the data lake with links between research products and projects, author affiliations, subject classification, links to entries from domain-specific databases. Duplicated organisations and research products are identified and merged together to obtain an open, trusted, public resource enabling explorations of the scholarly communication landscape like never before.
OpenAIRE collects metadata records from more than 70K scholarly communication sources from all over the world, including Open Access institutional repositories, data archives, journals. All the metadata records (i.e. descriptions of research products) are put together in a data lake, together with records from Crossref, Unpaywall, ORCID, ROR, and information about projects provided by national and international funders. Dedicated inference algorithms applied to metadata and to the full-texts of Open Access publications enrich the content of the data lake with links between research results and projects, author affiliations, subject classification, links to entries from domain-specific databases. Duplicated organisations and results are identified and merged together to obtain an open, trusted, public resource enabling explorations of the scholarly communication landscape like never before.
<p align="center">
<img loading="lazy" alt="Data provision" src={require('../assets/img/architecture.png').default} width="100%" className="img_node_modules-@docusaurus-theme-classic-lib-theme-MDXComponents-Img-styles-module"/>

View File

@ -2,7 +2,7 @@
The final version of the OpenAIRE Graph is indexed on a Solr server that is used by the OpenAIRE portals ([EXPLORE](https://explore.openaire.eu), [CONNECT](https://connect.openaire.eu), [PROVIDE](https://provide.openaire.eu)) and APIs, the latter adopted by several third-party applications and organizations, such as:
* The OpenAIRE Graph APIs and Portals will offer to the EOSC (European Open Science Cloud) an Open Science Resource Catalogue, keeping an up to date map of all research products (publications, datasets, software), services, organizations, projects, funders in Europe and beyond.
* The OpenAIRE Graph APIs and Portals will offer to the EOSC (European Open Science Cloud) an Open Science Resource Catalogue, keeping an up to date map of all research results (publications, datasets, software), services, organizations, projects, funders in Europe and beyond.
* DSpace & EPrints repositories can install the OpenAIRE plugin to expose OpenAIRE compliant metadata records via their OAI-PMH endpoint and offer researchers the possibility to link their depositions to the funding project, by selecting it from the list of projects provided by OpenAIRE.

View File

@ -1,16 +1,16 @@
# Citation-based impact indicators
# Impact indicators
This page summarises all calculated citation-based impact indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), which are included in the [bipIndicators](../../data-model/entities/other#bipindicators) property (found under the [indicators](../../data-model/entities/research-product#indicators) property of the research product).
This page summarises all calculated impact indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), which are included in the [bipIndicators](/data-model/entities/other#bipindicators) property (found under the [indicators](/data-model/entities/result#indicators) property of the result).
It should be noted that the citation-based impact indicators are being calculated on the level of the research output.
It should be noted that the impact indicators are being calculated on the level of the research output.
Below we explain their main intuition, the way they are calculated, and their most important limitations, in an attempt to help avoid common pitfalls and misuses.
## Citation Count (CC) <small><span className="bip-indicator-names">&bull; influence_alt</span></small>
***Short description:***
This is the most widely used citation-based impact indicator, which sums all citations received by each article.
Citation count can be viewed as a measure of a publication's overall (citation-based) impact, since it conveys the number of other works that directly
This is the most widely used scientific impact indicator, which sums all citations received by each article.
Citation count can be viewed as a measure of a publication's overall impact, since it conveys the number of other works that directly
drew on it.
***Algorithmic details:***
@ -126,7 +126,7 @@ Also, since some indicators require the publication year for their calculation,
***Environment:*** PySpark
***References:***
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^{th} international conference on data mining workshops (pp. 373-380). IEEE.
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^\{th\} international conference on data mining workshops (pp. 373-380). IEEE.
***Authority:*** ATHENA RC &bull; ***License:*** GPL-2.0 &bull; ***Code:*** [BIP! Ranker](https://github.com/athenarc/Bip-Ranker)

View File

@ -2,7 +2,7 @@ import DocCardList from '@theme/DocCardList';
# Indicators ingestion
In this step, research products are enriched with Impact and Usage Statistics indicators.
In this step, results are enriched with Impact and Usage Statistics indicators.
The former are provided by [BIP!](https://bip.imsi.athenarc.gr/) while the latter are computed by OpenAIRE's [UsageCounts service](https://usagecounts.openaire.eu/).
<DocCardList />

View File

@ -11,13 +11,13 @@ across the various datasources.
This phase therefore compensates for such inconsistencies and performs
a global grouping of every record available in the graph:
- entities are grouped by [`id`](../data-model/entities/research-product#id)
- entities are grouped by [`id`](../data-model/entities/result#id)
- relations are grouped by [`source`, `target`, `reltype`](../data-model/relationships/relationship-object)
This ensures that the same record, possibly assigned to different types by different
mappings, appears only once in the graph and under a single typing. In case of clashing
identifiers, the properties are merged (including the provenance information), considering
the following precedence order for the research product typing:
identifiers, the properties are merged (including the provenance information), considering
the following precedence order for the result typing:
```
publication > dataset > software > other

View File

@ -4,11 +4,11 @@ sidebar_position: 7
# Relevant publications
Open Science services are open and transparent and survive thanks to your active support and to the visibility and reward they gather. If you use one of the [OpenAIRE Graph Datasets](https://doi.org/10.5281/zenodo.3516917) for your research, please provide a proper citation following the recommendation that you find on the dataset's Zenodo page or as provided below.
Open Science services are open and transparent and survive thanks to your active support and to the visibility and reward they gather. If you use one of the [OpenAIRE Graph dumps](https://doi.org/10.5281/zenodo.3516917) for your research, please provide a proper citation following the recommendation that you find on the dump's Zenodo page or as provided below.
:::note How to cite
Manghi P., Atzori C., Bardi A., Baglioni M., Schirrwagen J., Dimitropoulos H., La Bruzzo S., Foufoulas I., Mannocci A., Horst M., Czerniak A., Iatropoulou K., Kokogiannaki A., De Bonis M., Artini M., Lempesis A., Ioannidis A., Manola N., Principe P., Vergoulis T., Chatzopoulos S., Pierrakos D. (2022). "OpenAIRE Research Graph Dataset", *Dataset*, Zenodo. [doi:10.5281/zenodo.3516917](https://doi.org/10.5281/zenodo.3516917) ([BibTex](/bibtex/OpenAIRE_Research_Graph_dump.bib))
Manghi P., Atzori C., Bardi A., Baglioni M., Schirrwagen J., Dimitropoulos H., La Bruzzo S., Foufoulas I., Mannocci A., Horst M., Czerniak A., Iatropoulou K., Kokogiannaki A., De Bonis M., Artini M., Lempesis A., Ioannidis A., Manola N., Principe P., Vergoulis T., Chatzopoulos S., Pierrakos D. (2022). "OpenAIRE Research Graph Dump", *Dataset*, Zenodo. [doi:10.5281/zenodo.3516917](https://doi.org/10.5281/zenodo.3516917) ([BibTex](/bibtex/OpenAIRE_Research_Graph_dump.bib))
:::
## Other relevant research products

View File

@ -1,8 +1,8 @@
// @ts-check
// Note: type annotations allow type checking and IDEs autocompletion
const lightCodeTheme = require('prism-react-renderer/themes/github');
const darkCodeTheme = require('prism-react-renderer/themes/dracula');
const lightCodeTheme = require('prism-react-renderer').themes.github;
const darkCodeTheme = require('prism-react-renderer').themes.dracula;
const math = require('remark-math');
const katex = require('rehype-katex');
const dotenv = require('dotenv');
@ -141,6 +141,9 @@ const config = {
prism: {
theme: lightCodeTheme,
darkTheme: darkCodeTheme,
additionalLanguages: [
'json'
]
},
matomo: {
matomoUrl: 'https://analytics.openaire.eu/',

20329
package-lock.json generated

File diff suppressed because it is too large

View File

@ -14,22 +14,22 @@
"write-heading-ids": "docusaurus write-heading-ids"
},
"dependencies": {
"@docusaurus/core": "^2.2.0",
"@docusaurus/preset-classic": "^2.2.0",
"@easyops-cn/docusaurus-search-local": "^0.33.6",
"@mdx-js/react": "^1.6.22",
"@docusaurus/core": "3.0.0",
"@docusaurus/preset-classic": "3.0.0",
"@easyops-cn/docusaurus-search-local": "^0.40.1",
"@mdx-js/react": "^3.0.0",
"clsx": "^1.2.1",
"docusaurus-plugin-matomo": "^0.0.6",
"docusaurus-plugin-matomo": "^0.0.8",
"dotenv": "^16.0.3",
"hast-util-is-element": "^1.1.0",
"prism-react-renderer": "^1.3.5",
"react": "^17.0.2",
"react-dom": "^17.0.2",
"rehype-katex": "^5.0.0",
"remark-math": "^3.0.1"
"prism-react-renderer": "^2.1.0",
"react": "^18.2.0",
"react-dom": "^18.2.0",
"rehype-katex": "^7.0.0",
"remark-math": "^6.0.0"
},
"devDependencies": {
"@docusaurus/module-type-aliases": "^2.2.0"
"@docusaurus/module-type-aliases": "3.0.0"
},
"browserslist": {
"production": [
@ -44,6 +44,6 @@
]
},
"engines": {
"node": ">=16.14"
"node": ">=18.0"
}
}

View File

@ -31,7 +31,7 @@ const sidebars = {
description: 'The main entities of the OpenAIRE Graph are listed below.'
},
items: [
{ type: 'doc', id: 'data-model/entities/research-product' },
{ type: 'doc', id: 'data-model/entities/result' },
{ type: 'doc', id: 'data-model/entities/data-source' },
{ type: 'doc', id: 'data-model/entities/organization' },
{ type: 'doc', id: 'data-model/entities/project' },
@ -63,7 +63,7 @@ const sidebars = {
label: "Search API",
link: { type: 'doc', id: 'apis/search-api/search-api' },
items: [
{ type: 'doc', id: 'apis/search-api/research-products' },
{ type: 'doc', id: 'apis/search-api/results' },
{ type: 'doc', id: 'apis/search-api/projects' },
{ type: 'doc', id: 'apis/search-api/response-metadata-format' },
]
@ -212,11 +212,6 @@ const sidebars = {
label: "Helpdesk",
href: "https://graph.openaire.eu/support"
},
{
type: "link",
label: "User forum",
href: "https://openaire.flarum.cloud/"
}
]
};

View File

@ -72,3 +72,60 @@
height: var(--ifm-navbar-height);
}
/* custom css classes to embed publications pages in an iframe; adjusts look and feel with docusaurus-data-x query-string parameters */
html[data-embed-publications='true'] .navbar {
display: none;
}
html[data-embed-publications='true'] .theme-doc-toc-desktop {
display: none;
}
html[data-embed-publications='true'] .theme-doc-sidebar-container {
display: none;
}
html[data-embed-publications='true'] .theme-doc-breadcrumbs{
display: none;
}
html[data-embed-publications='true'] .theme-doc-version-badge {
display: none;
}
/* intended to remove header in publications */
html[data-embed-publications='true'] .theme-doc-markdown h1 {
display: none;
}
html[data-embed-publications='true'] .theme-doc-toc-mobile {
display: none;
}
html[data-embed-publications='true'] .pagination-nav {
display: none;
}
html[data-embed-publications='true'] .footer {
display: none;
}
html[data-embed-publications='true'] .container > .row > .col--3 {
display: none;
}
html[data-embed-publications='true'] {
--ifm-background-color: #fff;
}
html[data-embed-publications='true'] .theme-admonition-note {
background-color: #f5f5f5;
}
html[data-embed-publications='true'] .theme-doc-markdown {
position: absolute;
left: 0px;
right: 0px;
top: 0px;
}

View File

@ -21,7 +21,7 @@
Vergoulis, Thanasis and
Chatzopoulos, Serafeim and
Pierrakos, Dimitris},
title = {OpenAIRE Graph Dataset},
title = {OpenAIRE Research Graph Dump},
month = dec,
year = 2022,
note = {{A new version of this dataset is published every 6

View File

@ -24,12 +24,12 @@ This section will document all notable changes for each graph version.
#### Added
- [Impact indicators](./data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](./downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](./data-model/relationships/relationship-types)
- [Impact indicators](/data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](/downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](/data-model/relationships/relationship-types)
#### Changed
- FOS and SDGs were removed from the [result subjects](./data-model/entities/result#subjects)
- Measures were removed from the [result instance](./data-model/entities/result#instance)
- FOS and SDGs were removed from the [result subjects](/data-model/entities/result#subjects)
- Measures were removed from the [result instance](/data-model/entities/result#instance)

View File

@ -27,7 +27,7 @@ A vocabulary is a data structure that defines a list of terms, and for each term
[...]
```
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](../data-model/entities/result#instance).
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](/data-model/entities/result#instance).
The content of the vocabularies can be accessed on [api.openaire.eu/vocabularies](https://api.openaire.eu/vocabularies/).

View File

@ -18,8 +18,8 @@ As of November 2022, three procedures are in place to relate a research product
</p>
When only some results collected from a datasource are relevant for the RC/RI, it is possible to specify a set of selection constraints (SC) that have to be verified before linking the result to the
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F={title, author, contributor, description, orcid}</strong>,
while the set of conditions is <strong>V={contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase}</strong>, and the value is free text.
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F=\{title, author, contributor, description, orcid\}</strong>,
while the set of conditions is <strong>V=\{contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase\}</strong>, and the value is free text.
A possible selection criterion is: “All the products whose contributor contains DARIAH”
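The disjunction-of-conjunctions form of a selection constraint can be sketched as below. This is an illustrative reading of the description, not the production implementation: `CONDITIONS` and `satisfies` are hypothetical names, each constraint is modelled as a `(field, condition, value)` tuple, and every field is assumed to hold a single string (real records may carry multiple authors or contributors).

```python
# Each condition name maps to a predicate over (field_value, constraint_value).
CONDITIONS = {
    "contains": lambda f, v: v in f,
    "equals": lambda f, v: f == v,
    "not_contains": lambda f, v: v not in f,
    "not_equals": lambda f, v: f != v,
    "contains_ignorecase": lambda f, v: v.lower() in f.lower(),
    "equals_ignorecase": lambda f, v: f.lower() == v.lower(),
    "not_contains_ignorecase": lambda f, v: v.lower() not in f.lower(),
    "not_equal_ignorecase": lambda f, v: f.lower() != v.lower(),
}


def satisfies(result: dict, selection_constraint: list) -> bool:
    """SC = S1 or ... or Sn; each Si is a conjunction of (field, condition, value)."""
    return any(
        all(CONDITIONS[cond](result.get(field, ""), value) for field, cond, value in si)
        for si in selection_constraint
    )


# "All the products whose contributor contains DARIAH"
sc = [[("contributor", "contains", "DARIAH")]]
linked = satisfies({"title": "A study", "contributor": "DARIAH-EU"}, sc)
```

An empty constraint list matches nothing in this sketch, since `any` over an empty sequence is false.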
<p align="center">

View File

@ -1,6 +1,6 @@
# Impact indicators
This page summarises all calculated impact indicators, which are included in the [impactMeasures](/data-model/entities/other#impactmeasures) property which is part of the [indicators](../../data-model/entities/result#indicators) property of the result.
This page summarises all calculated impact indicators, which are included in the [impactMeasures](/data-model/entities/other#impactmeasures) property which is part of the [indicators](/data-model/entities/result#indicators) property of the result.
It should be noted that the impact indicators are being calculated on the level of the research output.
Below we explain their main intuition, the way they are calculated, and their most important limitations, in an attempt to help avoid common pitfalls and misuses.
@ -125,7 +125,7 @@ Also, since some indicators require the publication year for their calculation,
***Environment:*** PySpark
***References:***
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^{th} international conference on data mining workshops (pp. 373-380). IEEE.
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11\{th\} international conference on data mining workshops (pp. 373-380). IEEE.
***Authority:*** ATHENA RC &bull; ***License:*** GPL-2.0 &bull; ***Code:*** [BIP! Ranker](https://github.com/athenarc/Bip-Ranker)

View File

@ -8,7 +8,7 @@ sidebar_position: 1
# Extended Result
It is a subclass of [Result](../../data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.
It is a subclass of [Result](/data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.

View File

@ -41,14 +41,14 @@ _Start Date: 2022-12-19 &bull; Release Date: 2022-12-28 &bull; Dump release: **y
#### Added
- [Impact & Usage indicators](./data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](./downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](./data-model/relationships/relationship-types)
- [Impact & Usage indicators](/data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](/downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](/data-model/relationships/relationship-types)
#### Changed
- FOS and SDGs were removed from the [result subjects](./data-model/entities/result#subjects)
- Measures were removed from the [result instance](./data-model/entities/result#instance)
- FOS and SDGs were removed from the [result subjects](/data-model/entities/result#subjects)
- Measures were removed from the [result instance](/data-model/entities/result#instance)
- Updated DOIBoost to include publications from Crossref and the works from ORCID with a DOI until November 2022
- Added ORCID works without a DOI from November 2022

View File

@ -27,7 +27,7 @@ A vocabulary is a data structure that defines a list of terms, and for each term
[...]
```
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](../data-model/entities/result#instance).
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](/data-model/entities/result#instance).
The content of the vocabularies can be accessed on [api.openaire.eu/vocabularies](https://api.openaire.eu/vocabularies/).

View File

@ -18,8 +18,8 @@ As of November 2022, three procedures are in place to relate a research product
</p>
When only some results collected from a datasource are relevant for the RC/RI, it is possible to specify a set of selection constraints (SC) that have to be verified before linking the result to the
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F={title, author, contributor, description, orcid}</strong>,
while the set of conditions is <strong>V={contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase}</strong>, and the value is free text.
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F=\{title, author, contributor, description, orcid\}</strong>,
while the set of conditions is <strong>V=\{contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase\}</strong>, and the value is free text.
A possible selection criterion is: “All the products whose contributor contains DARIAH”
<p align="center">

View File

@ -1,6 +1,6 @@
# Impact indicators
This page summarises all calculated impact indicators, which are included in the [impactMeasures](/data-model/entities/other#impactmeasures) property which is part of the [indicators](../../data-model/entities/result#indicators) property of the result.
This page summarises all calculated impact indicators, which are included in the [impactMeasures](/data-model/entities/other#impactmeasures) property which is part of the [indicators](/data-model/entities/result#indicators) property of the result.
It should be noted that the impact indicators are being calculated on the level of the research output.
Below we explain their main intuition, the way they are calculated, and their most important limitations, in an attempt to help avoid common pitfalls and misuses.
@ -125,7 +125,7 @@ Also, since some indicators require the publication year for their calculation,
***Environment:*** PySpark
***References:***
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^{th} international conference on data mining workshops (pp. 373-380). IEEE.
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^\{th\} international conference on data mining workshops (pp. 373-380). IEEE.
***Authority:*** ATHENA RC &bull; ***License:*** GPL-2.0 &bull; ***Code:*** [BIP! Ranker](https://github.com/athenarc/Bip-Ranker)

View File

@ -8,7 +8,7 @@ sidebar_position: 1
# Extended Result
It is a subclass of [Result](../../data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.
It is a subclass of [Result](/data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.

@ -27,8 +27,8 @@ _Start Date: 2023-02-13 &bull; Release Date: 2023-03-01 &bull; Dump release: **n
- Revised SDG classification: improved coverage (+600K classified DOIs)
- General increase of the funded scientific outputs, thanks to the full text mining scanning new OpenAccess publications
- Integrated contents from
- [EMBL-EBIs Protein Data Bank in Europe](./graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](./graph-production-workflow//aggregation/non-compatible-sources/uniprot)
- [EMBL-EBIs Protein Data Bank in Europe](/graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](/graph-production-workflow//aggregation/non-compatible-sources/uniprot)
#### Changed
@ -60,14 +60,14 @@ _Start Date: 2022-12-19 &bull; Release Date: 2022-12-28 &bull; Dump release: **y
#### Added
- [Impact & Usage indicators](./data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](./downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](./data-model/relationships/relationship-types)
- [Impact & Usage indicators](/data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](/downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](/data-model/relationships/relationship-types)
#### Changed
- FOS and SDGs were removed from the [result subjects](./data-model/entities/result#subjects)
- Measures were removed from the [result instance](./data-model/entities/result#instance)
- FOS and SDGs were removed from the [result subjects](/data-model/entities/result#subjects)
- Measures were removed from the [result instance](/data-model/entities/result#instance)
- Updated DOIBoost to include publications from Crossref and the works from ORCID with a DOI until November 2022
- Added ORCID works without a DOI from November 2022

@ -8,7 +8,7 @@ sidebar_position: 1
# Extended Result
It is a subclass of [Result](../../data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.
It is a subclass of [Result](/data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.

@ -27,7 +27,7 @@ A vocabulary is a data structure that defines a list of terms, and for each term
[...]
```
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](../data-model/entities/result#instance).
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](/data-model/entities/result#instance).
The content of the vocabularies can be accessed on [api.openaire.eu/vocabularies](https://api.openaire.eu/vocabularies/).

@ -18,8 +18,8 @@ As of November 2022, three procedures are in place to relate a research product
</p>
When only some results collected from a datasource are relevant for the RC/RI, it is possible to specify a set of selection constraints (SC) that have to be verified before linking the result to the
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F={title, author, contributor, description, orcid}</strong>,
while the set of condition can be among <strong>V={contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase}</strong>, and the value is free text.
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F=\{title, author, contributor, description, orcid\}</strong>,
while the set of condition can be among <strong>V=\{contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase\}</strong>, and the value is free text.
A possible selection criterion is: “All the products whose contributor contains DARIAH”
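The selection constraint logic described above can be sketched as follows. This is only an illustrative sketch, not the code used in the OpenAIRE workflow: the names `Condition`, `condition_holds` and `satisfies` are hypothetical, a result is assumed to be a plain dictionary mapping each field to a list of strings, and for multi-valued fields it is assumed that a positive verb holds when any value matches while a negated verb holds when no value matches. The verb names are taken from the set V as written in the text.

```python
from dataclasses import dataclass

# Fields (F) and verbs (V) as listed in the text.
FIELDS = {"title", "author", "contributor", "description", "orcid"}

# verb -> (operation, ignore_case, negate)
VERBS = {
    "contains": ("contains", False, False),
    "equals": ("equals", False, False),
    "not_contains": ("contains", False, True),
    "not_equals": ("equals", False, True),
    "contains_ignorecase": ("contains", True, False),
    "equals_ignorecase": ("equals", True, False),
    "not_contains_ignorecase": ("contains", True, True),
    "not_equal_ignorecase": ("equals", True, True),
}


@dataclass(frozen=True)
class Condition:
    """A single condition s_ij: a field, a verb and a free-text value."""
    field: str
    verb: str
    value: str

    def __post_init__(self):
        if self.field not in FIELDS:
            raise ValueError(f"unknown field: {self.field}")
        if self.verb not in VERBS:
            raise ValueError(f"unknown verb: {self.verb}")


def condition_holds(cond, result):
    """Check one condition against a result (dict: field -> list of strings)."""
    operation, ignore_case, negate = VERBS[cond.verb]
    needle = cond.value.lower() if ignore_case else cond.value
    hit = False
    for raw in result.get(cond.field, []):
        candidate = raw.lower() if ignore_case else raw
        if operation == "contains":
            matched = needle in candidate
        else:
            matched = needle == candidate
        if matched:
            hit = True
            break
    # Assumption: a negated verb holds only when no value matches.
    return hit != negate


def satisfies(selection_constraint, result):
    """SC = S1 or ... or Sn, where each Si is a list of and-ed conditions."""
    return any(
        all(condition_holds(cond, result) for cond in conjunction)
        for conjunction in selection_constraint
    )


# The example from the text: contributor contains DARIAH.
sc = [[Condition("contributor", "contains", "DARIAH")]]
record = {"title": ["A study"], "contributor": ["DARIAH-EU"]}
linked = satisfies(sc, record)
```

Representing SC as a list of lists mirrors the "or of ands" form given in the text, so adding an alternative Si is just appending another inner list.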
<p align="center">

@ -1,6 +1,6 @@
# Impact indicators
This page summarises all calculated impact indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), which are included in the It is a subclass of [Result](../../data-model/entities/result) extended with information regarding p[impactMeasures](/data-model/entities/other#impactmeasures) property (found under the [indicators](../../data-model/entities/result#indicators)rojects (and funders) property of the result).
This page summarises all calculated impact indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), which are included in the [impactMeasures](/data-model/entities/other#impactmeasures) property (found under the [indicators](/data-model/entities/result#indicators) property of the result).
It should be noted that the impact indicators are being calculated on the level of the research output.
Below we explain their main intuition, the way they are calculated, and their most important limitations, in an attempt to help avoid common pitfalls and misuses.
@ -126,7 +126,7 @@ Also, since some indicators require the publication year for their calculation,
***Environment:*** PySpark
***References:***
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^{th} international conference on data mining workshops (pp. 373-380). IEEE.
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^\{th\} international conference on data mining workshops (pp. 373-380). IEEE.
***Authority:*** ATHENA RC &bull; ***License:*** GPL-2.0 &bull; ***Code:*** [BIP! Ranker](https://github.com/athenarc/Bip-Ranker)

@ -38,8 +38,8 @@ _Start Date: 2023-02-13 &bull; Release Date: 2023-03-01 &bull; Dump release: **n
- Revised SDG classification: improved coverage (+600K classified DOIs)
- General increase of the funded scientific outputs, thanks to the full text mining scanning new OpenAccess publications
- Integrated contents from
- [EMBL-EBIs Protein Data Bank in Europe](./graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](./graph-production-workflow//aggregation/non-compatible-sources/uniprot)
- [EMBL-EBIs Protein Data Bank in Europe](/graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](/graph-production-workflow//aggregation/non-compatible-sources/uniprot)
#### Changed
@ -71,14 +71,14 @@ _Start Date: 2022-12-19 &bull; Release Date: 2022-12-28 &bull; Dump release: **y
#### Added
- [Impact & Usage indicators](./data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](./downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](./data-model/relationships/relationship-types)
- [Impact & Usage indicators](/data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](/downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](/data-model/relationships/relationship-types)
#### Changed
- FOS and SDGs were removed from the [result subjects](./data-model/entities/result#subjects)
- Measures were removed from the [result instance](./data-model/entities/result#instance)
- FOS and SDGs were removed from the [result subjects](/data-model/entities/result#subjects)
- Measures were removed from the [result instance](/data-model/entities/result#instance)
- Updated DOIBoost to include publications from Crossref and the works from ORCID with a DOI until November 2022
- Added ORCID works without a DOI from November 2022

@ -11,12 +11,12 @@ The latest version of the JSON schema can be found on the [Downloads](../downloa
The figure above presents the graph's data model.
Its main entities are described in brief below:
* [Results](./entities/result) represent the outcomes (or products) of research activities.
* [Data Sources](./entities/data-source) are the sources from which the metadata of graph objects are collected.
* [Organizations](./entities/organization) correspond to companies or research institutions involved in projects
* [Results](/data-model/entities/result) represent the outcomes (or products) of research activities.
* [Data Sources](/data-model/entities/data-source) are the sources from which the metadata of graph objects are collected.
* [Organizations](/data-model/entities/organization) correspond to companies or research institutions involved in projects,
responsible for operating data sources, or serving as the affiliations of Product creators.
* [Projects](./entities/project) are research project grants funded by a Funding Stream of a Funder.
* [Communities](./entities/community) are groups of people with a common research intent (e.g. research infrastructures, university alliances).
* [Projects](/data-model/entities/project) are research project grants funded by a Funding Stream of a Funder.
* [Communities](/data-model/entities/community) are groups of people with a common research intent (e.g. research infrastructures, university alliances).
:::note Further reading

@ -6,32 +6,32 @@ Note: the labels used to specify the semantic of the relationships are (for the
| # | Source entity type | Target entity type | Relation name / inverse | Provenance |
|:--:|:--------------------------------------:|:--------------------------------------:|:----------------------------------------------------------:|:-----------------------------------------------:|
| 1 | [Project](../../data-model/entities/project) | [Result](../../data-model/entities/result) | produces / isProducedBy | Harvested, Inferred by OpenAIRE, Linked by user |
| 2 | [Project](../../data-model/entities/project) | [Organization](../../data-model/entities/organization) | hasParticipant / isParticipant | Harvested |
| 3 | [Project](../../data-model/entities/project) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 4 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsAmongTopNSimilarDocuments / HasAmongTopNSimilarDocuments | Inferred by OpenAIRE |
| 5 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsSupplementTo / IsSupplementedBy | Harvested |
| 6 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 7 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsPartOf / HasPart | Harvested |
| 8 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsDocumentedBy / Documents | Harvested |
| 9 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsObsoletedBy / Obsoletes | Harvested |
| 10 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsSourceOf / IsDerivedFrom | Harvested |
| 11 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsCompiledBy / Compiles | Harvested |
| 12 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsRequiredBy / Requires | Harvested |
| 13 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsCitedBy / Cites | Harvested, Inferred by OpenAIRE |
| 14 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsReferencedBy / References | Harvested |
| 15 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsReviewedBy / Reviews | Harvested |
| 16 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsOriginalFormOf / IsVariantFormOf | Harvested |
| 17 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsVersionOf / HasVersion | Harvested |
| 18 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsIdenticalTo / IsIdenticalTo | Harvested |
| 19 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsPreviousVersionOf / IsNewVersionOf | Harvested |
| 20 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsContinuedBy / Continues | Harvested |
| 21 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsDescribedBy / Describes | Harvested |
| 22 | [Result](../../data-model/entities/result) | [Organization](../../data-model/entities/organization) | hasAuthorInstitution / isAuthorInstitutionOf | Harvested, Inferred by OpenAIRE |
| 23 | [Result](../../data-model/entities/result) | [Data source](../../data-model/entities/data-source) | isHostedBy / hosts | Harvested, Inferred by OpenAIRE |
| 24 | [Result](../../data-model/entities/result) | [Data source](../../data-model/entities/data-source) | isProvidedBy / provides | Harvested |
| 25 | [Result](../../data-model/entities/result) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 26 | [Organization](../../data-model/entities/organization) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 27 | [Organization](../../data-model/entities/organization) | [Organization](../../data-model/entities/organization) | IsChildOf / IsParentOf | Linked by user |
| 28 | [Data source](../../data-model/entities/data-source) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 29 | [Data source](../../data-model/entities/data-source) | [Organization](../../data-model/entities/organization) | isProvidedBy / provides | Harvested |
| 1 | [Project](/data-model/entities/project) | [Result](/data-model/entities/result) | produces / isProducedBy | Harvested, Inferred by OpenAIRE, Linked by user |
| 2 | [Project](/data-model/entities/project) | [Organization](/data-model/entities/organization) | hasParticipant / isParticipant | Harvested |
| 3 | [Project](/data-model/entities/project) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 4 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsAmongTopNSimilarDocuments / HasAmongTopNSimilarDocuments | Inferred by OpenAIRE |
| 5 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsSupplementTo / IsSupplementedBy | Harvested |
| 6 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 7 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsPartOf / HasPart | Harvested |
| 8 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsDocumentedBy / Documents | Harvested |
| 9 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsObsoletedBy / Obsoletes | Harvested |
| 10 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsSourceOf / IsDerivedFrom | Harvested |
| 11 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsCompiledBy / Compiles | Harvested |
| 12 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsRequiredBy / Requires | Harvested |
| 13 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsCitedBy / Cites | Harvested, Inferred by OpenAIRE |
| 14 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsReferencedBy / References | Harvested |
| 15 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsReviewedBy / Reviews | Harvested |
| 16 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsOriginalFormOf / IsVariantFormOf | Harvested |
| 17 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsVersionOf / HasVersion | Harvested |
| 18 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsIdenticalTo / IsIdenticalTo | Harvested |
| 19 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsPreviousVersionOf / IsNewVersionOf | Harvested |
| 20 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsContinuedBy / Continues | Harvested |
| 21 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsDescribedBy / Describes | Harvested |
| 22 | [Result](/data-model/entities/result) | [Organization](/data-model/entities/organization) | hasAuthorInstitution / isAuthorInstitutionOf | Harvested, Inferred by OpenAIRE |
| 23 | [Result](/data-model/entities/result) | [Data source](/data-model/entities/data-source) | isHostedBy / hosts | Harvested, Inferred by OpenAIRE |
| 24 | [Result](/data-model/entities/result) | [Data source](/data-model/entities/data-source) | isProvidedBy / provides | Harvested |
| 25 | [Result](/data-model/entities/result) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 26 | [Organization](/data-model/entities/organization) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 27 | [Organization](/data-model/entities/organization) | [Organization](/data-model/entities/organization) | IsChildOf / IsParentOf | Linked by user |
| 28 | [Data source](/data-model/entities/data-source) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 29 | [Data source](/data-model/entities/data-source) | [Organization](/data-model/entities/organization) | isProvidedBy / provides | Harvested |

@ -8,7 +8,7 @@ sidebar_position: 1
# Extended Result
It is a subclass of [Result](../../data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.
It is a subclass of [Result](/data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.

@ -27,7 +27,7 @@ A vocabulary is a data structure that defines a list of terms, and for each term
[...]
```
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](../data-model/entities/result#instance).
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](/data-model/entities/result#instance).
The content of the vocabularies can be accessed on [api.openaire.eu/vocabularies](https://api.openaire.eu/vocabularies/).

@ -18,8 +18,8 @@ As of November 2022, three procedures are in place to relate a research product
</p>
When only some results collected from a datasource are relevant for the RC/RI, it is possible to specify a set of selection constraints (SC) that have to be verified before linking the result to the
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F={title, author, contributor, description, orcid}</strong>,
while the set of condition can be among <strong>V={contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase}</strong>, and the value is free text.
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F=\{title, author, contributor, description, orcid\}</strong>,
while the set of condition can be among <strong>V=\{contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase\}</strong>, and the value is free text.
A possible selection criterion is: “All the products whose contributor contains DARIAH”
<p align="center">

@ -1,6 +1,6 @@
# Impact indicators
This page summarises all calculated impact indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), which are included in the It is a subclass of [Result](../../data-model/entities/result) extended with information regarding p[impactMeasures](/data-model/entities/other#impactmeasures) property (found under the [indicators](../../data-model/entities/result#indicators)rojects (and funders) property of the result).
This page summarises all calculated impact indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), which are included in the [impactMeasures](/data-model/entities/other#impactmeasures) property (found under the [indicators](/data-model/entities/result#indicators) property of the result).
It should be noted that the impact indicators are being calculated on the level of the research output.
Below we explain their main intuition, the way they are calculated, and their most important limitations, in an attempt to help avoid common pitfalls and misuses.
@ -126,7 +126,7 @@ Also, since some indicators require the publication year for their calculation,
***Environment:*** PySpark
***References:***
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^{th} international conference on data mining workshops (pp. 373-380). IEEE.
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^\{th\} international conference on data mining workshops (pp. 373-380). IEEE.
***Authority:*** ATHENA RC &bull; ***License:*** GPL-2.0 &bull; ***Code:*** [BIP! Ranker](https://github.com/athenarc/Bip-Ranker)

@ -53,8 +53,8 @@ _Start Date: 2023-02-13 &bull; Release Date: 2023-03-01 &bull; Dump release: **n
- Revised SDG classification: improved coverage (+600K classified DOIs)
- General increase of the funded scientific outputs, thanks to the full text mining scanning new OpenAccess publications
- Integrated contents from
- [EMBL-EBIs Protein Data Bank in Europe](./graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](./graph-production-workflow//aggregation/non-compatible-sources/uniprot)
- [EMBL-EBIs Protein Data Bank in Europe](/graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](/graph-production-workflow//aggregation/non-compatible-sources/uniprot)
#### Changed
@ -86,14 +86,14 @@ _Start Date: 2022-12-19 &bull; Release Date: 2022-12-28 &bull; Dump release: **y
#### Added
- [Impact & Usage indicators](./data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](./downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](./data-model/relationships/relationship-types)
- [Impact & Usage indicators](/data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](/downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](/data-model/relationships/relationship-types)
#### Changed
- FOS and SDGs were removed from the [result subjects](./data-model/entities/result#subjects)
- Measures were removed from the [result instance](./data-model/entities/result#instance)
- FOS and SDGs were removed from the [result subjects](/data-model/entities/result#subjects)
- Measures were removed from the [result instance](/data-model/entities/result#instance)
- Updated DOIBoost to include publications from Crossref and the works from ORCID with a DOI until November 2022
- Added ORCID works without a DOI from November 2022

@ -11,12 +11,12 @@ The latest version of the JSON schema can be found on the [Downloads](../downloa
The figure above presents the graph's data model.
Its main entities are described in brief below:
* [Results](./entities/result) represent the outcomes (or products) of research activities.
* [Data Sources](./entities/data-source) are the sources from which the metadata of graph objects are collected.
* [Organizations](./entities/organization) correspond to companies or research institutions involved in projects
* [Results](/data-model/entities/result) represent the outcomes (or products) of research activities.
* [Data Sources](/data-model/entities/data-source) are the sources from which the metadata of graph objects are collected.
* [Organizations](/data-model/entities/organization) correspond to companies or research institutions involved in projects,
responsible for operating data sources, or serving as the affiliations of Product creators.
* [Projects](./entities/project) are research project grants funded by a Funding Stream of a Funder.
* [Communities](./entities/community) are groups of people with a common research intent (e.g. research infrastructures, university alliances).
* [Projects](/data-model/entities/project) are research project grants funded by a Funding Stream of a Funder.
* [Communities](/data-model/entities/community) are groups of people with a common research intent (e.g. research infrastructures, university alliances).
:::note Further reading

@ -6,32 +6,32 @@ Note: the labels used to specify the semantic of the relationships are (for the
| # | Source entity type | Target entity type | Relation name / inverse | Provenance |
|:--:|:--------------------------------------:|:--------------------------------------:|:----------------------------------------------------------:|:-----------------------------------------------:|
| 1 | [Project](../../data-model/entities/project) | [Result](../../data-model/entities/result) | produces / isProducedBy | Harvested, Inferred by OpenAIRE, Linked by user |
| 2 | [Project](../../data-model/entities/project) | [Organization](../../data-model/entities/organization) | hasParticipant / isParticipant | Harvested |
| 3 | [Project](../../data-model/entities/project) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 4 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsAmongTopNSimilarDocuments / HasAmongTopNSimilarDocuments | Inferred by OpenAIRE |
| 5 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsSupplementTo / IsSupplementedBy | Harvested |
| 6 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 7 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsPartOf / HasPart | Harvested |
| 8 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsDocumentedBy / Documents | Harvested |
| 9 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsObsoletedBy / Obsoletes | Harvested |
| 10 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsSourceOf / IsDerivedFrom | Harvested |
| 11 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsCompiledBy / Compiles | Harvested |
| 12 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsRequiredBy / Requires | Harvested |
| 13 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsCitedBy / Cites | Harvested, Inferred by OpenAIRE |
| 14 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsReferencedBy / References | Harvested |
| 15 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsReviewedBy / Reviews | Harvested |
| 16 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsOriginalFormOf / IsVariantFormOf | Harvested |
| 17 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsVersionOf / HasVersion | Harvested |
| 18 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsIdenticalTo / IsIdenticalTo | Harvested |
| 19 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsPreviousVersionOf / IsNewVersionOf | Harvested |
| 20 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsContinuedBy / Continues | Harvested |
| 21 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsDescribedBy / Describes | Harvested |
| 22 | [Result](../../data-model/entities/result) | [Organization](../../data-model/entities/organization) | hasAuthorInstitution / isAuthorInstitutionOf | Harvested, Inferred by OpenAIRE |
| 23 | [Result](../../data-model/entities/result) | [Data source](../../data-model/entities/data-source) | isHostedBy / hosts | Harvested, Inferred by OpenAIRE |
| 24 | [Result](../../data-model/entities/result) | [Data source](../../data-model/entities/data-source) | isProvidedBy / provides | Harvested |
| 25 | [Result](../../data-model/entities/result) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 26 | [Organization](../../data-model/entities/organization) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 27 | [Organization](../../data-model/entities/organization) | [Organization](../../data-model/entities/organization) | IsChildOf / IsParentOf | Linked by user |
| 28 | [Data source](../../data-model/entities/data-source) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 29 | [Data source](../../data-model/entities/data-source) | [Organization](../../data-model/entities/organization) | isProvidedBy / provides | Harvested |
| 1 | [Project](/data-model/entities/project) | [Result](/data-model/entities/result) | produces / isProducedBy | Harvested, Inferred by OpenAIRE, Linked by user |
| 2 | [Project](/data-model/entities/project) | [Organization](/data-model/entities/organization) | hasParticipant / isParticipant | Harvested |
| 3 | [Project](/data-model/entities/project) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 4 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsAmongTopNSimilarDocuments / HasAmongTopNSimilarDocuments | Inferred by OpenAIRE |
| 5 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsSupplementTo / IsSupplementedBy | Harvested |
| 6 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 7 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsPartOf / HasPart | Harvested |
| 8 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsDocumentedBy / Documents | Harvested |
| 9 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsObsoletedBy / Obsoletes | Harvested |
| 10 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsSourceOf / IsDerivedFrom | Harvested |
| 11 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsCompiledBy / Compiles | Harvested |
| 12 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsRequiredBy / Requires | Harvested |
| 13 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsCitedBy / Cites | Harvested, Inferred by OpenAIRE |
| 14 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsReferencedBy / References | Harvested |
| 15 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsReviewedBy / Reviews | Harvested |
| 16 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsOriginalFormOf / IsVariantFormOf | Harvested |
| 17 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsVersionOf / HasVersion | Harvested |
| 18 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsIdenticalTo / IsIdenticalTo | Harvested |
| 19 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsPreviousVersionOf / IsNewVersionOf | Harvested |
| 20 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsContinuedBy / Continues | Harvested |
| 21 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsDescribedBy / Describes | Harvested |
| 22 | [Result](/data-model/entities/result) | [Organization](/data-model/entities/organization) | hasAuthorInstitution / isAuthorInstitutionOf | Harvested, Inferred by OpenAIRE |
| 23 | [Result](/data-model/entities/result) | [Data source](/data-model/entities/data-source) | isHostedBy / hosts | Harvested, Inferred by OpenAIRE |
| 24 | [Result](/data-model/entities/result) | [Data source](/data-model/entities/data-source) | isProvidedBy / provides | Harvested |
| 25 | [Result](/data-model/entities/result) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 26 | [Organization](/data-model/entities/organization) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 27 | [Organization](/data-model/entities/organization) | [Organization](/data-model/entities/organization) | IsChildOf / IsParentOf | Linked by user |
| 28 | [Data source](/data-model/entities/data-source) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 29 | [Data source](/data-model/entities/data-source) | [Organization](/data-model/entities/organization) | isProvidedBy / provides | Harvested |

View File

@ -8,7 +8,7 @@ sidebar_position: 1
# Extended Result
It is a subclass of [Result](../../data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.
It is a subclass of [Result](/data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.

View File

@ -27,7 +27,7 @@ A vocabulary is a data structure that defines a list of terms, and for each term
[...]
```
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](../data-model/entities/result#instance).
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](/data-model/entities/result#instance).
The content of the vocabularies can be accessed on [api.openaire.eu/vocabularies](https://api.openaire.eu/vocabularies/).

View File

@ -18,8 +18,8 @@ As of November 2022, three procedures are in place to relate a research product
</p>
When only some results collected from a datasource are relevant for the RC/RI, it is possible to specify a set of selection constraints (SC) that have to be verified before linking the result to the
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F={title, author, contributor, description, orcid}</strong>,
while the set of condition can be among <strong>V={contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase}</strong>, and the value is free text.
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F=\{title, author, contributor, description, orcid\}</strong>,
while the set of condition can be among <strong>V=\{contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase\}</strong>, and the value is free text.
A possible selection criterion is: “All the products whose contributor contains DARIAH”
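
The disjunctive form of the selection constraint described above (an OR over groups, each group an AND over field conditions) can be sketched in Python. This is only an illustration, not OpenAIRE's implementation: the function names, the dictionary layout of a result, and the treatment of multi-valued fields (a positive verb holds if any value matches, a negated verb only if every value satisfies it) are assumptions. The field names and verbs are the ones listed in the text.

```python
from typing import Callable, Dict, List, Tuple

# Verbs listed in the text; the exact comparison semantics are an assumption.
VERBS: Dict[str, Callable[[str, str], bool]] = {
    "contains": lambda v, t: t in v,
    "equals": lambda v, t: v == t,
    "not_contains": lambda v, t: t not in v,
    "not_equals": lambda v, t: v != t,
    "contains_ignorecase": lambda v, t: t.lower() in v.lower(),
    "equals_ignorecase": lambda v, t: v.lower() == t.lower(),
    "not_contains_ignorecase": lambda v, t: t.lower() not in v.lower(),
    "not_equal_ignorecase": lambda v, t: v.lower() != t.lower(),
}

# One condition s_ij: (field, verb, free-text value).
Condition = Tuple[str, str, str]


def condition_holds(result: Dict[str, List[str]], cond: Condition) -> bool:
    """Evaluate a single condition against one (hypothetical) result record."""
    field, verb, value = cond
    values = result.get(field, [])
    check = VERBS[verb]
    if verb.startswith("not_"):
        # Assumption: a negated verb must hold for every value of the field.
        return all(check(v, value) for v in values)
    # Assumption: a positive verb holds if at least one value matches.
    return any(check(v, value) for v in values)


def matches(result: Dict[str, List[str]], constraints: List[List[Condition]]) -> bool:
    """SC = S1 or S2 or ... or Sn, where each Si is an AND of its conditions."""
    return any(
        all(condition_holds(result, cond) for cond in group)
        for group in constraints
    )


# The example from the text: all products whose contributor contains DARIAH.
sc = [[("contributor", "contains", "DARIAH")]]
record = {"title": ["Digital editions"], "contributor": ["DARIAH-EU", "Jane Doe"]}
print(matches(record, sc))
```

Keeping the constraint as a list of lists mirrors the "OR of ANDs" structure directly, so adding an alternative group never changes how the existing groups are evaluated.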
<p align="center">

View File

@ -1,6 +1,6 @@
# Impact indicators
This page summarises all calculated impact indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), which are included in the It is a subclass of [Result](../../data-model/entities/result) extended with information regarding p[impactMeasures](/data-model/entities/other#impactmeasures) property (found under the [indicators](../../data-model/entities/result#indicators)rojects (and funders) property of the result).
This page summarises all calculated impact indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), which are included in the [impactMeasures](/data-model/entities/other#impactmeasures) property (found under the [indicators](/data-model/entities/result#indicators) property of the result).
It should be noted that the impact indicators are being calculated on the level of the research output.
Below we explain their main intuition, the way they are calculated, and their most important limitations, in an attempt to help avoid common pitfalls and misuses.
@ -126,7 +126,7 @@ Also, since some indicators require the publication year for their calculation,
***Environment:*** PySpark
***References:***
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^{th} international conference on data mining workshops (pp. 373-380). IEEE.
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^\{th\} international conference on data mining workshops (pp. 373-380). IEEE.
***Authority:*** ATHENA RC &bull; ***License:*** GPL-2.0 &bull; ***Code:*** [BIP! Ranker](https://github.com/athenarc/Bip-Ranker)

View File

@ -72,8 +72,8 @@ _Start Date: 2023-02-13 &bull; Release Date: 2023-03-01 &bull; Dump release: **n
- Revised SDG classification: improved coverage (+600K classified DOIs)
- General increase of the funded scientific outputs, thanks to the full text mining scanning new OpenAccess publications
- Integrated contents from
- [EMBL-EBIs Protein Data Bank in Europe](./graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](./graph-production-workflow//aggregation/non-compatible-sources/uniprot)
- [EMBL-EBIs Protein Data Bank in Europe](/graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](/graph-production-workflow//aggregation/non-compatible-sources/uniprot)
#### Changed
@ -105,14 +105,14 @@ _Start Date: 2022-12-19 &bull; Release Date: 2022-12-28 &bull; Dump release: **y
#### Added
- [Impact & Usage indicators](./data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](./downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](./data-model/relationships/relationship-types)
- [Impact & Usage indicators](/data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](/downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](/data-model/relationships/relationship-types)
#### Changed
- FOS and SDGs were removed from the [result subjects](./data-model/entities/result#subjects)
- Measures were removed from the [result instance](./data-model/entities/result#instance)
- FOS and SDGs were removed from the [result subjects](/data-model/entities/result#subjects)
- Measures were removed from the [result instance](/data-model/entities/result#instance)
- Updated DOIBoost to include publications from Crossref and the works from ORCID with a DOI until November 2022
- Added ORCID works without a DOI from November 2022

View File

@ -11,12 +11,12 @@ The latest version of the JSON schema can be found on the [Downloads](../downloa
The figure above presents the graph's data model.
Its main entities are described in brief below:
* [Results](./entities/result) represent the outcomes (or products) of research activities.
* [Data Sources](./entities/data-source) are the sources from which the metadata of graph objects are collected.
* [Organizations](./entities/organization) correspond to companies or research institutions involved in projects
* [Results](/data-model/entities/result) represent the outcomes (or products) of research activities.
* [Data Sources](/data-model/entities/data-source) are the sources from which the metadata of graph objects are collected.
* [Organizations](/data-model/entities/organization) correspond to companies or research institutions involved in projects,
responsible for operating data sources or constituting the affiliations of Product creators.
* [Projects](./entities/project) are research project grants funded by a Funding Stream of a Funder.
* [Communities](./entities/community) are groups of people with a common research intent (e.g. research infrastructures, university alliances).
* [Projects](/data-model/entities/project) are research project grants funded by a Funding Stream of a Funder.
* [Communities](/data-model/entities/community) are groups of people with a common research intent (e.g. research infrastructures, university alliances).
:::note Further reading

View File

@ -6,32 +6,32 @@ Note: the labels used to specify the semantic of the relationships are (for the
| # | Source entity type | Target entity type | Relation name / inverse | Provenance |
|:--:|:--------------------------------------:|:--------------------------------------:|:----------------------------------------------------------:|:-----------------------------------------------:|
| 1 | [Project](../../data-model/entities/project) | [Result](../../data-model/entities/result) | produces / isProducedBy | Harvested, Inferred by OpenAIRE, Linked by user |
| 2 | [Project](../../data-model/entities/project) | [Organization](../../data-model/entities/organization) | hasParticipant / isParticipant | Harvested |
| 3 | [Project](../../data-model/entities/project) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 4 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsAmongTopNSimilarDocuments / HasAmongTopNSimilarDocuments | Inferred by OpenAIRE |
| 5 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsSupplementTo / IsSupplementedBy | Harvested |
| 6 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 7 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsPartOf / HasPart | Harvested |
| 8 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsDocumentedBy / Documents | Harvested |
| 9 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsObsoletedBy / Obsoletes | Harvested |
| 10 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsSourceOf / IsDerivedFrom | Harvested |
| 11 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsCompiledBy / Compiles | Harvested |
| 12 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsRequiredBy / Requires | Harvested |
| 13 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsCitedBy / Cites | Harvested, Inferred by OpenAIRE |
| 14 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsReferencedBy / References | Harvested |
| 15 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsReviewedBy / Reviews | Harvested |
| 16 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsOriginalFormOf / IsVariantFormOf | Harvested |
| 17 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsVersionOf / HasVersion | Harvested |
| 18 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsIdenticalTo / IsIdenticalTo | Harvested |
| 19 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsPreviousVersionOf / IsNewVersionOf | Harvested |
| 20 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsContinuedBy / Continues | Harvested |
| 21 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsDescribedBy / Describes | Harvested |
| 22 | [Result](../../data-model/entities/result) | [Organization](../../data-model/entities/organization) | hasAuthorInstitution / isAuthorInstitutionOf | Harvested, Inferred by OpenAIRE |
| 23 | [Result](../../data-model/entities/result) | [Data source](../../data-model/entities/data-source) | isHostedBy / hosts | Harvested, Inferred by OpenAIRE |
| 24 | [Result](../../data-model/entities/result) | [Data source](../../data-model/entities/data-source) | isProvidedBy / provides | Harvested |
| 25 | [Result](../../data-model/entities/result) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 26 | [Organization](../../data-model/entities/organization) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 27 | [Organization](../../data-model/entities/organization) | [Organization](../../data-model/entities/organization) | IsChildOf / IsParentOf | Linked by user |
| 28 | [Data source](../../data-model/entities/data-source) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 29 | [Data source](../../data-model/entities/data-source) | [Organization](../../data-model/entities/organization) | isProvidedBy / provides | Harvested |
| 1 | [Project](/data-model/entities/project) | [Result](/data-model/entities/result) | produces / isProducedBy | Harvested, Inferred by OpenAIRE, Linked by user |
| 2 | [Project](/data-model/entities/project) | [Organization](/data-model/entities/organization) | hasParticipant / isParticipant | Harvested |
| 3 | [Project](/data-model/entities/project) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 4 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsAmongTopNSimilarDocuments / HasAmongTopNSimilarDocuments | Inferred by OpenAIRE |
| 5 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsSupplementTo / IsSupplementedBy | Harvested |
| 6 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 7 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsPartOf / HasPart | Harvested |
| 8 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsDocumentedBy / Documents | Harvested |
| 9 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsObsoletedBy / Obsoletes | Harvested |
| 10 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsSourceOf / IsDerivedFrom | Harvested |
| 11 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsCompiledBy / Compiles | Harvested |
| 12 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsRequiredBy / Requires | Harvested |
| 13 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsCitedBy / Cites | Harvested, Inferred by OpenAIRE |
| 14 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsReferencedBy / References | Harvested |
| 15 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsReviewedBy / Reviews | Harvested |
| 16 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsOriginalFormOf / IsVariantFormOf | Harvested |
| 17 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsVersionOf / HasVersion | Harvested |
| 18 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsIdenticalTo / IsIdenticalTo | Harvested |
| 19 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsPreviousVersionOf / IsNewVersionOf | Harvested |
| 20 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsContinuedBy / Continues | Harvested |
| 21 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsDescribedBy / Describes | Harvested |
| 22 | [Result](/data-model/entities/result) | [Organization](/data-model/entities/organization) | hasAuthorInstitution / isAuthorInstitutionOf | Harvested, Inferred by OpenAIRE |
| 23 | [Result](/data-model/entities/result) | [Data source](/data-model/entities/data-source) | isHostedBy / hosts | Harvested, Inferred by OpenAIRE |
| 24 | [Result](/data-model/entities/result) | [Data source](/data-model/entities/data-source) | isProvidedBy / provides | Harvested |
| 25 | [Result](/data-model/entities/result) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 26 | [Organization](/data-model/entities/organization) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 27 | [Organization](/data-model/entities/organization) | [Organization](/data-model/entities/organization) | IsChildOf / IsParentOf | Linked by user |
| 28 | [Data source](/data-model/entities/data-source) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 29 | [Data source](/data-model/entities/data-source) | [Organization](/data-model/entities/organization) | isProvidedBy / provides | Harvested |

View File

@ -8,7 +8,7 @@ sidebar_position: 1
# Extended Result
It is a subclass of [Result](../../data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.
It is a subclass of [Result](/data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.

View File

@ -27,7 +27,7 @@ A vocabulary is a data structure that defines a list of terms, and for each term
[...]
```
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](../data-model/entities/result#instance).
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](/data-model/entities/result#instance).
The content of the vocabularies can be accessed on [api.openaire.eu/vocabularies](https://api.openaire.eu/vocabularies/).

View File

@ -18,8 +18,8 @@ As of November 2022, three procedures are in place to relate a research product
</p>
When only some results collected from a datasource are relevant for the RC/RI, it is possible to specify a set of selection constraints (SC) that have to be verified before linking the result to the
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F={title, author, contributor, description, orcid}</strong>,
while the set of condition can be among <strong>V={contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase}</strong>, and the value is free text.
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ...and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F=\{title, author, contributor, description, orcid\}</strong>,
while the set of condition can be among <strong>V=\{contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase\}</strong>, and the value is free text.
A possible selection criterion is: “All the products whose contributor contains DARIAH”
<p align="center">

View File

@ -1,6 +1,6 @@
# Impact indicators
This page summarises all calculated impact indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), which are included in the It is a subclass of [Result](../../data-model/entities/result) extended with information regarding p[impactMeasures](/data-model/entities/other#impactmeasures) property (found under the [indicators](../../data-model/entities/result#indicators)rojects (and funders) property of the result).
This page summarises all calculated impact indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), which are included in the [impactMeasures](/data-model/entities/other#impactmeasures) property (found under the [indicators](/data-model/entities/result#indicators) property of the result).
It should be noted that the impact indicators are being calculated on the level of the research output.
Below we explain their main intuition, the way they are calculated, and their most important limitations, in an attempt to help avoid common pitfalls and misuses.
@ -126,7 +126,7 @@ Also, since some indicators require the publication year for their calculation,
***Environment:*** PySpark
***References:***
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^{th} international conference on data mining workshops (pp. 373-380). IEEE.
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^\{th\} international conference on data mining workshops (pp. 373-380). IEEE.
***Authority:*** ATHENA RC &bull; ***License:*** GPL-2.0 &bull; ***Code:*** [BIP! Ranker](https://github.com/athenarc/Bip-Ranker)

View File

@ -24,8 +24,8 @@ _Start Date: 2023-07-26 &bull; Release Date: 2023-08-16 &bull; Dump release: **y
#### Changed
- [Relationship data model](./data-model/relationships/relationship-object): flattened properties source, sourceType, target, targetType
- BIP! indicators are now serialised as an array; see the updated model [here](./data-model/entities/other#bipindicators)
- [Relationship data model](/data-model/relationships/relationship-object): flattened properties source, sourceType, target, targetType
- BIP! indicators are now serialised as an array; see the updated model [here](/data-model/entities/other#bipindicators)
- Crossref dump from June 2023
- ORCID works without a DOI from June 2023
- Usage counts from June 2023
@ -88,8 +88,8 @@ _Start Date: 2023-02-13 &bull; Release Date: 2023-03-01 &bull; Dump release: **n
- Revised SDG classification: improved coverage (+600K classified DOIs)
- General increase of the funded scientific outputs, thanks to the full text mining scanning new OpenAccess publications
- Integrated contents from
- [EMBL-EBIs Protein Data Bank in Europe](./graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](./graph-production-workflow//aggregation/non-compatible-sources/uniprot)
- [EMBL-EBIs Protein Data Bank in Europe](/graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](/graph-production-workflow//aggregation/non-compatible-sources/uniprot)
#### Changed
@ -121,14 +121,14 @@ _Start Date: 2022-12-19 &bull; Release Date: 2022-12-28 &bull; Dump release: **y
#### Added
- [Impact & Usage indicators](./data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](./downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](./data-model/relationships/relationship-types)
- [Impact & Usage indicators](/data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](/downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](/data-model/relationships/relationship-types)
#### Changed
- FOS and SDGs were removed from the [result subjects](./data-model/entities/result#subjects)
- Measures were removed from the [result instance](./data-model/entities/result#instance)
- FOS and SDGs were removed from the [result subjects](/data-model/entities/result#subjects)
- Measures were removed from the [result instance](/data-model/entities/result#instance)
- Updated DOIBoost to include publications from Crossref and the works from ORCID with a DOI until November 2022
- Added ORCID works without a DOI from November 2022

View File

@ -11,12 +11,12 @@ The latest version of the JSON schema can be found on the [Downloads](../downloa
The figure above presents the graph's data model.
Its main entities are described in brief below:
* [Results](./entities/result) represent the outcomes (or products) of research activities.
* [Data Sources](./entities/data-source) are the sources from which the metadata of graph objects are collected.
* [Organizations](./entities/organization) correspond to companies or research institutions involved in projects
* [Results](/data-model/entities/result) represent the outcomes (or products) of research activities.
* [Data Sources](/data-model/entities/data-source) are the sources from which the metadata of graph objects are collected.
* [Organizations](/data-model/entities/organization) correspond to companies or research institutions involved in projects,
responsible for operating data sources or constituting the affiliations of Product creators.
* [Projects](./entities/project) are research project grants funded by a Funding Stream of a Funder.
* [Communities](./entities/community) are groups of people with a common research intent (e.g. research infrastructures, university alliances).
* [Projects](/data-model/entities/project) are research project grants funded by a Funding Stream of a Funder.
* [Communities](/data-model/entities/community) are groups of people with a common research intent (e.g. research infrastructures, university alliances).
:::note Further reading

View File

@ -6,32 +6,32 @@ Note: the labels used to specify the semantic of the relationships are (for the
| # | Source entity type | Target entity type | Relation name / inverse | Provenance |
|:--:|:--------------------------------------:|:--------------------------------------:|:----------------------------------------------------------:|:-----------------------------------------------:|
| 1 | [Project](../../data-model/entities/project) | [Result](../../data-model/entities/result) | produces / isProducedBy | Harvested, Inferred by OpenAIRE, Linked by user |
| 2 | [Project](../../data-model/entities/project) | [Organization](../../data-model/entities/organization) | hasParticipant / isParticipant | Harvested |
| 3 | [Project](../../data-model/entities/project) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 4 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsAmongTopNSimilarDocuments / HasAmongTopNSimilarDocuments | Inferred by OpenAIRE |
| 5 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsSupplementTo / IsSupplementedBy | Harvested |
| 6 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 7 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsPartOf / HasPart | Harvested |
| 8 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsDocumentedBy / Documents | Harvested |
| 9 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsObsoletedBy / Obsoletes | Harvested |
| 10 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsSourceOf / IsDerivedFrom | Harvested |
| 11 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsCompiledBy / Compiles | Harvested |
| 12 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsRequiredBy / Requires | Harvested |
| 13 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsCitedBy / Cites | Harvested, Inferred by OpenAIRE |
| 14 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsReferencedBy / References | Harvested |
| 15 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsReviewedBy / Reviews | Harvested |
| 16 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsOriginalFormOf / IsVariantFormOf | Harvested |
| 17 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsVersionOf / HasVersion | Harvested |
| 18 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsIdenticalTo / IsIdenticalTo | Harvested |
| 19 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsPreviousVersionOf / IsNewVersionOf | Harvested |
| 20 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsContinuedBy / Continues | Harvested |
| 21 | [Result](../../data-model/entities/result) | [Result](../../data-model/entities/result) | IsDescribedBy / Describes | Harvested |
| 22 | [Result](../../data-model/entities/result) | [Organization](../../data-model/entities/organization) | hasAuthorInstitution / isAuthorInstitutionOf | Harvested, Inferred by OpenAIRE |
| 23 | [Result](../../data-model/entities/result) | [Data source](../../data-model/entities/data-source) | isHostedBy / hosts | Harvested, Inferred by OpenAIRE |
| 24 | [Result](../../data-model/entities/result) | [Data source](../../data-model/entities/data-source) | isProvidedBy / provides | Harvested |
| 25 | [Result](../../data-model/entities/result) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 26 | [Organization](../../data-model/entities/organization) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 27 | [Organization](../../data-model/entities/organization) | [Organization](../../data-model/entities/organization) | IsChildOf / IsParentOf | Linked by user |
| 28 | [Data source](../../data-model/entities/data-source) | [Community](../../data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 29 | [Data source](../../data-model/entities/data-source) | [Organization](../../data-model/entities/organization) | isProvidedBy / provides | Harvested |
| 1 | [Project](/data-model/entities/project) | [Result](/data-model/entities/result) | produces / isProducedBy | Harvested, Inferred by OpenAIRE, Linked by user |
| 2 | [Project](/data-model/entities/project) | [Organization](/data-model/entities/organization) | hasParticipant / isParticipant | Harvested |
| 3 | [Project](/data-model/entities/project) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 4 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsAmongTopNSimilarDocuments / HasAmongTopNSimilarDocuments | Inferred by OpenAIRE |
| 5 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsSupplementTo / IsSupplementedBy | Harvested |
| 6 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 7 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsPartOf / HasPart | Harvested |
| 8 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsDocumentedBy / Documents | Harvested |
| 9 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsObsoletedBy / Obsoletes | Harvested |
| 10 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsSourceOf / IsDerivedFrom | Harvested |
| 11 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsCompiledBy / Compiles | Harvested |
| 12 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsRequiredBy / Requires | Harvested |
| 13 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsCitedBy / Cites | Harvested, Inferred by OpenAIRE |
| 14 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsReferencedBy / References | Harvested |
| 15 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsReviewedBy / Reviews | Harvested |
| 16 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsOriginalFormOf / IsVariantFormOf | Harvested |
| 17 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsVersionOf / HasVersion | Harvested |
| 18 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsIdenticalTo / IsIdenticalTo | Harvested |
| 19 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsPreviousVersionOf / IsNewVersionOf | Harvested |
| 20 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsContinuedBy / Continues | Harvested |
| 21 | [Result](/data-model/entities/result) | [Result](/data-model/entities/result) | IsDescribedBy / Describes | Harvested |
| 22 | [Result](/data-model/entities/result) | [Organization](/data-model/entities/organization) | hasAuthorInstitution / isAuthorInstitutionOf | Harvested, Inferred by OpenAIRE |
| 23 | [Result](/data-model/entities/result) | [Data source](/data-model/entities/data-source) | isHostedBy / hosts | Harvested, Inferred by OpenAIRE |
| 24 | [Result](/data-model/entities/result) | [Data source](/data-model/entities/data-source) | isProvidedBy / provides | Harvested |
| 25 | [Result](/data-model/entities/result) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Harvested, Inferred by OpenAIRE, Linked by user |
| 26 | [Organization](/data-model/entities/organization) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 27 | [Organization](/data-model/entities/organization) | [Organization](/data-model/entities/organization) | IsChildOf / IsParentOf | Linked by user |
| 28 | [Data source](/data-model/entities/data-source) | [Community](/data-model/entities/community) | IsRelatedTo / IsRelatedTo | Linked by user |
| 29 | [Data source](/data-model/entities/data-source) | [Organization](/data-model/entities/organization) | isProvidedBy / provides | Harvested |

View File

@ -8,7 +8,7 @@ sidebar_position: 1
# Extended Result
It is a subclass of [Result](../../data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.
It is a subclass of [Result](/data-model/entities/result) extended with information regarding projects (and funders), research communities/infrastructure and related data sources.

View File

@ -27,7 +27,7 @@ A vocabulary is a data structure that defines a list of terms, and for each term
[...]
```
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](../data-model/entities/result#instance).
Each vocabulary is typically used to control and harmonise the values available in a specific field characterising the bibliographic records. The example above provides a preview of the vocabulary used to clean the [result's instance typology](/data-model/entities/result#instance).
The content of the vocabularies can be accessed on [api.openaire.eu/vocabularies](https://api.openaire.eu/vocabularies/).

View File

@ -18,8 +18,8 @@ As of November 2022, three procedures are in place to relate a research product
</p>
When only some results collected from a datasource are relevant for the RC/RI, it is possible to specify a set of selection constraints (SC) that have to be verified before linking the result to the
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ... and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F={title, author, contributor, description, orcid}</strong>,
while the condition can be any of <strong>V={contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase}</strong>, and the value is free text.
community. The selection constraint has the form <strong>SC = S1 or S2 or ... or Sn</strong>. The generic Si has the form <strong>Si = s<sub>i1</sub> and s<sub>i2</sub> and ... and s<sub>in</sub></strong> and each s<sub>ij</sub> is a condition on a specific field of the result. The set of fields that can be specified is <strong>F=\{title, author, contributor, description, orcid\}</strong>,
while the condition can be any of <strong>V=\{contains, equals, not_contains, not_equals, contains_ignorecase, equals_ignorecase, not_contains_ignorecase, not_equal_ignorecase\}</strong>, and the value is free text.
A possible selection criterion is: “All the products whose contributor contains DARIAH”.
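
The structure of a selection constraint (an OR over AND-groups of field conditions) can be sketched in Python as follows. This is only an illustration and not the OpenAIRE implementation: the `Condition` class, the `holds` and `satisfies` functions, the representation of a result as a dictionary of string lists, and the exact semantics of each verb (for example, that a negated verb holds when no value of the field matches) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List

# Verbs from the set V, mapped to (operator, ignore case, negated).
# The semantics below are assumed for illustration only.
VERBS = {
    "contains": ("contains", False, False),
    "equals": ("equals", False, False),
    "not_contains": ("contains", False, True),
    "not_equals": ("equals", False, True),
    "contains_ignorecase": ("contains", True, False),
    "equals_ignorecase": ("equals", True, False),
    "not_contains_ignorecase": ("contains", True, True),
    "not_equal_ignorecase": ("equals", True, True),
}


@dataclass(frozen=True)
class Condition:
    """A single s_ij: a verb applied to one field with a free-text value."""
    field: str  # one of F: title, author, contributor, description, orcid
    verb: str   # one of V
    value: str  # free text


def _match(op: str, text: str, value: str, ignorecase: bool) -> bool:
    if ignorecase:
        text, value = text.lower(), value.lower()
    return value in text if op == "contains" else text == value


def holds(cond: Condition, result: Dict[str, List[str]]) -> bool:
    """Evaluate one condition against a result (hypothetical dict layout)."""
    op, ignorecase, negated = VERBS[cond.verb]
    matched = any(
        _match(op, text, cond.value, ignorecase)
        for text in result.get(cond.field, [])
    )
    # Assumption: a negated verb holds when no value of the field matches.
    return matched != negated


def satisfies(sc: List[List[Condition]], result: Dict[str, List[str]]) -> bool:
    """SC = S1 or ... or Sn, where each Si is an AND over its conditions."""
    return any(all(holds(c, result) for c in group) for group in sc)


# The example criterion: all products whose contributor contains "DARIAH".
dariah = [[Condition("contributor", "contains", "DARIAH")]]
example_result = {"title": ["A study"], "contributor": ["DARIAH-EU"]}
print(satisfies(dariah, example_result))
```

Representing SC as a list of lists keeps the OR/AND nesting explicit: the outer list corresponds to the alternatives S1 ... Sn, and each inner list to the conditions that must all hold.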
<p align="center">

View File

@ -1,6 +1,6 @@
# Impact indicators
This page summarises all calculated impact indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), which are included in the [bipIndicators](/data-model/entities/other#bipindicators) property (found under the [indicators](../../data-model/entities/result#indicators) property of the result).
This page summarises all calculated impact indicators, provided by [BIP!](https://bip.imsi.athenarc.gr/), which are included in the [bipIndicators](/data-model/entities/other#bipindicators) property (found under the [indicators](/data-model/entities/result#indicators) property of the result).
Note that the impact indicators are calculated at the level of the research output.
Below we explain their main intuition, the way they are calculated, and their most important limitations, in an attempt to help avoid common pitfalls and misuses.
@ -126,7 +126,7 @@ Also, since some indicators require the publication year for their calculation,
***Environment:*** PySpark
***References:***
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^{th} international conference on data mining workshops (pp. 373-380). IEEE.
* Ghosh, R., Kuo, T. T., Hsu, C. N., Lin, S. D., & Lerman, K. (2011, December). Time-aware ranking in dynamic citation networks. In 2011 ieee 11^\{th\} international conference on data mining workshops (pp. 373-380). IEEE.
***Authority:*** ATHENA RC &bull; ***License:*** GPL-2.0 &bull; ***Code:*** [BIP! Ranker](https://github.com/athenarc/Bip-Ranker)

View File

@ -40,8 +40,8 @@ _Start Date: 2023-07-26 &bull; Release Date: 2023-08-16 &bull; Dump release: **y
#### Changed
- [Relationship data model](./data-model/relationships/relationship-object): flattened properties source, sourceType, target, targetType
- BIP! indicators are now serialised as an array; see the updated model [here](./data-model/entities/other#bipindicators)
- [Relationship data model](/data-model/relationships/relationship-object): flattened properties source, sourceType, target, targetType
- BIP! indicators are now serialised as an array; see the updated model [here](/data-model/entities/other#bipindicators)
- Crossref dump from June 2023
- ORCID works without a DOI from June 2023
- Usage counts from June 2023
@ -104,8 +104,8 @@ _Start Date: 2023-02-13 &bull; Release Date: 2023-03-01 &bull; Dump release: **n
- Revised SDG classification: improved coverage (+600K classified DOIs)
- General increase of the funded scientific outputs, thanks to the full text mining scanning new OpenAccess publications
- Integrated contents from
- [EMBL-EBI's Protein Data Bank in Europe](./graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](./graph-production-workflow//aggregation/non-compatible-sources/uniprot)
- [EMBL-EBI's Protein Data Bank in Europe](/graph-production-workflow/aggregation/non-compatible-sources/ebi)
- [UniProtKB/Swiss-Prot](/graph-production-workflow//aggregation/non-compatible-sources/uniprot)
#### Changed
@ -137,14 +137,14 @@ _Start Date: 2022-12-19 &bull; Release Date: 2022-12-28 &bull; Dump release: **y
#### Added
- [Impact & Usage indicators](./data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](./downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](./data-model/relationships/relationship-types)
- [Impact & Usage indicators](/data-model/entities/result#indicators) at the level of the Result
- [Beginner's kit](/downloads/beginners-kit) in the Downloads section
- New relationship types were introduced; see the complete list [here](/data-model/relationships/relationship-types)
#### Changed
- FOS and SDGs were removed from the [result subjects](./data-model/entities/result#subjects)
- Measures were removed from the [result instance](./data-model/entities/result#instance)
- FOS and SDGs were removed from the [result subjects](/data-model/entities/result#subjects)
- Measures were removed from the [result instance](/data-model/entities/result#instance)
- Updated DOIBoost to include publications from Crossref and the works from ORCID with a DOI until November 2022
- Added ORCID works without a DOI from November 2022

Some files were not shown because too many files have changed in this diff.