Add search, filtering, sorting & paging using the Graph API

Serafeim Chatzopoulos 2024-07-10 21:22:13 +03:00
parent 0b092111e2
commit 8b12f7bf04
9 changed files with 341 additions and 26 deletions

@ -1,6 +1,3 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
# Get single entities
This is a guide on how to retrieve detailed information on a single entity using the OpenAIRE Graph API.
@ -16,8 +13,8 @@ Currently, the Graph API supports the following entity types:
You can retrieve the data of a single entity by providing the entity's OpenAIRE identifier (id) in the corresponding endpoint.
The OpenAIRE id is the primary key of an entity in the OpenAIRE Graph.
:::info
Note that if you want to retrieve multiple entities based on their OpenAIRE ids, you can use the [search endpoints and filter](./search-entities/filter-search-results) by the `id` field using `OR`.
:::note
Note that if you want to retrieve multiple entities based on their OpenAIRE ids, you can use the [search endpoints and filter](./search-entities/filter-search-results#or-operator) by the `id` field using `OR`.
:::
## Response
@ -28,7 +25,7 @@ The response of the Graph API is a [Research product](../../data-model/entities/
In order to retrieve the research product with OpenAIRE id: `doi_dedup___::2b3cb7130c506d1c3a05e9160b2c4108`,
you have to perform the following API call:
[`https://openaire-api.athenarc.gr/researchProducts/doi_dedup___::2b3cb7130c506d1c3a05e9160b2c4108`](https://openaire-api.athenarc.gr/researchProducts/doi_dedup___::2b3cb7130c506d1c3a05e9160b2c4108)
[https://openaire-api.athenarc.gr/researchProducts/doi_dedup___::2b3cb7130c506d1c3a05e9160b2c4108](https://openaire-api.athenarc.gr/researchProducts/doi_dedup___::2b3cb7130c506d1c3a05e9160b2c4108)
This will return all the data of the research product with the provided identifier:

@ -0,0 +1,33 @@
# Making requests
or using code:
<Tabs>
<TabItem value="research-product-curl" label="Curl">
```bash
curl -X GET "https://openaire-api.athenarc.gr/researchProducts/doi_dedup___::2b3cb7130c506d1c3a05e9160b2c4108" -H "accept: application/json"
```
</TabItem>
<TabItem value="research-product-python" label="Python">
```python
import requests

# Endpoint for a single research product, addressed by its OpenAIRE id
url = "https://openaire-api.athenarc.gr/researchProducts/doi_dedup___::2b3cb7130c506d1c3a05e9160b2c4108"
headers = {
    "accept": "application/json"
}

response = requests.get(url, headers=headers)

# Print the JSON body on success, otherwise report the HTTP status code
if response.status_code == 200:
    print(response.json())
else:
    print(f"Request failed with status code {response.status_code}")
```
</TabItem>
</Tabs>

@ -25,6 +25,6 @@ Please use the following links to learn more about the Graph API:
- [Get single entities](./get-single-entities) - Retrieve detailed information on a single entity.
- [Search entities](./search-entities/overview) - Retrieve a list of entities based on specific search criteria.
- [Filter search results](./search-entities/filter-search-results) - Filter search results based on specific criteria.
- [Sort search results](./search-entities/sort-search-results) - Sort search results based on specific criteria.
- [Pagination](./search-entities/pagination) - Retrieve a subset of search results.
- [Sort search results](./search-entities/sorting-and-paging#sorting) - Sort search results based on specific criteria.
- [Pagination](./search-entities/sorting-and-paging#paging) - Retrieve a subset of search results.
- [Making requests](./making-requests) - Learn how to make requests with different programming languages.

@ -1,5 +1,215 @@
# Filter search results
:::info warning
To be completed soon, for the next beta release.
Filters can be used to narrow down the search results based on specific criteria.
Filters are provided as query parameters in the request URL (see [here](./overview#endpoints) for the available search endpoints).
Multiple filters can be provided in a single request; they should be formatted as follows:
`param1=value1&param2=value2&...&paramN=valueN`.
:::note
Filters are combined using the logical `AND` operator.
If a filter is provided multiple times, its values are combined using the logical `OR` operator.
For more information on how to use logical operators when searching and filtering, see [Using logical operators](#using-logical-operators).
:::
Examples:
- Get all research products that contain the word "covid", sorted by popularity in descending order:
[https://openaire-api.athenarc.gr/researchProducts?search=covid&sortBy=popularity DESC](https://openaire-api.athenarc.gr/researchProducts?search=covid&sortBy=popularity%20DESC)
- Get all publications that were published on or after 2019-01-01:
[https://openaire-api.athenarc.gr/researchProducts?type=publication&fromPublicationDate=2019-01-01](https://openaire-api.athenarc.gr/researchProducts?type=publication&fromPublicationDate=2019-01-01)
- Get the organization with the ROR id `https://ror.org/0576by029`:
[https://openaire-api.athenarc.gr/organizations?pid=https://ror.org/0576by029](https://openaire-api.athenarc.gr/organizations?pid=https://ror.org/0576by029)
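URLs like the ones above can also be assembled programmatically. The following is a minimal Python sketch using only the standard library. The helper name `build_search_url` is hypothetical and not part of the Graph API; the parameters and values are taken from the examples above.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://openaire-api.athenarc.gr"


def build_search_url(endpoint, filters):
    """Build a search URL from a list of (parameter, value) pairs.

    A list of pairs (rather than a dict) lets the same parameter
    appear more than once. This is an illustrative helper, not
    something the API itself provides.
    """
    # quote_via=quote encodes spaces as %20, as in the linked examples
    query = urlencode(filters, quote_via=quote)
    return f"{BASE_URL}/{endpoint}?{query}"


publications_url = build_search_url(
    "researchProducts",
    [("type", "publication"), ("fromPublicationDate", "2019-01-01")],
)
covid_url = build_search_url(
    "researchProducts",
    [("search", "covid"), ("sortBy", "popularity DESC")],
)
print(publications_url)
print(covid_url)
```

Letting `urlencode` do the escaping avoids hand-encoding spaces and quotes in filter values.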
## Available parameters
This section provides an overview of the available parameters for each entity type.
### Research products
The following query parameters are available for research products:
| **Parameter** | **Description** |
|-------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **search** | Search in the content of the research product. |
| **mainTitle** | Search in the research product's main title. |
| **description** | Search in the research product's description. |
| **id** | The OpenAIRE id of the research product. |
| **pid** | The persistent identifier of the research product. |
| **originalId** | The identifier of the record at the original sources. |
| **type** | The type of the research product. One of `publication`, `dataset`, `software`, or `other` |
| **fromPublicationDate** | Gets the research products whose publication date is greater than or equal to the given date. A date formatted as `YYYY-MM-DD` |
| **toPublicationDate** | Gets the research products whose publication date is less than or equal to the given date. A date formatted as `YYYY-MM-DD` |
| **subjects**                  | List of subjects associated with the research product. |
| **countryCode** | The country code for the country associated with the research product. |
| **authorFullName** | The full name of the authors involved in producing this research product. |
| **authorOrcid** | The ORCiD of the authors involved in producing this research product. |
| **publisher**                 | The name of the entity that holds, archives, publishes, prints, distributes, releases, issues, or produces the resource. |
| **bestOpenAccessRightLabel** | The best open access rights among the research product's instances. One of `OPEN SOURCE`, `OPEN`, `EMBARGO`, `RESTRICTED`, `CLOSED`, `UNKNOWN` |
| **influenceClass** | Citation-based indicator that reflects the overall impact of a research product. Please, choose a class among `C1`, `C2`, `C3`, `C4`, or `C5` for top 0.01%, top 0.1%, top 1%, top 10%, and average in terms of influence respectively. |
| **impulseClass**              | Citation-based indicator that reflects the initial momentum of a research product directly after its publication. Please, choose a class among `C1`, `C2`, `C3`, `C4`, or `C5` for top 0.01%, top 0.1%, top 1%, top 10%, and average in terms of impulse respectively. |
| **popularityClass**           | Citation-based indicator that reflects current impact or attention of a research product. Please, choose a class among `C1`, `C2`, `C3`, `C4`, or `C5` for top 0.01%, top 0.1%, top 1%, top 10%, and average in terms of popularity respectively. |
| **citationCountClass**        | Citation-based indicator that reflects the overall impact of a research product by summing all its citations. Please, choose a class among `C1`, `C2`, `C3`, `C4`, or `C5` for top 0.01%, top 0.1%, top 1%, top 10%, and average in terms of citation count respectively. |
| **instanceType** `[Only for publications]` | Retrieve publications of the given instance type. Check [here](http://api.openaire.eu/vocabularies/dnet:publication_resource) for all possible instance type values. |
| **sdg** `[Only for publications]` | Retrieves publications classified with the respective Sustainable Development Goal number. Integer in the range [1, 17] |
| **fos** `[Only for publications]` | Retrieves publications classified with a given Field of Science (FOS). A FOS classification identifier (see [here](https://explore.openaire.eu/assets/common-assets/vocabulary/fos.json) for details). |
| **isPeerReviewed** `[Only for publications]` | Indicates whether the publications are peer reviewed or not. (Boolean) |
| **isInDiamondJournal** `[Only for publications]` | Indicates whether the publication was published in a diamond journal or not. (Boolean) |
| **isPubliclyFunded** `[Only for publications]` | Indicates whether the publication was publicly funded or not. (Boolean) |
| **isGreen** `[Only for publications]` | Indicates whether the publication was published following the green open access model. (Boolean) |
| **openAccessColor** `[Only for publications]` | Specifies the Open Access color of the publication. One of `bronze`, `gold`, or `hybrid` |
| **relOrganizationId** | Retrieve research products connected to the organization (with OpenAIRE id). |
| **relCommunityId** | Retrieve research products connected to the community (with OpenAIRE id). |
| **relProjectId** | Retrieve research products connected to the project (with OpenAIRE id). |
| **relProjectCode** | Retrieve research products connected to the project with code. |
| **hasProjectRel** | Retrieve research products that are connected to a project. (Boolean) |
| **relProjectFundingShortName**| Retrieve research products connected to a project that has a funder with the given short name. |
| **relProjectFundingStreamId** | Retrieve research products connected to a project that has the given funding identifier. |
| **relHostingDataSourceId** | Retrieve research products hosted by the data source (with OpenAIRE id). |
| **relCollectedFromDatasourceId**| Retrieve research products collected from the data source (with OpenAIRE id). |
| **debugQuery** | Retrieve debug information for the search query. (Boolean) |
| **page** | Page number of the results. (Integer) |
| **pageSize** | Number of results per page. Integer in the range [1, 100] |
| **sortBy** | The field to set the sorting order of the results. Should be provided in the format `fieldname ASC\|DESC`, where fieldname is one of `relevance`, `publicationDate`, `dateOfCollection`, `influence`, `popularity`, `citationCount`, `impulse`. Multiple sorting parameters should be comma-separated. |
### Organizations
The following query parameters are available for organizations:
| **Parameter** | **Description** |
|----------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|**search** | Search in the content of the organization. |
|**legalName** |The legal name of the organization. |
|**legalShortName** |The legal name of the organization in short form. |
|**id** |The OpenAIRE id of the organization. |
|**pid** |The persistent identifier of the organization. |
|**countryCode** |The country code of the organization. |
|**relCommunityId** |Retrieve organizations connected to the community (with OpenAIRE id). |
|**relCollectedFromDatasourceId**|Retrieve organizations collected from the data source (with OpenAIRE id). |
|**debugQuery** |Retrieve debug information for the search query. |
|**page** |Page number of the results. |
|**pageSize** |Number of results per page. |
|**sortBy**                  |The field to set the sorting order of the results. Should be provided in the format `fieldname ASC\|DESC` - organizations can only be sorted by `relevance`.|
### Data sources
The following query parameters are available for data sources:
| **Parameter** | **Description** |
|----------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|**search** | Search in the content of the data source. |
|**officialName** |The official name of the data source. |
|**englishName** |The English name of the data source. |
|**legalShortName** |The legal name of the organization in short form. |
|**id** |The OpenAIRE id of the data source. |
|**pid** |The persistent identifier of the data source. |
|**subjects**                |List of subjects associated with the data source. |
|**dataSourceTypeName** |The data source type; see all possible values <a href='https://api.openaire.eu/vocabularies/dnet:datasource_typologies' target='_blank'>here</a> . |
|**contentTypes** |Types of content in the data source, as defined by OpenDOAR. |
|**relOrganizationId** |Retrieve data sources connected to the organization (with OpenAIRE id). |
|**relCommunityId** |Retrieve data sources connected to the community (with OpenAIRE id). |
|**relCollectedFromDatasourceId**|Retrieve data sources collected from the data source (with OpenAIRE id). |
|**debugQuery** |Retrieve debug information for the search query. |
|**page** |Page number of the results. |
|**pageSize** |Number of results per page. |
|**sortBy**                  |The field to set the sorting order of the results. Should be provided in the format `fieldname ASC\|DESC` - data sources can only be sorted by `relevance`.|
### Projects
The following query parameters are available for projects:
| **Parameter** | **Description** |
|----------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|**search** | Search in the content of the projects. |
|**title** |Search in the project's title. |
|**keywords** |The project's keywords. |
|**id** |The OpenAIRE id of the project. |
|**code** |The grant agreement (GA) code of the project. |
|**acronym** |Project's acronym. |
|**callIdentifier** |The identifier of the research call. |
|**fundingShortName** |The short name of the funder. |
|**fundingStreamId** |The identifier of the funding stream. |
|**fromStartDate** |Gets the projects with start date greater than or equal to the given date. Please provide a date formatted as `YYYY-MM-DD`. |
|**toStartDate** |Gets the projects with start date less than or equal to the given date. Please provide a date formatted as `YYYY-MM-DD`. |
|**fromEndDate** |Gets the projects with end date greater than or equal to the given date. Please provide a date formatted as `YYYY-MM-DD`. |
|**toEndDate** |Gets the projects with end date less than or equal to the given date. Please provide a date formatted as `YYYY-MM-DD`. |
|**relOrganizationName** |The name or short name of the related organization. |
|**relOrganizationId** |The organization identifier of the related organization. |
|**relCommunityId** |Retrieve projects connected to the community (with OpenAIRE id). |
|**relOrganizationCountryCode** |The country code of the related organizations. |
|**relCollectedFromDatasourceId**|Retrieve projects collected from the data source (with OpenAIRE id). |
|**debugQuery** |Retrieve debug information for the search query. |
|**page** |Page number of the results. |
|**pageSize** |Number of results per page. |
|**sortBy** |The field to set the sorting order of the results. Should be provided in the format `fieldname ASC\|DESC`, where fieldname is one of `relevance`, `startDate`, `endDate`. Multiple sorting parameters should be comma-separated.|
## Using logical operators
The API supports the use of logical operators `AND`, `OR`, and `NOT` to refine your search queries.
These operators help you combine or exclude one or more values for a specific filter.
### `AND` operator
Use the `AND` operator to retrieve results that include all specified values. This narrows your search.
Examples:
- Get research products that contain both "climate" and "change":
[https://openaire-api.athenarc.gr/researchProducts?search=climate AND change](https://openaire-api.athenarc.gr/researchProducts?search=climate%20AND%20change)
- Get research products that are classified with both Fields of Science (FOS) "03 medical and health sciences" and "0502 economics and business":
[https://openaire-api.athenarc.gr/researchProducts?fos="03 medical and health sciences" AND "0502 economics and business"](https://openaire-api.athenarc.gr/researchProducts?fos=%2203%20medical%20and%20health%20sciences%22%20AND%20%220502%20economics%20and%20business%22)
### `OR` operator
Use the `OR` operator to retrieve results that include any of the specified terms. This broadens your search.
The same functionality can be achieved by providing the same query parameter multiple times or by separating the values with a comma.
Examples:
- Get research products with the OpenAIRE ids `r3730f562f9e::539da48b3796663b17e6166bb966e5b1` or `pmid_dedup__::1591ebf0e0698ed4a99455ff2ba4adc0`:
[https://openaire-api.athenarc.gr/researchProducts?id=r3730f562f9e::539da48b3796663b17e6166bb966e5b1 OR pmid_dedup__::1591ebf0e0698ed4a99455ff2ba4adc0](https://openaire-api.athenarc.gr/researchProducts?id=r3730f562f9e::539da48b3796663b17e6166bb966e5b1%20OR%20pmid_dedup__::1591ebf0e0698ed4a99455ff2ba4adc0)
- Get projects that are connected to organizations in the US or Greece:
[https://openaire-api.athenarc.gr/projects?relOrganizationCountryCode=US OR GR](https://openaire-api.athenarc.gr/projects?relOrganizationCountryCode=US%20OR%20GR)
or by using the same query parameter multiple times: [https://openaire-api.athenarc.gr/projects?relOrganizationCountryCode=US&relOrganizationCountryCode=GR](https://openaire-api.athenarc.gr/projects?relOrganizationCountryCode=US&relOrganizationCountryCode=GR)
or by separating the values with a comma: [https://openaire-api.athenarc.gr/projects?relOrganizationCountryCode=US,GR](https://openaire-api.athenarc.gr/projects?relOrganizationCountryCode=US,GR)
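As a rough illustration, the three equivalent ways of expressing `OR` for a single filter can be generated in Python as follows. This sketch uses only the standard library; the variable names are illustrative, and the endpoint and parameter come from the example above.

```python
from urllib.parse import quote, urlencode

PROJECTS_URL = "https://openaire-api.athenarc.gr/projects"
PARAM = "relOrganizationCountryCode"
countries = ["US", "GR"]

# 1. Explicit OR operator; the spaces around OR are encoded as %20
or_value = quote(" OR ".join(countries))
with_operator = f"{PROJECTS_URL}?{PARAM}={or_value}"

# 2. The same parameter repeated once per value (doseq expands the list)
repeated = f"{PROJECTS_URL}?{urlencode({PARAM: countries}, doseq=True)}"

# 3. Comma-separated values in a single parameter
with_comma = f"{PROJECTS_URL}?{PARAM}={','.join(countries)}"

for url in (with_operator, repeated, with_comma):
    print(url)
```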
### `NOT` operator
Use the `NOT` operator to exclude specific terms from your search results. This refines your search by filtering out unwanted results.
Examples:
- Get research products that contain "semantic" but not "web":
[https://openaire-api.athenarc.gr/researchProducts?search=semantic NOT web](https://openaire-api.athenarc.gr/researchProducts?search=semantic%20NOT%20web)
- Get all data sources that are not journals:
[https://openaire-api.athenarc.gr/dataSources?dataSourceTypeName=NOT Journal](https://openaire-api.athenarc.gr/dataSources?dataSourceTypeName=NOT%20Journal)
:::note
All the above operators can be combined, along with parentheses and quotes, to create more complex queries.
For example, to get research products that contain the phrase "semantic web" but not "ontology" or "linked data":
[https://openaire-api.athenarc.gr/researchProducts?search="semantic web" AND NOT (ontology OR "linked data")](https://openaire-api.athenarc.gr/researchProducts?search=%22semantic%20web%22%20AND%20NOT%20(ontology%20OR%20%22linked%20data%22))
:::
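Complex expressions are easier to get right when they are composed from small pieces and URL-encoded in one step. The sketch below rebuilds the combined query from the note above; `phrase` and `any_of` are hypothetical helper names introduced only for illustration.

```python
from urllib.parse import quote

BASE_URL = "https://openaire-api.athenarc.gr"


def phrase(text):
    """Wrap text in double quotes so it is matched as an exact phrase."""
    return '"' + text + '"'


def any_of(*terms):
    """Join terms with OR and group them with parentheses."""
    return "(" + " OR ".join(terms) + ")"


excluded = any_of("ontology", phrase("linked data"))
expression = phrase("semantic web") + " AND NOT " + excluded

# Keep parentheses readable; quotes and spaces are percent-encoded
url = f"{BASE_URL}/researchProducts?search={quote(expression, safe='()')}"
print(expression)
print(url)
```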

@ -1,5 +1,44 @@
# Search entities
:::info warning
To be completed soon, for the next beta release.
:::
This is a guide on how to search for specific entities using the OpenAIRE Graph API.
## Endpoints
Currently, the Graph API supports the following entity types:
* Research products - endpoint: [`GET /researchProducts`](https://openaire-api.athenarc.gr/researchProducts)
* Organizations - endpoint: [`GET /organizations`](https://openaire-api.athenarc.gr/organizations)
* Data sources - endpoint: [`GET /dataSources`](https://openaire-api.athenarc.gr/dataSources)
* Projects - endpoint: [`GET /projects`](https://openaire-api.athenarc.gr/projects)
Each of these endpoints can be used to list all entities of the corresponding type.
Listing such entities can be more useful when using the [filtering](./filter-search-results),
[sorting](./sorting-and-paging.md#sorting), and [pagination](./sorting-and-paging.md#paging) capabilities of the Graph API.
## Response
The response of the aforementioned endpoints is an object of the following type:
```json
{
  "header": {
    "numFound": 36818386,
    "maxScore": 1,
    "queryTime": 21,
    "page": 1,
    "pageSize": 10
  },
  "results": [
    ...
  ]
}
```
It contains a `header` object with the following fields:
- `numFound`: the total number of entities found
- `maxScore`: the maximum relevance score of the search results
- `queryTime`: the time in milliseconds that the search took
- `page`: the current page of the search results
- `pageSize`: the number of entities per page
Finally, the `results` field contains an array of entities of the corresponding type (i.e., [Research product](../../../data-model/entities/research-product), [Organization](../../../data-model/entities/organization), [Data Source](../../../data-model/entities/data-source), or [Project](../../../data-model/entities/project)).
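To show how the `header` fields might be used, here is a small Python sketch that works on an already-parsed response. The header values are copied from the example above, the empty `results` list is a placeholder, and `total_pages` is a hypothetical helper. Whether the API allows paging through every computed page is not stated here.

```python
import math

# Header values copied from the example response; `results` is left
# empty as a placeholder for the elided entities.
response_body = {
    "header": {
        "numFound": 36818386,
        "maxScore": 1,
        "queryTime": 21,
        "page": 1,
        "pageSize": 10,
    },
    "results": [],
}


def total_pages(header):
    """Number of pages needed to cover numFound at the given pageSize."""
    return math.ceil(header["numFound"] / header["pageSize"])


header = response_body["header"]
print(f"{header['numFound']} entities found in {header['queryTime']} ms")
print(f"page {header['page']} of {total_pages(header)}")
```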

@ -1,5 +0,0 @@
# Pagination
:::info warning
To be completed soon, for the next beta release.
:::

@ -1,5 +0,0 @@
# Sort search results
:::info warning
To be completed soon, for the next beta release.
:::

@ -0,0 +1,46 @@
# Sorting and paging
The OpenAIRE Graph API allows you to sort and page through the results of your search queries.
This enables you to retrieve the most relevant results and manage large result sets more effectively.
## Sorting
Sorting on specific fields lets you retrieve data in the preferred order.
Sorting is achieved using the `sortBy` parameter, which specifies the field and the direction (ascending or descending) for sorting.
* `sortBy`: Defines the field and the sort direction. The format should be `fieldname sortDirection`, where the `sortDirection` can be either `ASC` for ascending order or `DESC` for descending order.
The field names that can be used for sorting are specific to each entity type and can be found in the `sortBy` field values of the [available parameters](../search-entities/filter-search-results#available-parameters).
Note that the default sorting is based on the `relevance` score of the search results.
Examples:
- Get research products published on or after 2020-01-01 and sort them by the publication date in descending order:
[https://openaire-api.athenarc.gr/researchProducts?fromPublicationDate=2020-01-01&sortBy=publicationDate DESC](https://openaire-api.athenarc.gr/researchProducts?fromPublicationDate=2020-01-01&sortBy=publicationDate%20DESC)
- Get research products with the keyword "COVID-19" and sort them by their (citation-based) popularity:
[https://openaire-api.athenarc.gr/researchProducts?search=COVID-19&sortBy=popularity DESC](https://openaire-api.athenarc.gr/researchProducts?search=COVID-19&sortBy=popularity%20DESC)
Note that you can combine multiple sorting conditions by separating them with a comma.
Example:
- Get research products with the keyword "COVID-19" and sort them by their publication date in ascending order and then by their popularity in descending order:
[https://openaire-api.athenarc.gr/researchProducts?search=COVID-19&sortBy=publicationDate ASC,popularity DESC](https://openaire-api.athenarc.gr/researchProducts?search=COVID-19&sortBy=publicationDate%20ASC,popularity%20DESC)
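A `sortBy` value with several conditions can be composed from (field, direction) pairs, as in the following Python sketch. The helper name `sort_by` is hypothetical. It only checks the direction, since the valid field names depend on the entity type.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://openaire-api.athenarc.gr"


def sort_by(*criteria):
    """Join (field, direction) pairs into a comma-separated sortBy value."""
    parts = []
    for field, direction in criteria:
        if direction not in ("ASC", "DESC"):
            raise ValueError(f"direction must be ASC or DESC, got {direction!r}")
        parts.append(f"{field} {direction}")
    return ",".join(parts)


params = {
    "search": "COVID-19",
    "sortBy": sort_by(("publicationDate", "ASC"), ("popularity", "DESC")),
}
# Encode spaces as %20 and leave the comma between conditions as-is
url = f"{BASE_URL}/researchProducts?{urlencode(params, safe=',', quote_via=quote)}"
print(url)
```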
## Paging
The OpenAIRE Graph API supports paging through the use of `page` and `pageSize` parameters, enabling you to specify which part of the result set to retrieve and how many results per page.
* `page`: Specifies the page number of the results you want to retrieve. Page numbering starts from 1.
* `pageSize`: Defines the number of results to be returned per page. This helps limit the amount of data returned in a single request, making it easier to process.
Example:
- Get the top 10 most influential research products that contain the phrase "knowledge graphs":
[https://openaire-api.athenarc.gr/researchProducts?search="knowledge graphs"&page=1&pageSize=10&sortBy=influence DESC](https://openaire-api.athenarc.gr/researchProducts?search=%22knowledge%20graphs%22&page=1&pageSize=10&sortBy=influence%20DESC)
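To walk through several pages, increment `page` until a page comes back with fewer than `pageSize` results. The sketch below assumes the response shape described in the search overview (a `results` array). `fetch_page` and `iter_results` are hypothetical helpers, and the `max_pages` cap is a safety limit chosen for this example, not an API rule.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

SEARCH_URL = "https://openaire-api.athenarc.gr/researchProducts"


def fetch_page(page, page_size):
    """Fetch one page of results for a fixed example query (needs network)."""
    params = {
        "search": '"knowledge graphs"',
        "sortBy": "influence DESC",
        "page": page,
        "pageSize": page_size,
    }
    query = urlencode(params, quote_via=quote)
    with urlopen(f"{SEARCH_URL}?{query}", timeout=30) as response:
        return json.load(response)


def iter_results(fetch, page_size=10, max_pages=3):
    """Yield results page by page, starting from page 1.

    Stops after max_pages, or earlier when a page holds fewer than
    page_size results (taken here to mean it was the last page).
    """
    for page in range(1, max_pages + 1):
        results = fetch(page, page_size)["results"]
        yield from results
        if len(results) < page_size:
            break


if __name__ == "__main__":
    for product in iter_results(fetch_page, page_size=10, max_pages=2):
        print(product.get("id"))
```

Passing the fetch function as an argument keeps the paging loop independent of the HTTP client, so it can be exercised with a stub.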

@ -70,10 +70,10 @@ const sidebars = {
link: { type: 'doc', id: 'apis/graph-api/search-entities/overview' },
items: [
{ type: 'doc', id: 'apis/graph-api/search-entities/filter-search-results' },
{ type: 'doc', id: 'apis/graph-api/search-entities/sort-search-results' },
{ type: 'doc', id: 'apis/graph-api/search-entities/pagination' },
{ type: 'doc', id: 'apis/graph-api/search-entities/sorting-and-paging' },
]
}
},
{ type: 'doc', id: 'apis/graph-api/making-requests' },
]
},
{