added EBI mapping

This commit is contained in:
Sandro La Bruzzo 2022-11-08 15:58:21 +01:00
parent f05888e637
commit b007a67a3c
2 changed files with 21 additions and 14 deletions

@@ -69,23 +69,11 @@ The table below describes the mapping from the XML baseline records to the OpenA
### Relation Mapping
<<<<<<< HEAD
| OpenAIRE Relation Semantic and inverse | Datacite record JSON path | Source/Target type | Notes |
|-------------------------------------------|-------------------------------|-------------------------------|---------|
| `isProducedBy` | `attributes\fundingReferences` | `Result/Project` | we must identify whether the funding reference matches the pattern `(info:eu-repo/grantagreement/ec/h2020/)(\d{6})(.*)` |
| `IsProvidedBy` | | `Result/DataSource` | the datasource is always Datacite |
| `IsHostedBy` | `\attributes\relationships\client\id` | `Result/DataSource` | we defined a curated clientId/Datasource map; if we find a match, we create a _hostedBy_ relation |
| | `\attribute\relatedIdentifiers` | result/result | we create a relationship whenever the pid of the target is resolved on the Research Graph |
=======
| OpenAIRE Relation Semantic and inverse | Datacite record JSON path | Source/Target type | Notes |
|----------------------------------------|---------------------------------------|----------------------|---------------------------------------------------------------------------------------------------|
| `isProducedBy` | `attributes\fundingReferences` | `Result/Project` | we must identify whether the funding reference matches the pattern `(info:eu-repo/grantagreement/ec/h2020/)(\d{6})(.*)` |
| `IsProvidedBy` | | `Result/DataSource` | the datasource is always Datacite |
| `IsHostedBy` | `\attributes\relationships\client\id` | `Result/DataSource` | we defined a curated clientId/Datasource map; if we find a match, we create a _hostedBy_ relation |
### Relation Resolution
>>>>>>> 92baad5acb3ecfb774510b48fee6aeeba92738df
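
The pattern check described in the `isProducedBy` row can be sketched as follows. This is only an illustration, not the actual mapping code: the function name `extract_h2020_grant_id` is hypothetical, and lowercasing the input before matching is an assumption made because the documented pattern is written in lowercase.

```python
import re

# Pattern copied from the Notes column of the isProducedBy row.
H2020_GRANT = re.compile(r"(info:eu-repo/grantagreement/ec/h2020/)(\d{6})(.*)")


def extract_h2020_grant_id(funding_reference: str):
    """Return the six-digit grant number, or None when the pattern does not match.

    Assumption: the value is lowercased first, since the documented
    pattern is lowercase and incoming values may use mixed case.
    """
    match = H2020_GRANT.match(funding_reference.strip().lower())
    return match.group(2) if match else None


if __name__ == "__main__":
    # Dummy identifier, used only to exercise the pattern.
    print(extract_h2020_grant_id("info:eu-repo/grantAgreement/EC/H2020/123456/"))
```

Only references that yield a grant number would lead to an `isProducedBy` relation; anything else is skipped in this sketch.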

@@ -402,7 +402,26 @@ curl -s "https://www.ebi.ac.uk/europepmc/webservices/rest/MED/33024307/datalinks
## Mapping
The table below describes the mapping from the EBI links records to the OpenAIRE Graph dump format.
We keep only the target links whose pid type is **ena**, **pdb** or **uniprot**.
For each such target we construct a Bioentity with the following mapping:
| *OpenAIRE Result field path* | EBI record field xpath | Notes |
|--------------------------------|--------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `id` | `target/identifier/ID` and `target/identifier/IDScheme` | id in the form `SCHEMA_________::md5(pid)`|
| `pid` | `target/identifier/ID` and `target/identifier/IDScheme` | `classid = classname = schema`|
| `publicationdate` | `target/PublicationDate` | clean and normalize the format of the date to be `YYYY-mm-dd` |
| `maintitle` | `target/Title` | |
| **Instance Mapping** | | |
| `instance.type` | | `Bioentity` |
|`type` | | `Dataset` |
| `instance.pid` |`target/identifier/ID` and `target/identifier/IDScheme` | `classid = classname = schema` |
| `instance.url` | `target/identifier/IDURL` | Copy the value as it is |
| `instance.publicationdate` | `//PubmedPubDate` | clean and normalize the format of the date to be `YYYY-mm-dd` |
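
The filtering and field mapping above can be sketched in a few lines. This is a hedged illustration, not the actual implementation. The following are assumptions: the input is a nested dict mirroring the `target/...` paths in the table, the padding width `PREFIX_WIDTH`, the lowercasing of the scheme, the accepted input date formats, and the flat output dict, which is not the real Graph dump schema. All function names are hypothetical.

```python
import hashlib
from datetime import datetime

ALLOWED_SCHEMES = {"ena", "pdb", "uniprot"}
PREFIX_WIDTH = 12  # assumption: width of the underscore-padded schema prefix


def make_id(scheme: str, pid: str) -> str:
    """Build an id of the form SCHEMA___::md5(pid)."""
    prefix = scheme.ljust(PREFIX_WIDTH, "_")
    return f"{prefix}::{hashlib.md5(pid.encode('utf-8')).hexdigest()}"


def normalize_date(raw):
    """Return the date as YYYY-mm-dd, or None if it cannot be parsed.

    Assumption: the formats tried here are illustrative only.
    """
    if not raw:
        return None
    for fmt in ("%Y-%m-%d", "%d-%m-%Y", "%Y/%m/%d"):
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def to_bioentity(target: dict):
    """Map one target link to a Bioentity dict, or None if it is filtered out."""
    identifier = target.get("identifier", {})
    scheme = (identifier.get("IDScheme") or "").lower()
    pid = identifier.get("ID")
    if scheme not in ALLOWED_SCHEMES or not pid:
        return None
    pid_entry = {"value": pid, "classid": scheme, "classname": scheme}
    return {
        "id": make_id(scheme, pid),
        "pid": pid_entry,
        "publicationdate": normalize_date(target.get("PublicationDate")),
        "maintitle": target.get("Title"),
        "type": "Dataset",
        "instance": {
            "type": "Bioentity",
            "pid": pid_entry,
            "url": identifier.get("IDURL"),  # copied as it is
        },
    }
```

Targets with any other pid type return `None` and would simply be dropped by the caller.
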
### Relation Mapping
| OpenAIRE Relation Semantic and inverse | Datacite record JSON path | Source/Target type | Notes |
|-------------------------------------------|-------------------------------|-------------------------------|---------|
| `IsRelatedTo` | | result/result | we create a relationship between the Bioentity and the PubMed publication |