Merge pull request 'Enrichment support' (#8) from enrichment into master

Reviewed-on: #8
Claudio Atzori 2022-01-12 17:40:43 +01:00
commit 11ea9e46b1
4 changed files with 35 additions and 23 deletions


@@ -2,25 +2,26 @@
## Changelog
| **Version** | **Changes** | **Readiness** |
|-------------|---|---|
| 2.9.24 | [Dump model]</br>change the names of the classes to be able to automatically create the json schema with specific descriptions | beta |
| 2.9.23 | [Graph model]<br>Added Instance.measures field, allowing to maintain the association between them and the individual result instance</br>[Dump model]</br>added json schemas | beta |
| 2.8.22 | [Graph model]<br>minor: added serializable to the Measures model class</br>[Dump model]</br>added dedicated BestAccessRight class, used at the result level instead of AccessRight | production |
| 2.8.21 | [Graph model]<br>added the following relation terms Describes/IsDescribedBy, IsMetadataFor/IsMetadataOf, HasAssociationWith/HasAssociationWith, IsRequiredBy/Requires. All these are used in combination with the relation subRelType "relationship" | production |
| 2.8.20 | [Graph model]<br>added constants declaring the values used for hierarchical relationships among the organizations IsParentOf / IsChildOf | production |
| 2.7.18-19 | [Dump model]<br>include validation info in relations<br>[Graph model]<br>added constants declaring vocabulary names for relation fields | production |
| 2.7.17 | [Dump model]<br>aligned the graph dump schema to mirror the changes in the model<br>1. Added openaccessroute at the level of the instance inside the AccessRight element;<br>2. Added pid and the alternate identifiers at the level of the instance;<br>3. Added the bipFinder measures | production |
| 2.7.16 | [Graph model]<br>Updated the casing of the following terms (`relation.relClass`):<br>1. `isRelatedTo -> IsRelatedTo`<br>Added the following `relClass` terms:<br>1. `IsAmongTopNSimilarDocuments`<br>2. `HasAmongTopNSimilarDocuments` | production |
| 2.7.15 | 1. added support for delegated authorities<br>2. fixed regex for DOI cleaning | production |
| 2.7.14 | [Graph model]<br>Relation types are now inspired by the Datacite definitions https://schema.datacite.org/meta/kernel-4.4/doc/DataCite-MetadataKernel_v4.4.pdf <br>The changes involve the values stored in `relation.subRelType` and `relation.relClass`:<br>Updated the casing of the following terms (`relation.relClass`):<br>1. `isSupplementTo -> IsSupplementTo` / `isSupplementedBy -> IsSupplementedBy`<br>2. `isPartOf -> IsPartOf` / `hasPart -> HasPart`<br>3. `cites -> Cites` / `isCitedBy -> IsCitedBy`<br>4. `reviews -> Reviews` / `isReviewedBy -> IsReviewedBy`<br>Added the following terms [`subRelType: relClass / relClass (inverse)`]:<br>1. `relationship: References / IsReferencedBy`<br>2. `relationship: IsIdenticalTo`<br>3. `relationship: IsContinuedBy / Continues`<br>4. `relationship: IsDocumentedBy / Documents`<br>5. `relationship: Documents / IsDocumentedBy`<br>6. `relationship: IsCompiledBy / Compiles`<br>7. `version: IsPreviousVersionOf / IsNewVersionOf`<br>8. `version: IsSourceOf / IsDerivedFrom`<br>9. `version: IsVariantFormOf / IsOriginalFormOf`<br>10. `version: IsObsoletedBy / Obsoletes`<br>11. `version: IsVersionOf / HasVersion` | production |
| 2.6.14 | [Scholexplorer]<br>1. Added model classes for Scholexplorer, package `eu.dnetlib.dhp.schema.sx` | production |
| 2.6.13 | 1. `Result.mergeFrom` handles field `dateOfAcceptance` | production |
| 2.5.12 | 1. delegating the date parsing to https://github.com/sisyphsu/dateparser | production |
| 2.5.[11-9] | 1. support for more date formats<br>2. enable the possibility to extend the date formats used to parse `Relation.validationDate` | production |
| 2.4.8 | 1. added constant for ORCID datasource name | production |
| 2.4.7 | refactoring | production |
| 2.3.6 | [Aggregation]<br>1. introduced MetadataStoreManager (MdSM) model classes| production |
| 2.2.5 | [Graph model]<br>1. introduced fields `Instance.pid` and `Instance.alternateIdentifier`<br>2. `LicenseComparator` renamed as `AccessRightComparator`<br>3. introduced `AccessRight` model class defining the `OpenAccessRoute` field to keep track of the OpenAccess color at the `Instance` level<br>4. `ExternalReference` cleanup (removed description, added alternateLabel(s))<br>5. added several ModelConstants<br>[Aggregation]<br>7. introduced MDStore record model classes<br>8. Introduced ORCID specific model classes | production |
| 2.2.4 | 1. ORCID specific model classes backported in the version used in PROD<br>2. added constant for dnet:externalReference_typologies<br>3. added constant for ORCID datasource name<br>4. `Result.mergeFrom` handles field `dateOfAcceptance` | production |
| **Version** | **Changes** | **Readiness** |
|--------------|---|---|
| 2.10.24 | [Graph model]</br>added utility method and constants for checking whether an OafEntity represents an enrichment. | beta |
| 2.9.24 | [Dump model]</br>change the names of the classes to be able to automatically create the json schema with specific descriptions | beta |
| 2.9.23 | [Graph model]<br>Added Instance.measures field, allowing to maintain the association between them and the individual result instance</br>[Dump model]</br>added json schemas | beta |
| 2.8.22 | [Graph model]<br>minor: added serializable to the Measures model class</br>[Dump model]</br>added dedicated BestAccessRight class, used at the result level instead of AccessRight | production |
| 2.8.21 | [Graph model]<br>added the following relation terms Describes/IsDescribedBy, IsMetadataFor/IsMetadataOf, HasAssociationWith/HasAssociationWith, IsRequiredBy/Requires. All these are used in combination with the relation subRelType "relationship" | production |
| 2.8.20 | [Graph model]<br>added constants declaring the values used for hierarchical relationships among the organizations IsParentOf / IsChildOf | production |
| 2.7.18-19 | [Dump model]<br>include validation info in relations<br>[Graph model]<br>added constants declaring vocabulary names for relation fields | production |
| 2.7.17 | [Dump model]<br>aligned the graph dump schema to mirror the changes in the model<br>1. Added openaccessroute at the level of the instance inside the AccessRight element;<br>2. Added pid and the alternate identifiers at the level of the instance;<br>3. Added the bipFinder measures | production |
| 2.7.16 | [Graph model]<br>Updated the casing of the following terms (`relation.relClass`):<br>1. `isRelatedTo -> IsRelatedTo`<br>Added the following `relClass` terms:<br>1. `IsAmongTopNSimilarDocuments`<br>2. `HasAmongTopNSimilarDocuments` | production |
| 2.7.15 | 1. added support for delegated authorities<br>2. fixed regex for DOI cleaning | production |
| 2.7.14 | [Graph model]<br>Relation types are now inspired by the Datacite definitions https://schema.datacite.org/meta/kernel-4.4/doc/DataCite-MetadataKernel_v4.4.pdf <br>The changes involve the values stored in `relation.subRelType` and `relation.relClass`:<br>Updated the casing of the following terms (`relation.relClass`):<br>1. `isSupplementTo -> IsSupplementTo` / `isSupplementedBy -> IsSupplementedBy`<br>2. `isPartOf -> IsPartOf` / `hasPart -> HasPart`<br>3. `cites -> Cites` / `isCitedBy -> IsCitedBy`<br>4. `reviews -> Reviews` / `isReviewedBy -> IsReviewedBy`<br>Added the following terms [`subRelType: relClass / relClass (inverse)`]:<br>1. `relationship: References / IsReferencedBy`<br>2. `relationship: IsIdenticalTo`<br>3. `relationship: IsContinuedBy / Continues`<br>4. `relationship: IsDocumentedBy / Documents`<br>5. `relationship: Documents / IsDocumentedBy`<br>6. `relationship: IsCompiledBy / Compiles`<br>7. `version: IsPreviousVersionOf / IsNewVersionOf`<br>8. `version: IsSourceOf / IsDerivedFrom`<br>9. `version: IsVariantFormOf / IsOriginalFormOf`<br>10. `version: IsObsoletedBy / Obsoletes`<br>11. `version: IsVersionOf / HasVersion` | production |
| 2.6.14 | [Scholexplorer]<br>1. Added model classes for Scholexplorer, package `eu.dnetlib.dhp.schema.sx` | production |
| 2.6.13 | 1. `Result.mergeFrom` handles field `dateOfAcceptance` | production |
| 2.5.12 | 1. delegating the date parsing to https://github.com/sisyphsu/dateparser | production |
| 2.5.[11-9] | 1. support for more date formats<br>2. enable the possibility to extend the date formats used to parse `Relation.validationDate` | production |
| 2.4.8 | 1. added constant for ORCID datasource name | production |
| 2.4.7 | refactoring | production |
| 2.3.6 | [Aggregation]<br>1. introduced MetadataStoreManager (MdSM) model classes | production |
| 2.2.5 | [Graph model]<br>1. introduced fields `Instance.pid` and `Instance.alternateIdentifier`<br>2. `LicenseComparator` renamed as `AccessRightComparator`<br>3. introduced `AccessRight` model class defining the `OpenAccessRoute` field to keep track of the OpenAccess color at the `Instance` level<br>4. `ExternalReference` cleanup (removed description, added alternateLabel(s))<br>5. added several ModelConstants<br>[Aggregation]<br>7. introduced MDStore record model classes<br>8. Introduced ORCID specific model classes | production |
| 2.2.4 | 1. ORCID specific model classes backported in the version used in PROD<br>2. added constant for dnet:externalReference_typologies<br>3. added constant for ORCID datasource name<br>4. `Result.mergeFrom` handles field `dateOfAcceptance` | production |


@@ -5,7 +5,7 @@
<groupId>eu.dnetlib.dhp</groupId>
<artifactId>dhp-schemas</artifactId>
<packaging>jar</packaging>
<version>2.9.25-SNAPSHOT</version>
<version>2.10.24-SNAPSHOT</version>
<licenses>
<license>
@@ -32,7 +32,7 @@
<connection>scm:git:gitea@code-repo.d4science.org:D-Net/dhp-schemas.git</connection>
<developerConnection>scm:git:gitea@code-repo.d4science.org:D-Net/dhp-schemas.git</developerConnection>
<url>https://code-repo.d4science.org/D-Net/dhp-schemas/</url>
<tag>dhp-schemas-2.9.24</tag>
<tag>dhp-schemas-2.10.24</tag>
</scm>
<description>This module contains common schema classes meant to be used across the dnet-hadoop submodules</description>


@@ -63,6 +63,8 @@ public class ModelConstants {
public static final String HARVESTED = "Harvested";
public static final String PROVENANCE_DEDUP = "sysimport:dedup";
public static final String PROVENANCE_ENRICH = "sysimport:enrich";
public static final Qualifier PROVENANCE_ACTION_SET_QUALIFIER = qualifier(
SYSIMPORT_ACTIONSET, SYSIMPORT_ACTIONSET, DNET_PROVENANCE_ACTIONS, DNET_PROVENANCE_ACTIONS);


@@ -5,9 +5,11 @@ import java.io.Serializable;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Collectors;
import eu.dnetlib.dhp.schema.common.AccessRightComparator;
import eu.dnetlib.dhp.schema.common.ModelConstants;
public class Result extends OafEntity implements Serializable {
@@ -233,6 +235,13 @@ public class Result extends OafEntity implements Serializable {
this.instance = instance;
}
// True when the entity's provenance action class id equals PROVENANCE_ENRICH (case-insensitive).
private static boolean isAnEnrichment(OafEntity e) {
return e.getDataInfo() != null
&& e.getDataInfo().getProvenanceaction() != null
&& ModelConstants.PROVENANCE_ENRICH.equalsIgnoreCase(e.getDataInfo().getProvenanceaction().getClassid());
}
@Override
public void mergeFrom(OafEntity e) {
super.mergeFrom(e);
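
Reviewer note: the enrichment check added in this hunk can be tried in isolation with the minimal sketch below. It is not the code from this commit. `Qualifier`, `DataInfo`, and `Entity` are simplified, hypothetical stand-ins for the dhp-schemas classes, and the `Optional` chain is an alternative formulation of the null checks, with one deliberate difference: it also tolerates a null entity, which the committed method does not. Only the constant value `sysimport:enrich` and the getter names are taken from the diff.

```java
import java.util.Optional;

public class EnrichmentCheckSketch {

    // Same value as ModelConstants.PROVENANCE_ENRICH in the diff above.
    static final String PROVENANCE_ENRICH = "sysimport:enrich";

    // Hypothetical, simplified stand-in for the schema's Qualifier class.
    static class Qualifier {
        private final String classid;
        Qualifier(String classid) { this.classid = classid; }
        String getClassid() { return classid; }
    }

    // Hypothetical, simplified stand-in for the schema's DataInfo class.
    static class DataInfo {
        private final Qualifier provenanceaction;
        DataInfo(Qualifier provenanceaction) { this.provenanceaction = provenanceaction; }
        Qualifier getProvenanceaction() { return provenanceaction; }
    }

    // Hypothetical, simplified stand-in for OafEntity.
    static class Entity {
        private final DataInfo dataInfo;
        Entity(DataInfo dataInfo) { this.dataInfo = dataInfo; }
        DataInfo getDataInfo() { return dataInfo; }
    }

    // Null-safe variant of the check: any missing link in the chain yields false.
    static boolean isAnEnrichment(Entity e) {
        return Optional.ofNullable(e)
            .map(Entity::getDataInfo)
            .map(DataInfo::getProvenanceaction)
            .map(Qualifier::getClassid)
            .map(PROVENANCE_ENRICH::equalsIgnoreCase)
            .orElse(false);
    }

    // Convenience factory (hypothetical) building an entity with the given provenance class id.
    static Entity entityWith(String classid) {
        return new Entity(new DataInfo(new Qualifier(classid)));
    }

    public static void main(String[] args) {
        System.out.println(isAnEnrichment(entityWith("sysimport:enrich")));
        System.out.println(isAnEnrichment(entityWith("sysimport:dedup")));
        System.out.println(isAnEnrichment(new Entity(null)));
    }
}
```

The comparison is case-insensitive, as in the committed method, so a class id of `SYSIMPORT:ENRICH` is also treated as an enrichment. An entity with no `DataInfo`, no provenance action, or a null class id is never treated as one.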