Claudio Atzori | f3a85e224b | merged from branch beta the bulk tagging (single step, negative constraints), the cleaning workflow (single step, pid type based cleaning), instance level fulltext | 2023-06-28 13:33:57 +02:00 |
Giambattista Bloisi | 758e662ab8 | Revert "Remove duplicated code and ensure that load and initialization is done through "DedupConfig.load" method" This reverts commit 485f9d18cb. | 2023-06-19 13:08:10 +02:00 |
Giambattista Bloisi | 485f9d18cb | Remove duplicated code and ensure that load and initialization is done through "DedupConfig.load" method | 2023-06-19 13:00:02 +02:00 |
Claudio Atzori | c26222623f | [maven-release-plugin] prepare for next development iteration | 2022-04-07 13:32:22 +02:00 |
Claudio Atzori | 86585a6b27 | [maven-release-plugin] prepare release dhp-1.2.4 | 2022-04-07 13:32:19 +02:00 |
Claudio Atzori | ad85d88eaf | [maven-release-plugin] rollback the release of dhp-1.2.4 | 2022-04-07 13:28:35 +02:00 |
Claudio Atzori | 598e11dfd7 | [maven-release-plugin] prepare for next development iteration | 2022-04-07 13:27:02 +02:00 |
Claudio Atzori | db3d9877a5 | [maven-release-plugin] prepare release dhp-1.2.4 | 2022-04-07 13:26:58 +02:00 |
Claudio Atzori | 3bba6d6e38 | [maven-release-plugin] rollback the release of dhp-1.2.4 | 2022-04-07 12:23:17 +02:00 |
Claudio Atzori | 2ac2d928bd | [maven-release-plugin] prepare for next development iteration | 2022-04-07 12:18:47 +02:00 |
Claudio Atzori | 85bc722ff4 | [maven-release-plugin] prepare release dhp-1.2.4 | 2022-04-07 12:18:43 +02:00 |
Claudio Atzori | bc05b6168a | [maven-release-plugin] rollback the release of dhp-1.2.4 | 2022-04-07 11:49:06 +02:00 |
Claudio Atzori | 505420fd61 | [maven-release-plugin] prepare for next development iteration | 2022-04-07 11:34:06 +02:00 |
Claudio Atzori | 66e718981e | [maven-release-plugin] prepare release dhp-1.2.4 | 2022-04-07 11:34:02 +02:00 |
Claudio Atzori | 61319b2e83 | updated dhp-schema version; set entity-level dataInfo before & after merging the fields from the group of duplicates | 2022-03-25 16:38:33 +01:00 |
miconis | c959639bd5 | dependency updated to the new pace-core version | 2022-03-15 16:33:03 +01:00 |
miconis | 8991d097b4 | bug fix in the DedupRecordFactory, DataInfo set before merge | 2022-02-24 17:13:12 +01:00 |
Claudio Atzori | 391aa1373b | added unit test | 2022-01-19 17:13:21 +01:00 |
Claudio Atzori | 44a937f4ed | factored out entity grouping implementation, extended to consider results from delegated authorities rather than identical records from other sources | 2022-01-19 12:24:52 +01:00 |
Claudio Atzori | f4538f3c4c | cleanup | 2021-11-19 11:33:10 +01:00 |
Claudio Atzori | 2b46b87f56 | fixed filtering criteria applied in SparkCopyRelationsNoOpenorgs to keep the parent/child relations from OpenOrgs | 2021-11-19 11:30:29 +01:00 |
Claudio Atzori | a24b9f8268 | [dedup] trivial refactoring | 2021-11-18 17:12:02 +01:00 |
Claudio Atzori | c0750fb17c | avoid non necessary count operations over large spark datasets | 2021-11-18 17:11:31 +01:00 |
Claudio Atzori | 0a727d325d | [dedup] increased number of partitions in the consistency phase | 2021-11-16 08:43:41 +01:00 |
miconis | 611ca511db | set configuration property in openorgs duplicates wf | 2021-10-07 15:39:55 +02:00 |
miconis | 9646b9fd98 | implementation of the http call for the update of openorgs suggestions | 2021-10-07 11:29:11 +02:00 |
miconis | 853333bdde | implementation of the whitelist for similarity relations | 2021-09-20 16:21:47 +02:00 |
Claudio Atzori | 9f4db73f30 | updated/fixed unit tests | 2021-08-11 15:02:51 +02:00 |
Claudio Atzori | 2ee21da43b | suggestions from SonarLint | 2021-08-11 12:13:22 +02:00 |
Claudio Atzori | 2fff24df55 | code formatting | 2021-07-28 11:34:19 +02:00 |
Sandro La Bruzzo | 3920c69bc8 | change implementation of resolve Relation to generate jsonRdd in output | 2021-07-25 09:51:36 +02:00 |
Sandro La Bruzzo | 058b636d4d | added control to check if the entity exists | 2021-07-22 16:08:54 +02:00 |
Claudio Atzori | 41b551562e | applying PR#115 (DatePicker) on stable_ids | 2021-06-17 09:33:50 +02:00 |
Claudio Atzori | 23b8883ab1 | applied intellij code cleanup | 2021-05-14 10:58:12 +02:00 |
Claudio Atzori | 5afa7d3e0c | core utilities in dhp-common moved in external module dhp-schemas | 2021-04-27 15:44:01 +02:00 |
Claudio Atzori | 27ab8a704d | adjusted poms to align with the external dhp-schema module | 2021-04-27 10:12:27 +02:00 |
Claudio Atzori | ef4bfd82e2 | code formatting | 2021-04-27 10:09:31 +02:00 |
Claudio Atzori | faa8f6f4e2 | Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids | 2021-04-27 09:57:03 +02:00 |
miconis | 6d5c14e030 | assertions updated in entity merger test | 2021-04-27 09:47:49 +02:00 |
Claudio Atzori | c2bb03c8b5 | depending on external dhp-schemas module | 2021-04-23 17:57:35 +02:00 |
miconis | 3c12eeadce | bug fix in propagation of relations | 2021-04-22 11:44:33 +02:00 |
Claudio Atzori | 8f309b72ff | [dedup] using node names consistently across the workflow | 2021-04-21 17:54:51 +02:00 |
Claudio Atzori | 815b9f4d56 | [openorgs dedup] fixed workflow parameter declarations. Introduced support for resuming the execution from intermediate steps | 2021-04-20 17:24:45 +02:00 |
Claudio Atzori | 45057440c1 | code formatting | 2021-04-16 17:28:25 +02:00 |
miconis | 7ad573d023 | bug fix: changed join in propagaterelations without applying filter on the id | 2021-04-16 16:40:42 +02:00 |
miconis | f64e57c112 | refactoring of the id generation, sparkcreatemergerels collects entities to create root id after a join | 2021-04-15 10:59:24 +02:00 |
miconis | 3525a8f504 | id generation of representative record moved to the SparkCreateMergeRel job | 2021-04-14 18:06:07 +02:00 |
miconis | 1542196a33 | bug fix: starting node of duplicate scan wf changed | 2021-04-13 10:15:43 +02:00 |
miconis | 369ed1cd8a | bug fix: lookupurl parameter added to dedup record job | 2021-04-13 09:08:05 +02:00 |
Claudio Atzori | 511c0521e5 | [dedup] avoiding NPEs handling OpenOrg relations | 2021-04-12 17:45:11 +02:00 |