Author | Commit | Message | Date
Giambattista Bloisi | 5e15f20e6e | Fix entityMerger that was excluding the authors of the first entity in the list to merge | 2023-07-21 00:46:54 +02:00
Giambattista Bloisi | dba34505de | Fix SparkStatsTest bug where parquet tables were incorrectly read as text files leading to unpredictable count() values | 2023-07-19 14:24:52 +02:00
Giambattista Bloisi | bd3fcf869a | rename dnet-pace-core into dhp-pace-core module and use it as dependency in other modules | 2023-07-06 10:02:23 +02:00
Sandro La Bruzzo | 9963fd6d29 | updated log to add subentity | 2023-06-28 13:36:05 +02:00
Sandro La Bruzzo | ed7e2ab6d1 | reverted mistake on commit workflow.xml | 2023-06-28 11:40:19 +02:00
Sandro La Bruzzo | 9910ce06ae | added to CreateSimRel the feature to write time log | 2023-06-28 11:38:16 +02:00
Sandro La Bruzzo | bd17c3edc8 | added to CreateSimRel the feature to write time log | 2023-06-28 11:20:58 +02:00
Claudio Atzori | 909729a2fc | [dedup] tweaking num partitions, minor changes | 2023-05-17 10:16:22 +02:00
Claudio Atzori | 062abfd669 | fixed NPE, removed unused stuff | 2022-12-06 12:04:00 +01:00
Claudio Atzori | 0aa725083f | extended dedup testing | 2022-11-17 16:13:43 +01:00
Claudio Atzori | 3dbc637d3e | code formatting | 2022-11-17 09:55:41 +01:00
Claudio Atzori | ddff0e8999 | merging duplicates using IdentifierComparator | 2022-11-11 16:10:25 +01:00
Claudio Atzori | 5af5a8ae42 | added IdentifierComparator | 2022-11-09 14:20:59 +01:00
Claudio Atzori | c26222623f | [maven-release-plugin] prepare for next development iteration | 2022-04-07 13:32:22 +02:00
Claudio Atzori | 86585a6b27 | [maven-release-plugin] prepare release dhp-1.2.4 | 2022-04-07 13:32:19 +02:00
Claudio Atzori | ad85d88eaf | [maven-release-plugin] rollback the release of dhp-1.2.4 | 2022-04-07 13:28:35 +02:00
Claudio Atzori | 598e11dfd7 | [maven-release-plugin] prepare for next development iteration | 2022-04-07 13:27:02 +02:00
Claudio Atzori | db3d9877a5 | [maven-release-plugin] prepare release dhp-1.2.4 | 2022-04-07 13:26:58 +02:00
Claudio Atzori | 3bba6d6e38 | [maven-release-plugin] rollback the release of dhp-1.2.4 | 2022-04-07 12:23:17 +02:00
Claudio Atzori | 2ac2d928bd | [maven-release-plugin] prepare for next development iteration | 2022-04-07 12:18:47 +02:00
Claudio Atzori | 85bc722ff4 | [maven-release-plugin] prepare release dhp-1.2.4 | 2022-04-07 12:18:43 +02:00
Claudio Atzori | bc05b6168a | [maven-release-plugin] rollback the release of dhp-1.2.4 | 2022-04-07 11:49:06 +02:00
Claudio Atzori | 505420fd61 | [maven-release-plugin] prepare for next development iteration | 2022-04-07 11:34:06 +02:00
Claudio Atzori | 66e718981e | [maven-release-plugin] prepare release dhp-1.2.4 | 2022-04-07 11:34:02 +02:00
Claudio Atzori | 61319b2e83 | updated dhp-schema version; set entity-level dataInfo before & after merging the fields from the group of duplicates | 2022-03-25 16:38:33 +01:00
miconis | c959639bd5 | dependency updated to the new pace-core version | 2022-03-15 16:33:03 +01:00
miconis | 8991d097b4 | bug fix in the DedupRecordFactory, DataInfo set before merge | 2022-02-24 17:13:12 +01:00
Claudio Atzori | 391aa1373b | added unit test | 2022-01-19 17:13:21 +01:00
Claudio Atzori | 44a937f4ed | factored out entity grouping implementation, extended to consider results from delegated authorities rather than identical records from other sources | 2022-01-19 12:24:52 +01:00
Claudio Atzori | f4538f3c4c | cleanup | 2021-11-19 11:33:10 +01:00
Claudio Atzori | 2b46b87f56 | fixed filtering criteria applied in SparkCopyRelationsNoOpenorgs to keep the parent/child relations from OpenOrgs | 2021-11-19 11:30:29 +01:00
Claudio Atzori | a24b9f8268 | [dedup] trivial refactoring | 2021-11-18 17:12:02 +01:00
Claudio Atzori | c0750fb17c | avoid non necessary count operations over large spark datasets | 2021-11-18 17:11:31 +01:00
Claudio Atzori | 0a727d325d | [dedup] increased number of partitions in the consistency phase | 2021-11-16 08:43:41 +01:00
miconis | 611ca511db | set configuration property in openorgs duplicates wf | 2021-10-07 15:39:55 +02:00
miconis | 9646b9fd98 | implementation of the http call for the update of openorgs suggestions | 2021-10-07 11:29:11 +02:00
miconis | 853333bdde | implementation of the whitelist for similarity relations | 2021-09-20 16:21:47 +02:00
Claudio Atzori | 9f4db73f30 | updated/fixed unit tests | 2021-08-11 15:02:51 +02:00
Claudio Atzori | 2ee21da43b | suggestions from SonarLint | 2021-08-11 12:13:22 +02:00
Claudio Atzori | 2fff24df55 | code formatting | 2021-07-28 11:34:19 +02:00
Sandro La Bruzzo | 3920c69bc8 | change implementation of resolve Relation to generate jsonRdd in output | 2021-07-25 09:51:36 +02:00
Sandro La Bruzzo | 058b636d4d | added control to check if the entity exists | 2021-07-22 16:08:54 +02:00
Claudio Atzori | 41b551562e | applying PR#115 (DatePicker) on stable_ids | 2021-06-17 09:33:50 +02:00
Claudio Atzori | 23b8883ab1 | applied intellij code cleanup | 2021-05-14 10:58:12 +02:00
Claudio Atzori | 5afa7d3e0c | core utilities in dhp-common moved in external module dhp-schemas | 2021-04-27 15:44:01 +02:00
Claudio Atzori | 27ab8a704d | adjusted poms to align with the external dhp-schema module | 2021-04-27 10:12:27 +02:00
Claudio Atzori | ef4bfd82e2 | code formatting | 2021-04-27 10:09:31 +02:00
Claudio Atzori | faa8f6f4e2 | Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids | 2021-04-27 09:57:03 +02:00
miconis | 6d5c14e030 | assertions updated in entity merger test | 2021-04-27 09:47:49 +02:00
Claudio Atzori | c2bb03c8b5 | depending on external dhp-schemas module | 2021-04-23 17:57:35 +02:00