Commit Graph

8 Commits

Author SHA1 Message Date
Giambattista Bloisi 2caaaec42d Include SparkCleanRelation logic in SparkPropagateRelation
SparkPropagateRelation includes merge relations
Revised tests for SparkPropagateRelation
2023-09-04 11:33:20 +02:00
Giambattista Bloisi e64c2854a3 Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface
JsonPath cache contention fixed by using a ConcurrentHashMap
Blacklist filtering performance improvement
Minor performance improvements when evaluating similarity
Sorting in clustered elements is deterministic (by ordering and identity field, instead of ordering field only)
2023-07-24 15:36:24 +02:00
Claudio Atzori f4538f3c4c cleanup 2021-11-19 11:33:10 +01:00
Claudio Atzori 2b46b87f56 fixed filtering criteria applied in SparkCopyRelationsNoOpenorgs to keep the parent/child relations from OpenOrgs 2021-11-19 11:30:29 +01:00
Claudio Atzori 2ee21da43b suggestions from SonarLint 2021-08-11 12:13:22 +02:00
miconis 3c12eeadce bug fix in propagation of relations 2021-04-22 11:44:33 +02:00
miconis 369ed1cd8a bug fix: lookupurl parameter added to dedup record job 2021-04-13 09:08:05 +02:00
miconis 0857100fb8 implementation of the tests for the openorgs integration in the openaire provision 2021-04-07 18:42:16 +02:00