Giambattista Bloisi
e826591897
Format scala code
2023-09-22 15:21:07 +02:00
Giambattista Bloisi
727ccbe575
Add profiles for different spark versions: spark-24, spark-34, spark-35
2023-09-21 14:25:26 +02:00
Sandro La Bruzzo
8064abf86c
formatted code
2023-09-18 12:57:44 +02:00
Giambattista Bloisi
81ab6a3991
Changes required to build and run tests with Java 17
2023-09-07 11:58:59 +02:00
Giambattista Bloisi
ff297cdc9b
Changes in Maven POMs to build and test the project using Spark 3.4.x and Scala 2.12
2023-09-06 17:54:11 +02:00
Claudio Atzori
bf35280ea6
code formatting
2023-08-29 11:11:00 +02:00
Giambattista Bloisi
e64c2854a3
Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface
JsonPath cache contention fixed by using a ConcurrentHashMap
Blacklist filtering performance improvement
Minor performance improvements when evaluating similarity
Sorting of clustered elements is deterministic (by ordering field and identity field, instead of ordering field only)
2023-07-24 15:36:24 +02:00
Giambattista Bloisi
bb5b845e3c
Use scala.binary.version property to resolve scala maven dependencies
Ensure consistent usage of maven properties
Profile for compiling with scala 2.12 and Spark 3.4
2023-07-24 11:13:48 +02:00
Giambattista Bloisi
801da2fd4a
New sources formatted by maven plugin
2023-07-06 10:28:53 +02:00
Giambattista Bloisi
bd3fcf869a
Rename the dnet-pace-core module to dhp-pace-core and use it as a dependency in other modules
2023-07-06 10:02:23 +02:00