Giambattista Bloisi gbloisia93fb
  • Joined on 2022-06-08
gbloisia93fb created pull request D-Net/dnet-dedup#2 2023-06-28 12:08:35 +02:00
Fix Sliding window: sliding window logic was not applied because a counter was not incremented
gbloisia93fb pushed to enable_sliding_windows at D-Net/dnet-dedup 2023-06-28 12:06:22 +02:00
85546dfa2f Fix Sliding window: sliding window logic was not applied because a counter was not incremented
gbloisia93fb created branch enable_sliding_windows in D-Net/dnet-dedup 2023-06-28 12:06:22 +02:00
gbloisia93fb pushed to dedup-with-dataframe at D-Net/dnet-hadoop 2023-06-26 21:00:13 +02:00
2f8651e0aa Allow processing of immutable sorted blocks in dedup
gbloisia93fb pushed to dedup-with-dataframe at D-Net/dnet-hadoop 2023-06-26 18:37:31 +02:00
02d61023cf Fix maven dependencies warning while building
gbloisia93fb created pull request D-Net/dnet-hadoop#316 2023-06-26 10:00:44 +02:00
[WIP] Refactor Dedup using Spark Dataframe API and Spark Row representation of data, misc optimizations
gbloisia93fb pushed to dedup-with-dataframe at D-Net/dnet-hadoop 2023-06-26 09:53:15 +02:00
d041f6d2be Move dnet-pace-core inside the project
467693bfcb Move dnet-pace-core inside the project
b09aba91e3 Update copyDataToImpalaCluster.sh
317ae7b33a Bug fixes
454ec4d8b0 [aggregator graph] added column alias when mapping organization PIDs from the OpenOrgs database
gbloisia93fb created pull request D-Net/dnet-hadoop#310 2023-06-19 13:20:05 +02:00
Remove duplicated code and ensure that load and initialization is done through "DedupConfig.load" method
gbloisia93fb pushed to optimized-dedup at D-Net/dnet-hadoop 2023-06-19 13:12:15 +02:00
5b6c361fe0 Remove duplicated code and ensure that load and initialization is done through "DedupConfig.load" method
gbloisia93fb created branch optimized-dedup in D-Net/dnet-hadoop 2023-06-19 13:12:15 +02:00
gbloisia93fb pushed to master at D-Net/dnet-hadoop 2023-06-19 13:09:14 +02:00
758e662ab8 Revert "REmove duplicated code and ensure that load and initialization is done through "DedupConfig.load" method"
gbloisia93fb pushed to master at D-Net/dnet-hadoop 2023-06-19 13:01:06 +02:00
485f9d18cb REmove duplicated code and ensure that load and initialization is done through "DedupConfig.load" method
gbloisia93fb created pull request D-Net/dnet-dedup#1 2023-06-16 10:30:10 +02:00
Precompile blacklists patterns before evaluating clustering criteria
gbloisia93fb pushed to optimized-clustering at D-Net/dnet-dedup 2023-06-16 10:02:20 +02:00
d2d173773e Precompile blacklists patterns before evaluating clustering criteria
gbloisia93fb created branch optimized-clustering in D-Net/dnet-dedup 2023-06-16 10:02:19 +02:00