Commit Graph

65 Commits

Author SHA1 Message Date
Claudio Atzori 48b580b45c [graph enrichment] fixed country_propagation oozie workflow definition, parameter saveGraph is not needed anymore by the SparkCountryPropagationJob 2022-04-11 08:52:36 +02:00
Miriam Baglioni 7b8f85692e [Enrichment country] fixed issues with parameters and workflow args 2022-03-23 17:20:23 +01:00
Claudio Atzori f10066547b increased spark.sql.shuffle.partitions in affiliation_from_semrel_propagation 2022-03-23 12:22:26 +01:00
Miriam Baglioni 2b643059fa [Country Propagation] changed the logic to get the collectedfrom at the result level. To fix issue when no instance is created for a result that should have the country associated. Change the code to use spark instead of hive to prepare the data needed for the propagation step. Added new tests for the intermediate steps and new verification for the propagation itself 2022-03-11 13:56:48 +01:00
Miriam Baglioni 064f9bbd87 [AFFPropSR] added new paprameter for the number of iterations and new code for just one iteration 2022-01-07 18:58:51 +01:00
Miriam Baglioni 28ea532ece [Affilaition Propagation] moved the selection of graph relation as a preparation step 2021-11-16 15:24:19 +01:00
Miriam Baglioni b9d124bb7c [Enrichment: Propagation through parent-child relationships] Added counters, and changed constraint to verify if filtering out the relation (from classname = harvested to classid != propagation) 2021-11-03 13:55:37 +01:00
Miriam Baglioni 09f36cffb8 [Enrichment: Propagation through parent-child relationships] First implementation, testing, and wf for propagation of result to organization through semantic relation 2021-10-29 11:20:03 +02:00
Claudio Atzori b695932ae4 integrated pull#108 2021-05-20 15:34:04 +02:00
Miriam Baglioni bc6b5d5b34 removed leftover parameter 2020-08-15 11:22:35 +02:00
Miriam Baglioni 200cd5c730 removed leftover parameter 2020-08-15 11:22:19 +02:00
Claudio Atzori ab37953332 added global properties in wf definitions to avoid repeating name-node and job-tracker in the (many) distcp actions; reintroduced output directory removal at the beginning of each spark action 2020-05-14 10:25:41 +02:00
Miriam Baglioni 43f127448d changed the package name from dhp-propagation to dhp-enrichment for the preparation phase of funding propagation 2020-05-12 18:24:26 +02:00
Claudio Atzori ec0782e582 renamed jar containing the bulktagging and propagation workflows from dhp-[bulktagging|propagation] to dhp-enrichment; adjusted xml formatting 2020-05-12 15:49:28 +02:00
Claudio Atzori 6d0b11252e bulktagging wfs moved into common dhp-enrichment module 2020-05-11 17:32:06 +02:00