Commit Graph

38 Commits

Author SHA1 Message Date
Claudio Atzori 5111671e62 cleanup 2020-05-07 11:47:00 +02:00
Claudio Atzori 5b3f8a0e90 using Encoders.bean instead of kryo 2020-05-07 11:41:41 +02:00
Claudio Atzori 73243793b2 Dataset based implementation for SparkCountryPropagationJob3 2020-05-07 11:15:24 +02:00
Miriam Baglioni 29bc8c44b1 changes in the construction of new country set 2020-05-07 10:01:34 +02:00
Miriam Baglioni 16193cf0ba new workflow and parameter for country propagation 2020-05-07 09:59:58 +02:00
Miriam Baglioni 42ad51577a new implementation with one more serialization step 2020-05-07 09:57:49 +02:00
Miriam Baglioni dd2e698a72 added a sequentialization step on the spark job. Added new parameter 2020-05-05 17:03:43 +02:00
Miriam Baglioni 638a3c465b - 2020-04-30 11:05:17 +02:00
Miriam Baglioni 95a54d5460 removed the writeUpdate option. The update is available in the preparedInfo path 2020-04-27 10:30:32 +02:00
Miriam Baglioni fa2ff5c6f5 refactoring 2020-04-23 11:58:26 +02:00
Miriam Baglioni b46b080ddc use mergeFrom method call to add the country(ies) instead of modifying the result directly. 2020-04-17 16:50:54 +02:00
Miriam Baglioni c4987dd12a minor 2020-04-17 16:49:08 +02:00
Miriam Baglioni fd5d792e35 refactoring 2020-04-16 15:53:34 +02:00
Miriam Baglioni 3577219127 removed unneeded classes 2020-04-15 12:45:49 +02:00
Miriam Baglioni 27f1d3ee8f minor refactoring 2020-04-15 12:21:05 +02:00
Miriam Baglioni 8f12292daa changed the way to save the results on filesystem 2020-04-11 16:47:34 +02:00
Miriam Baglioni aef9b3aa90 new parametric implementation of country propagation. Exploits information computed beforehand and broadcasts it to each executor 2020-04-11 16:36:59 +02:00
Miriam Baglioni a2d833d5dd data preparation step before the actual country propagation takes place 2020-04-11 16:36:03 +02:00
Miriam Baglioni 6897c920a2 classes in support of new implementation of country propagation 2020-04-11 16:35:26 +02:00
Miriam Baglioni 627ad58a8b new wf definition 2020-04-09 11:33:19 +02:00
Miriam Baglioni a2d309545b new parametrized implementation for country propagation 2020-04-08 19:12:59 +02:00
Miriam Baglioni 6dfdba9ef7 new parametrized implementation for country propagation 2020-04-08 18:14:37 +02:00
Miriam Baglioni 03f7cb6402 new parametrized implementation for country propagation 2020-04-08 18:08:41 +02:00
Miriam Baglioni 540da4ab61 new business logic with prepared info before actual job run 2020-04-08 13:04:04 +02:00
Miriam Baglioni 2afe971816 new implementation for country propagation 2020-04-08 10:49:09 +02:00
Miriam Baglioni 9418e3d4fa read dataset from files instead of using hive tables 2020-03-23 17:09:27 +01:00
Miriam Baglioni 8ab8b6b0bf minor 2020-03-23 14:35:23 +01:00
Miriam Baglioni a440152b46 refactoring 2020-03-23 14:30:56 +01:00
Miriam Baglioni 47561f3597 changed the implementation from rdd to dataset obtained from sql queries (on hive) 2020-03-23 11:58:32 +01:00
Miriam Baglioni 67ea3cf3ed changed the way to read the file with info on resource or relation. From sequenceFile to textFile 2020-03-17 16:32:05 +01:00
Miriam Baglioni b50166b9ad None 2020-02-28 18:25:28 +01:00
Miriam Baglioni ab84163bb3 added set accumulator in TypedRow and used it to accumulate country information in Country Propagation 2020-02-19 15:02:50 +01:00
Miriam Baglioni b736a9581c changed relclass and reltype in relation specification for country propagation and implementation of propagation of result affiliation through institutional repositories 2020-02-18 17:27:28 +01:00
Miriam Baglioni e0a777028a fix problem in parameters 2020-02-18 17:23:34 +01:00
Miriam Baglioni bd0e504b42 changes to the wf configuration 2020-02-17 18:04:15 +01:00
Miriam Baglioni 3a9d723655 adding default parameters in code 2020-02-17 16:30:52 +01:00
Miriam Baglioni a5517eee35 adding the mkdirs for creation of propagation folder under provision on tmp 2020-02-17 14:20:42 +01:00
Miriam Baglioni c7bc73aedf country propagation for results collected from institutional repositories 2020-02-17 11:44:48 +01:00