134 Commits

Author SHA1 Message Date
Claudio Atzori abe8fb69a2 added global properties, moved postprocessing script inside the oozie_app directory 2020-03-18 15:43:54 +01:00
Claudio Atzori 2f11e37602 fixed expansion of path variables 2020-03-17 19:41:07 +01:00
Claudio Atzori 2795b0b096 no need to mkdir the all_entities file 2020-03-17 17:22:14 +01:00
Claudio Atzori 19746ad308 when reuseContent, reset ${workingPath}/all_entities 2020-03-17 17:17:06 +01:00
Claudio Atzori 2f0c85eeb3 updated parameters for regular_all_steps workflow, introduced flag 'reuseContent' 2020-03-17 17:04:58 +01:00
Claudio Atzori b8290b5851 updated parameters for regular_all_steps workflow 2020-03-17 15:45:30 +01:00
Claudio Atzori 4706f24ec5 updated parameters for regular_all_steps workflow 2020-03-17 15:23:54 +01:00
Claudio Atzori 9c84e21b87 added workflow to migrate latest version of each actionset content from DM to OCEAN cluster, mapping the targetValues from the old protobuf data model to the dhp.OAF datamodel 2020-03-13 15:56:52 +01:00
Michele Artini b6efa9d6ab Configuration of the SequenceFile Writer 2020-03-05 15:49:14 +01:00
Michele Artini 4b29a121b0 migration using spark in step2 2020-03-02 16:12:14 +01:00
Michele Artini 5445a57102 migration using spark in step2 2020-03-02 16:11:59 +01:00
Michele Artini 93665773ea Fixed a problem with JavaRDD Union 2020-02-25 15:59:21 +01:00
Michele Artini 5d3739b5cf migration of claims 2020-02-19 15:11:17 +01:00
Michele Artini 173f1df1e5 saved a query for openaire production database 2020-02-19 10:15:08 +01:00
Sandro La Bruzzo 9a2d74ac82 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-02-19 10:13:45 +01:00
Sandro La Bruzzo e5d7cdf422 fixed sql query 2020-02-19 10:13:36 +01:00
Claudio Atzori 6a288625e5 fixed workflow outgoing node 2020-02-17 15:04:33 +01:00
Sandro La Bruzzo 76ee85141a added oozie job for DNET migration and implemented Spark job for extracting entities 2020-02-17 12:31:44 +01:00
Michele Artini cdea0dae75 bug fixing 2020-02-12 16:34:00 +01:00
Michele Artini 06c2fd6df9 bug fixing 2020-02-11 15:29:50 +01:00
Michele Artini 5fc09b179c bug fixing 2020-02-11 12:48:03 +01:00
Michele Artini 95740767e0 Ready for tests 2020-02-10 16:04:06 +01:00
Michele Artini 6bfe2dc96e partial implementation 2020-01-22 16:00:23 +01:00
Michele Artini cd114f1c3b partial update 2020-01-21 12:32:10 +01:00
Michele Artini 81f82b5d34 partial implementation of applications to migrate entities 2020-01-17 15:26:21 +01:00
Sandro La Bruzzo 5a8a323f2a dhp-collection-worker integrated in dhp-workflows 2019-10-24 11:36:59 +02:00
Claudio Atzori c7654b6fe3 renamed collection & transformation oozie workflow files 2019-10-18 09:42:20 +02:00
Claudio Atzori 27db5afdad integrating the oozie workflow build/deploy/run mechanism, took inspiration from iis 2019-10-17 18:38:30 +02:00
Sandro La Bruzzo bbb87d0e3d implemented saxonHE on transformation spark job 2019-10-10 11:33:51 +02:00
Sandro La Bruzzo 4b8c7c279d Added documentation on a class, and reused ArgumetApplicationParser on dhp-aggregation 2019-10-07 17:02:53 +02:00
Sandro La Bruzzo 403c13eebf Implemented message manager, Fixed bug on collection worker, implemented Collection and Transform spark job 2019-04-11 15:39:29 +02:00
Sandro La Bruzzo ded6aef5e1 moved collector worker 2019-04-03 16:05:16 +02:00
Sandro La Bruzzo 12c65eab4c implemented command line 2019-03-25 15:18:31 +01:00
Sandro La Bruzzo e67d9ee1a9 added first implementation of dnet-workflows 2019-03-18 10:44:35 +01:00