Commit Graph

842 Commits

Author SHA1 Message Date
Miriam Baglioni dd2e698a72 added a sequentialization step on the spark job. Added new parameter 2020-05-05 17:03:43 +02:00
Claudio Atzori 0825321d0b improved unit tests in dhp-aggregation 2020-05-05 12:39:04 +02:00
Miriam Baglioni 252b219dd5 changed the name of some properties 2020-05-05 10:03:32 +02:00
Claudio Atzori 4a8487165c using long param names in wf definition 2020-05-04 19:19:29 +02:00
Claudio Atzori a2fc37df5f adjusted parameters 2020-05-04 19:18:59 +02:00
Claudio Atzori f1b7e14036 code formatting 2020-05-04 19:18:34 +02:00
Claudio Atzori 405f495d54 code formatting 2020-05-04 19:18:12 +02:00
Claudio Atzori de5fbe325c bits of javadoc 2020-05-04 16:00:48 +02:00
Miriam Baglioni 78578c3ccf fixed wrong transition name in workflow 2020-05-04 15:46:24 +02:00
Miriam Baglioni cc7d9b6b19 merge upstream 2020-05-04 13:59:09 +02:00
Miriam Baglioni 3957c815b9 changed the name of some parameters 2020-05-04 13:58:52 +02:00
Miriam Baglioni e218360f8a changed code for the mode of DbClient and also removed the dependency to graph-mapper 2020-05-04 12:26:17 +02:00
Miriam Baglioni 31ea05297d moved the DbClient to common and added needed dependency to pom 2020-05-04 12:22:28 +02:00
miconis 085cf173d7 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-05-04 12:08:20 +02:00
miconis 3df703f67d mergerels added to propagate relations 2020-05-04 12:08:12 +02:00
Claudio Atzori bac37b3973 fixed children expansion in XML records 2020-05-04 11:51:17 +02:00
Claudio Atzori 077ccd8743 stats wf properties cleanup 2020-05-04 11:41:46 +02:00
Miriam Baglioni b7dd400e51 added check if author.pid exists or is null 2020-05-01 15:09:02 +02:00
Miriam Baglioni dbf3ba051a minor 2020-04-30 20:22:07 +02:00
Miriam Baglioni 43053a286d workflow pom with added blacklist module 2020-04-30 18:30:21 +02:00
Miriam Baglioni 0631fe548a pom.xml 2020-04-30 18:29:46 +02:00
Miriam Baglioni 38ecfd5785 the wf with all the three steps for blacklisting relations 2020-04-30 18:28:46 +02:00
Miriam Baglioni 95433e1087 parameters for the preparation phase and blacklist phase 2020-04-30 18:28:13 +02:00
Miriam Baglioni 1070790c19 minor 2020-04-30 18:26:58 +02:00
Miriam Baglioni b9d56b3ced applies the actual removal of the relations 2020-04-30 18:26:25 +02:00
Miriam Baglioni d6d6ebeae5 preparation step: creates the subset of the merges relations 2020-04-30 18:25:33 +02:00
Miriam Baglioni 13f30664ea minor 2020-04-30 15:23:49 +02:00
Miriam Baglioni 276b95b7b3 add create file instruction 2020-04-30 15:05:17 +02:00
Miriam Baglioni 65a5d67b8b minor modifications 2020-04-30 14:45:27 +02:00
Miriam Baglioni 418595fec2 removed the saveGraph parameter 2020-04-30 14:45:00 +02:00
Miriam Baglioni ce8b1d0bc3 new workflow definition to be inserted in the provision pipeline 2020-04-30 14:38:54 +02:00
Miriam Baglioni 4b0bd91012 - 2020-04-30 12:45:28 +02:00
Miriam Baglioni 2349bfd8b8 changed the job test to remove the writeUpdate option 2020-04-30 11:43:33 +02:00
Miriam Baglioni 951517f9ec new input parameters and workflow definition to be used in the provision pipeline 2020-04-30 11:32:50 +02:00
Miriam Baglioni 026f297e49 removed the writeUpdate option 2020-04-30 11:31:59 +02:00
Miriam Baglioni c89fe762b1 modified relation datasource organization 2020-04-30 11:17:03 +02:00
Miriam Baglioni 3abb76ff7a merge with upstream 2020-04-30 11:15:54 +02:00
Michele Artini eb9bd42970 fixed a problem with journals 2020-04-30 11:06:05 +02:00
Miriam Baglioni 638a3c465b - 2020-04-30 11:05:17 +02:00
Michele Artini a0a6109bbc fixed a problem with journals 2020-04-30 11:03:46 +02:00
Miriam Baglioni 354f0162be changes in the blacklist and workflow definition 2020-04-30 10:26:50 +02:00
Miriam Baglioni 564e5d6279 added new information in support of blacklist reader 2020-04-30 10:22:58 +02:00
Claudio Atzori 439c6255a2 cleanup 2020-04-29 19:09:07 +02:00
Claudio Atzori 77ac995770 cleaned up poms, added descriptions 2020-04-29 18:44:17 +02:00
Miriam Baglioni 3cffee74b9 merge with upstream 2020-04-29 18:25:29 +02:00
Miriam Baglioni 9ab46535e7 pom with the new blacklist module added 2020-04-29 18:17:15 +02:00
Miriam Baglioni 6a47e6191d read from blacklist and write the result as relations on hdfs 2020-04-29 18:16:01 +02:00
Miriam Baglioni 869f576273 added hash map for relationship entityType id prefix, and relation inverse 2020-04-29 18:14:52 +02:00
Miriam Baglioni b85ad7012a reads the blacklist from the blacklist db and writes it as a set of relations on hdfs 2020-04-29 17:29:49 +02:00
Claudio Atzori 64d790a266 updated maven plugin dependencies 2020-04-29 16:56:18 +02:00