Commit Graph

2460 Commits

Author SHA1 Message Date
Claudio Atzori 646dab7f68 trying to avoid NPEs 2021-01-22 18:24:34 +01:00
Claudio Atzori 34d653de41 [Cleaning] updated cleaning rule for DOIs 2021-01-22 14:16:33 +01:00
miconis 8fea29177c refactoring, minor changes and implementation of the wf for openorgs with integration of organization phases into the scan wf 2021-01-18 16:48:08 +01:00
Dimitris 3e8d2a6b2d Clean workflows 2021-01-15 16:19:12 +02:00
Michele Artini f667e94a31 Merge pull request 'broker' (#88) from broker into master 2021-01-14 14:48:13 +01:00
Michele Artini cfbcdc95bc fixed a wf param 2021-01-14 14:45:23 +01:00
Michele Artini 69ba3203c0 fixed a conflict 2021-01-14 14:43:25 +01:00
Michele Artini fafb5b2e08 Merge branch 'broker' of code-repo.d4science.org:D-Net/dnet-hadoop into broker 2021-01-14 14:32:42 +01:00
Michele Artini b230d44411 fixed conflict 2021-01-14 14:32:31 +01:00
Michele Artini b9d90e95b8 Added eventId to ShortEventMessage 2021-01-14 14:32:31 +01:00
Michele Artini 64b0b0bfb3 fixed a bug with invalid subject topic 2021-01-14 14:32:31 +01:00
Michele Artini e3e0ab1de1 fixed a problem with join 2021-01-14 14:32:31 +01:00
Michele Artini 26a941315a openaireId 2021-01-14 14:32:31 +01:00
Michele Artini 6f4d1a37f0 ES wf properties 2021-01-14 14:32:31 +01:00
Michele Artini 1391341d06 mkdir of output dir 2021-01-14 14:32:31 +01:00
Michele Artini 3c9cbd19f3 whitelist of topics 2021-01-14 14:32:31 +01:00
Michele Artini 467aa77279 workingDir and outputDir 2021-01-14 14:32:31 +01:00
Michele Artini 10f3f7eca7 workingDir and outputDir 2021-01-14 14:32:31 +01:00
Michele Artini ff41a7b3a4 gzipped output 2021-01-14 14:32:31 +01:00
Michele Artini 223fa660cb fixed conflict 2021-01-14 14:23:44 +01:00
Michele Artini ac91e495fc Added eventId to ShortEventMessage 2021-01-14 13:20:35 +01:00
Claudio Atzori 80cf55ef2e [Broker] fixed partitionEventsByOpendoarIds workflow parameter names 2021-01-13 16:24:30 +01:00
Claudio Atzori 41500669e2 [BIP! Scores integration] merged missing classes from bipFinder branch 2021-01-11 14:39:47 +01:00
Claudio Atzori 2a7a10809e [BIP! Scores integration] merged missing classes from bipFinder branch 2021-01-11 10:05:02 +01:00
Claudio Atzori 5bd999efe7 Merge pull request 'bipFinder_master_test' (#84) from bipFinder_master_test into master 2021-01-08 18:16:34 +01:00
Claudio Atzori d6686dd7cf merged from master 2021-01-08 18:16:12 +01:00
Claudio Atzori 34229970e6 [BIP! Scores integration] Create updates as Result rather than subclasses; Result considers also metrics in the mergeFrom operation 2021-01-08 16:29:17 +01:00
Claudio Atzori 1361c9eb0c [BIP! Scores integration] Create updates as Result rather than subclasses; Result considers also metrics in the mergeFrom operation 2021-01-07 10:07:30 +01:00
Claudio Atzori ab2fe9266a [DOIBoost] minor fixes in workflow definition 2021-01-05 10:26:39 +01:00
Claudio Atzori 7c722f3fdc [DOIBoost] fixed typo 2021-01-05 10:25:54 +01:00
Claudio Atzori 8879704ba0 [DOIBoost] configurable ES server url and index name in crossref importer 2021-01-05 10:00:13 +01:00
Claudio Atzori 26e9d55c13 code formatting 2021-01-05 09:59:26 +01:00
Sandro La Bruzzo 7834a35768 avoid to save intermediate dataset before generation of Sequence file 2021-01-04 17:54:57 +01:00
Sandro La Bruzzo e79445a8b4 minor fix for claudio polemica 2021-01-04 17:39:25 +01:00
Sandro La Bruzzo 8765020b85 minor fix 2021-01-04 17:37:08 +01:00
Sandro La Bruzzo b0dc92786f defined a single oozie workflow for the generation of doiboost 2021-01-04 17:01:35 +01:00
Claudio Atzori 7185158942 ignore missing properties 2020-12-29 11:06:28 +01:00
Claudio Atzori 28460c2cd1 using com.fasterxml.jackson.databind.ObjectMapper instead of org.codehaus.jackson.map.ObjectMapper 2020-12-23 16:59:52 +01:00
Claudio Atzori 60649ac7d2 swapped expected and actual in tests, updated expected number of authors 2020-12-23 12:26:04 +01:00
Claudio Atzori 723b01f9e9 trivial: the less magic numbers and values around, the better 2020-12-23 12:22:48 +01:00
Claudio Atzori 6848d0c3d7 trivial: avoid duplicated code 2020-12-23 12:21:58 +01:00
Claudio Atzori d8b5f43a7e code formatting 2020-12-22 14:59:03 +01:00
Claudio Atzori 7bfc35df5e Merge pull request 'Changed typo in script names' (#82) from antonis.lempesis/dnet-hadoop:master into master
no need to! :)
2020-12-22 12:36:21 +01:00
Antonis Lempesis be5969a8c2 Changed typo in script names 2020-12-22 13:33:32 +02:00
miconis 794e22b09c bug fix in the authormerge: now authors with higher size have priority, normalization of author name fixed 2020-12-21 17:51:42 +01:00
miconis 1e1aab83e3 implementation of the raw wf for openorgs: still not complete, some functionalities are missing 2020-12-21 11:58:21 +01:00
Claudio Atzori 6cb0dc3f43 extended OCRID cleaning procedure 2020-12-21 11:40:17 +01:00
Claudio Atzori 573a8a3272 Merge pull request 'Changed typo in script names' (#81) from antonis.lempesis/dnet-hadoop:master into master
ok! LGTM
2020-12-18 17:44:26 +01:00
Antonis Lempesis 2a074c3b2b Changed typo in script names 2020-12-18 18:40:48 +02:00
Claudio Atzori 47270d9af5 lenient mock can be lenient 2020-12-18 15:38:59 +01:00