Sandro La Bruzzo
|
98b9498b57
|
Removed the old, barely used messaging system from the collection and Transformation workflow
code refactor
|
2021-01-28 09:51:17 +01:00 |
Sandro La Bruzzo
|
184e7b3856
|
Implemented new Transformation using spark
|
2021-01-27 15:43:08 +01:00 |
Sandro La Bruzzo
|
a54848a59c
|
Moved Vocabulary stuff to common module
|
2021-01-25 15:43:04 +01:00 |
Sandro La Bruzzo
|
ffb092b8d3
|
removed duplicate code HttpConnector.java
|
2021-01-25 15:05:37 +01:00 |
Sandro La Bruzzo
|
cda210a2ca
|
changed documentation since it didn't reflect the current status
|
2021-01-25 14:17:42 +01:00 |
Claudio Atzori
|
07a0ccfc96
|
[Cleaning] trying to avoid NPEs
|
2021-01-25 13:36:01 +01:00 |
Claudio Atzori
|
646dab7f68
|
trying to avoid NPEs
|
2021-01-22 18:24:34 +01:00 |
Claudio Atzori
|
34d653de41
|
[Cleaning] updated cleaning rule for DOIs
|
2021-01-22 14:16:33 +01:00 |
Michele Artini
|
f667e94a31
|
Merge pull request 'broker' (#88) from broker into master
|
2021-01-14 14:48:13 +01:00 |
Michele Artini
|
cfbcdc95bc
|
fixed a wf param
|
2021-01-14 14:45:23 +01:00 |
Michele Artini
|
69ba3203c0
|
fixed a conflict
|
2021-01-14 14:43:25 +01:00 |
Michele Artini
|
fafb5b2e08
|
Merge branch 'broker' of code-repo.d4science.org:D-Net/dnet-hadoop into broker
|
2021-01-14 14:32:42 +01:00 |
Michele Artini
|
b230d44411
|
fixed conflict
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
b9d90e95b8
|
Added eventId to ShortEventMessage
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
64b0b0bfb3
|
fixed a bug with invalid subject topic
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
e3e0ab1de1
|
fixed a problem with join
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
26a941315a
|
openaireId
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
6f4d1a37f0
|
ES wf properties
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
1391341d06
|
mkdir of output dir
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
3c9cbd19f3
|
whitelist of topics
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
467aa77279
|
workingDir and outputDir
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
10f3f7eca7
|
workingDir and outputDir
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
ff41a7b3a4
|
gzipped output
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
223fa660cb
|
fixed conflict
|
2021-01-14 14:23:44 +01:00 |
Michele Artini
|
ac91e495fc
|
Added eventId to ShortEventMessage
|
2021-01-14 13:20:35 +01:00 |
Claudio Atzori
|
80cf55ef2e
|
[Broker] fixed partitionEventsByOpendoarIds workflow parameter names
|
2021-01-13 16:24:30 +01:00 |
Claudio Atzori
|
41500669e2
|
[BIP! Scores integration] merged missing classes from bipFinder branch
|
2021-01-11 14:39:47 +01:00 |
Claudio Atzori
|
2a7a10809e
|
[BIP! Scores integration] merged missing classes from bipFinder branch
|
2021-01-11 10:05:02 +01:00 |
Claudio Atzori
|
5bd999efe7
|
Merge pull request 'bipFinder_master_test' (#84) from bipFinder_master_test into master
|
2021-01-08 18:16:34 +01:00 |
Claudio Atzori
|
d6686dd7cf
|
merged from master
|
2021-01-08 18:16:12 +01:00 |
Claudio Atzori
|
34229970e6
|
[BIP! Scores integration] Create updates as Result rather than subclasses; Result considers also metrics in the mergeFrom operation
|
2021-01-08 16:29:17 +01:00 |
Claudio Atzori
|
1361c9eb0c
|
[BIP! Scores integration] Create updates as Result rather than subclasses; Result considers also metrics in the mergeFrom operation
|
2021-01-07 10:07:30 +01:00 |
Claudio Atzori
|
ab2fe9266a
|
[DOIBoost] minor fixes in workflow definition
|
2021-01-05 10:26:39 +01:00 |
Claudio Atzori
|
7c722f3fdc
|
[DOIBoost] fixed typo
|
2021-01-05 10:25:54 +01:00 |
Claudio Atzori
|
8879704ba0
|
[DOIBoost] configurable ES server url and index name in crossref importer
|
2021-01-05 10:00:13 +01:00 |
Claudio Atzori
|
26e9d55c13
|
code formatting
|
2021-01-05 09:59:26 +01:00 |
Sandro La Bruzzo
|
7834a35768
|
avoid saving the intermediate dataset before generating the Sequence file
|
2021-01-04 17:54:57 +01:00 |
Sandro La Bruzzo
|
e79445a8b4
|
minor fix to address Claudio's objection
|
2021-01-04 17:39:25 +01:00 |
Sandro La Bruzzo
|
8765020b85
|
minor fix
|
2021-01-04 17:37:08 +01:00 |
Sandro La Bruzzo
|
b0dc92786f
|
defined a single oozie workflow for the generation of doiboost
|
2021-01-04 17:01:35 +01:00 |
Claudio Atzori
|
7185158942
|
ignore missing properties
|
2020-12-29 11:06:28 +01:00 |
Claudio Atzori
|
28460c2cd1
|
using com.fasterxml.jackson.databind.ObjectMapper instead of org.codehaus.jackson.map.ObjectMapper
|
2020-12-23 16:59:52 +01:00 |
Claudio Atzori
|
60649ac7d2
|
swapped expected and actual in tests, updated expected number of authors
|
2020-12-23 12:26:04 +01:00 |
Claudio Atzori
|
723b01f9e9
|
trivial: the less magic numbers and values around, the better
|
2020-12-23 12:22:48 +01:00 |
Claudio Atzori
|
6848d0c3d7
|
trivial: avoid duplicated code
|
2020-12-23 12:21:58 +01:00 |
Claudio Atzori
|
d8b5f43a7e
|
code formatting
|
2020-12-22 14:59:03 +01:00 |
Claudio Atzori
|
7bfc35df5e
|
Merge pull request 'Changed typo in script names' (#82) from antonis.lempesis/dnet-hadoop:master into master
no need to! :)
|
2020-12-22 12:36:21 +01:00 |
Antonis Lempesis
|
be5969a8c2
|
Changed typo in script names
|
2020-12-22 13:33:32 +02:00 |
miconis
|
794e22b09c
|
bug fix in the authormerge: authors with higher size now have priority; fixed normalization of author names
|
2020-12-21 17:51:42 +01:00 |
Claudio Atzori
|
6cb0dc3f43
|
extended ORCID cleaning procedure
|
2020-12-21 11:40:17 +01:00 |