Enrico Ottonello
|
7e1c987370
|
Merge branch 'doiboost' of https://code-repo.d4science.org/D-Net/dnet-hadoop into doiboost
|
2020-05-08 14:49:50 +02:00 |
Enrico Ottonello
|
9d812788e4
|
added job to download from ORCID the records modified after a fixed date; the info is taken from last_modified.csv on HDFS
|
2020-05-08 14:49:39 +02:00 |
Miriam Baglioni
|
9a29ab7508
|
reverted to the readPath we had before
|
2020-05-08 13:08:56 +02:00 |
Miriam Baglioni
|
28556507e7
|
-
|
2020-05-08 12:54:52 +02:00 |
Claudio Atzori
|
b2192fdcdc
|
simplified reset_outputpath nodes across the workflows, applied common xml formatting
|
2020-05-08 12:33:31 +02:00 |
Miriam Baglioni
|
4c94231cad
|
merge with master fork
|
2020-05-08 12:25:57 +02:00 |
Miriam Baglioni
|
9b4c0d4b3a
|
-
|
2020-05-08 11:51:45 +02:00 |
Miriam Baglioni
|
53952707b6
|
modified test because of the new data preparation step. It now expects to find the ResultCountrySet serialization instead of DatasourceCountry
|
2020-05-08 11:49:19 +02:00 |
Claudio Atzori
|
62ea19f1d3
|
introduced mapping for ExternalReferences, made urls defined within an instance unique
|
2020-05-08 09:43:26 +02:00 |
Claudio Atzori
|
8c67073a07
|
force speculative execution to false
|
2020-05-08 09:42:21 +02:00 |
Miriam Baglioni
|
d6b9de9f46
|
Merge branch 'master' of https://code-repo.d4science.org/miriam.baglioni/dnet-hadoop
|
2020-05-07 18:22:59 +02:00 |
Miriam Baglioni
|
f95d288681
|
fixed switch of parameters
|
2020-05-07 18:22:32 +02:00 |
Claudio Atzori
|
166aafd936
|
heavy cleanup
|
2020-05-07 18:22:26 +02:00 |
Michele Artini
|
ac0da5a7ee
|
Partial implementation of broker events
|
2020-05-07 12:31:26 +02:00 |
Miriam Baglioni
|
fb405275f7
|
merged with master
|
2020-05-07 11:48:21 +02:00 |
Miriam Baglioni
|
e124278934
|
-
|
2020-05-07 11:47:11 +02:00 |
Claudio Atzori
|
5111671e62
|
cleanup
|
2020-05-07 11:47:00 +02:00 |
Miriam Baglioni
|
9f8855991c
|
changed Encoders.bean to Encoders.kryo
|
2020-05-07 11:44:35 +02:00 |
Miriam Baglioni
|
207b899d6d
|
merged with upstream
|
2020-05-07 11:43:53 +02:00 |
Claudio Atzori
|
e07feb4c5f
|
removed spurious file
|
2020-05-07 11:42:46 +02:00 |
Claudio Atzori
|
5b3f8a0e90
|
using Encoders.bean instead of kryo
|
2020-05-07 11:41:41 +02:00 |
Miriam Baglioni
|
182225becb
|
Merge branch 'master' of https://code-repo.d4science.org/miriam.baglioni/dnet-hadoop
|
2020-05-07 11:38:17 +02:00 |
Miriam Baglioni
|
5efae3acb9
|
new workflow for job3
|
2020-05-07 11:38:10 +02:00 |
Claudio Atzori
|
73243793b2
|
Dataset based implementation for SparkCountryPropagationJob3
|
2020-05-07 11:15:24 +02:00 |
Claudio Atzori
|
128c3bf1c8
|
restored Author bean with simple getter/setter, author pid addition moved into dedicated implementation SparkOrcidToResultFromSemRelJob3
|
2020-05-07 11:14:56 +02:00 |
Miriam Baglioni
|
b2fec32c87
|
new workflow for job3
|
2020-05-07 10:01:57 +02:00 |
Miriam Baglioni
|
29bc8c44b1
|
changes in the construction of new country set
|
2020-05-07 10:01:34 +02:00 |
Miriam Baglioni
|
55e825acd4
|
changed the test according to changes in SparkCountryPropagationJob2
|
2020-05-07 10:01:00 +02:00 |
Miriam Baglioni
|
16193cf0ba
|
new workflow and parameter for country propagation
|
2020-05-07 09:59:58 +02:00 |
Miriam Baglioni
|
5a476c7a13
|
changed the xquery for the cfhb table
|
2020-05-07 09:58:17 +02:00 |
Miriam Baglioni
|
42ad51577a
|
new implementation with one more serialization step
|
2020-05-07 09:57:49 +02:00 |
Claudio Atzori
|
17860d3ab6
|
general changes in the RAW graph mapping: missing collectedfrom/hostedby causes records to be skipped; factored out most of the constants in ModelConstants class (dhp-schemas)
|
2020-05-06 13:20:02 +02:00 |
Claudio Atzori
|
fdfecc9578
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-05-06 11:28:01 +02:00 |
Claudio Atzori
|
c79e2f5977
|
drop workingPath before starting the dedup workflow
|
2020-05-06 11:27:44 +02:00 |
Michele Artini
|
8f30a09d84
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-05-05 17:12:22 +02:00 |
Michele Artini
|
ccc609f909
|
new module for the production of broker events
|
2020-05-05 17:09:00 +02:00 |
Miriam Baglioni
|
dd2e698a72
|
added a sequentialization step on the spark job. Added new parameter
|
2020-05-05 17:03:43 +02:00 |
Claudio Atzori
|
0825321d0b
|
improved unit tests in dhp-aggregation
|
2020-05-05 12:39:04 +02:00 |
Miriam Baglioni
|
252b219dd5
|
changed the name of some properties
|
2020-05-05 10:03:32 +02:00 |
Claudio Atzori
|
4a8487165c
|
using long param names in wf definition
|
2020-05-04 19:19:29 +02:00 |
Claudio Atzori
|
a2fc37df5f
|
adjusted parameters
|
2020-05-04 19:18:59 +02:00 |
Claudio Atzori
|
f1b7e14036
|
code formatting
|
2020-05-04 19:18:34 +02:00 |
Claudio Atzori
|
405f495d54
|
code formatting
|
2020-05-04 19:18:12 +02:00 |
Claudio Atzori
|
c54d7ca18c
|
example measures in serialization test
|
2020-05-04 17:02:40 +02:00 |
Claudio Atzori
|
11938dac5e
|
this commit adds: validated/validationDate to relationships; measure type and simple unit test to indicate the relative serialization
|
2020-05-04 16:47:07 +02:00 |
Claudio Atzori
|
24d8d097b6
|
sync with master branch
|
2020-05-04 16:44:13 +02:00 |
Claudio Atzori
|
de5fbe325c
|
bits of javadoc
|
2020-05-04 16:00:48 +02:00 |
Miriam Baglioni
|
78578c3ccf
|
fixed wrong transition name in workflow
|
2020-05-04 15:46:24 +02:00 |
Miriam Baglioni
|
cc7d9b6b19
|
merge upstream
|
2020-05-04 13:59:09 +02:00 |
Miriam Baglioni
|
3957c815b9
|
changed the name of some parameters
|
2020-05-04 13:58:52 +02:00 |