Michele Artini
|
0fda2c3a30
|
some tests on db records
|
2020-03-25 09:43:58 +01:00 |
miconis
|
02320de371
|
minor changes
|
2020-03-24 17:43:51 +01:00 |
miconis
|
8e8b5e8f30
|
roots wf merged in scan wf
|
2020-03-24 17:40:58 +01:00 |
Claudio Atzori
|
51ff68db66
|
Merge branch 'dedupTest' of https://code-repo.d4science.org/D-Net/dnet-hadoop into dedupTest
|
2020-03-24 11:18:19 +01:00 |
Claudio Atzori
|
1e869e7bed
|
using method available from currently used library
|
2020-03-24 11:17:44 +01:00 |
miconis
|
f0d72b76a8
|
package structure fixed
|
2020-03-24 10:51:40 +01:00 |
Claudio Atzori
|
aaedbb1b8b
|
WIP: dedup workflow, stage 2
|
2020-03-24 09:59:28 +01:00 |
Michele Artini
|
e3760c7f39
|
fix a bug with organization countries
|
2020-03-24 08:43:56 +01:00 |
Claudio Atzori
|
8b0ba3d76a
|
postprocessing script correctly run as hive2 action
|
2020-03-23 17:40:39 +01:00 |
miconis
|
93e2291291
|
minor changes
|
2020-03-23 17:17:56 +01:00 |
miconis
|
f7890a90df
|
implementation of the mechanism that checks the existence of a mergerel file
|
2020-03-23 17:13:30 +01:00 |
miconis
|
c20e179f5a
|
structure of the workflows updated
|
2020-03-23 11:43:49 +01:00 |
Claudio Atzori
|
658d40ccbe
|
WIP trying to use hive2 actions
|
2020-03-23 11:14:54 +01:00 |
Claudio Atzori
|
ecb64e4998
|
Merge branch 'migration_wfs_regular_all_steps'
|
2020-03-23 08:57:01 +01:00 |
Michele Artini
|
15160032bd
|
fixed a bug setting some organization fields
|
2020-03-23 08:39:14 +01:00 |
Claudio Atzori
|
a4c52661a0
|
WIP: fixing dedup workflows
|
2020-03-20 19:17:24 +01:00 |
Claudio Atzori
|
6cb0a9bff0
|
dedup wf directory structure aligned with project commons
|
2020-03-20 16:48:14 +01:00 |
miconis
|
e16e644faf
|
implementation of the workflow for entity update and for relations update
|
2020-03-20 13:01:56 +01:00 |
przemek
|
638b78f96a
|
Merge remote-tracking branch 'origin/master' into przemyslawjacewicz_actionmanager_impl_prototype
|
2020-03-19 15:12:56 +01:00 |
miconis
|
6d879e2ee1
|
integration of the new AtomicAction class
|
2020-03-19 15:10:42 +01:00 |
miconis
|
6e0fb8efa0
|
minor changes
|
2020-03-19 15:08:03 +01:00 |
miconis
|
4e82a24af2
|
minor changes and implementation of the create connected components action
|
2020-03-19 15:01:07 +01:00 |
Claudio Atzori
|
36236dd1c1
|
action migration workflow produces eu.dnetlib.dhp.schema.action.AtomicAction(s)
|
2020-03-19 14:00:38 +01:00 |
Claudio Atzori
|
a0ab15a64c
|
need to stick with guava:11.0.2 as it is the version used by the hadoop components (oozie client for sure). The latest version (28.2-jre) breaks the oozie workflow submission
|
2020-03-19 13:58:58 +01:00 |
Sandro La Bruzzo
|
0594b92a6d
|
implemented relation with dataset
|
2020-03-19 11:11:07 +01:00 |
Claudio Atzori
|
1850a02ae4
|
added simpler, AtomicAction replacement, based on the dhp.Oaf model
|
2020-03-19 10:44:16 +01:00 |
miconis
|
679b5869e5
|
implementation of the lookup procedure to take dedup conf from the resource profiles
|
2020-03-18 17:41:56 +01:00 |
Claudio Atzori
|
abe8fb69a2
|
added global properties, moved postprocessing script inside the oozie_app directory
|
2020-03-18 15:43:54 +01:00 |
miconis
|
f32eae5ce9
|
implementation of the spark action for the simrel creation
|
2020-03-18 14:27:49 +01:00 |
Claudio Atzori
|
c7e0730720
|
compress the output produced by migration steps 1 and 2
|
2020-03-18 09:34:57 +01:00 |
Claudio Atzori
|
2f11e37602
|
fixed expansion of path variables
|
2020-03-17 19:41:07 +01:00 |
Claudio Atzori
|
2795b0b096
|
no need to mkdir the all_entities file
|
2020-03-17 17:22:14 +01:00 |
Claudio Atzori
|
19746ad308
|
when reuseContent, reset ${workingPath}/all_entities
|
2020-03-17 17:17:06 +01:00 |
Claudio Atzori
|
2f0c85eeb3
|
updated parameters for regular_all_steps workflow, introduced flag 'reuseContent'
|
2020-03-17 17:04:58 +01:00 |
Claudio Atzori
|
b8290b5851
|
updated parameters for regular_all_steps workflow
|
2020-03-17 15:45:30 +01:00 |
Claudio Atzori
|
4706f24ec5
|
updated parameters for regular_all_steps workflow
|
2020-03-17 15:23:54 +01:00 |
Claudio Atzori
|
aeb01fa353
|
reading from newline delimited json textfiles instead of sequence files
|
2020-03-17 11:57:24 +01:00 |
Claudio Atzori
|
af835f2f98
|
when migrating actionsets from DM cluster, populate the AtomicAction.targetValue when empty (dedup similarities)
|
2020-03-15 18:07:59 +01:00 |
Claudio Atzori
|
9c84e21b87
|
added workflow to migrate latest version of each actionset content from DM to OCEAN cluster, mapping the targetValues from the old protobuf data model to the dhp.OAF datamodel
|
2020-03-13 15:56:52 +01:00 |
Claudio Atzori
|
8fe7ae1482
|
xml formatting
|
2020-03-13 15:53:56 +01:00 |
Claudio Atzori
|
23a929177d
|
updates to the graph require this to be an actual class
|
2020-03-13 14:56:35 +01:00 |
Przemysław Jacewicz
|
d0c9b0cdd6
|
WIP promote job functions updated
|
2020-03-13 12:36:42 +01:00 |
Przemysław Jacewicz
|
8d9b3c5de2
|
WIP action payload mapping into OAF type moved, (local) graph table name enum created, tests fixed
|
2020-03-13 10:01:39 +01:00 |
Przemysław Jacewicz
|
5cc560c7e5
|
Removed unnecessary dependency on old OAF model
|
2020-03-13 09:57:46 +01:00 |
Sandro La Bruzzo
|
addaaa091f
|
migrate relation from RDD to Dataset
|
2020-03-13 09:13:20 +01:00 |
Przemysław Jacewicz
|
3f24593e51
|
WIP: promote job tests and test resources implementation snapshot
|
2020-03-11 17:06:29 +01:00 |
Przemysław Jacewicz
|
2e996d610f
|
WIP: promote job functions implementation snapshot
|
2020-03-11 17:02:57 +01:00 |
Przemysław Jacewicz
|
cc63cdc9e6
|
WIP: promote job implementation snapshot
|
2020-03-11 17:02:06 +01:00 |
Przemysław Jacewicz
|
69540f6f78
|
Serialization-safe supplier added
|
2020-03-11 16:59:05 +01:00 |
Przemysław Jacewicz
|
e6e214dab5
|
Oaf merge and get strategy added
|
2020-03-11 16:58:17 +01:00 |