Commit Graph

2574 Commits

Author SHA1 Message Date
Claudio Atzori f6ccd54d87 Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids 2021-04-29 10:10:01 +02:00
Claudio Atzori 91e7220f20 cleaned up workflow for actionset migration, adjusted dnet|cnr* dependency versions 2021-04-29 10:09:52 +02:00
Enrico Ottonello c537986b7c deleted folders with merged data immediately before merge phases 2021-04-28 11:25:25 +02:00
Sandro La Bruzzo 2129e9caa7 updated pangaea transformation to parse directly the xml 2021-04-28 10:21:03 +02:00
Claudio Atzori 5afa7d3e0c core utilities in dhp-common moved in external module dhp-schemas 2021-04-27 15:44:01 +02:00
Claudio Atzori ac77a245a3 Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids 2021-04-27 14:05:00 +02:00
Claudio Atzori f783e60ff7 cleanup 2021-04-27 14:04:50 +02:00
Sandro La Bruzzo 63c0303137 removed unused import, add log 2021-04-27 12:17:23 +02:00
Sandro La Bruzzo 74484d2823 bug fixing 2021-04-27 12:13:44 +02:00
Claudio Atzori 233d849f90 added dnet45-bootstrap-snapshot and dnet45-bootstrap-release repositories 2021-04-27 12:03:40 +02:00
Claudio Atzori fcd13f5350 Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids 2021-04-27 11:37:45 +02:00
Claudio Atzori 4028176559 enabled snapshots from dnet45-snapshots repository 2021-04-27 11:37:32 +02:00
Sandro La Bruzzo c74b03d59c Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids 2021-04-27 11:31:07 +02:00
Sandro La Bruzzo 7f8848ecdd added first implementation of Pangaea Mapping 2021-04-27 11:30:37 +02:00
Claudio Atzori 27ab8a704d adjusted poms to align with the external dhp-schema module 2021-04-27 10:12:27 +02:00
Claudio Atzori a7cf449b36 cleanup 2021-04-27 10:11:26 +02:00
Claudio Atzori 82de6fb634 dhp-schema moved to dedicated module https://code-repo.d4science.org/D-Net/dhp-schemas 2021-04-27 10:10:50 +02:00
Claudio Atzori fa42026590 fixed PersonCleaner extension functions 2021-04-27 10:10:06 +02:00
Claudio Atzori ef4bfd82e2 code formatting 2021-04-27 10:09:31 +02:00
Claudio Atzori faa8f6f4e2 Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids 2021-04-27 09:57:03 +02:00
miconis 6d5c14e030 assertions updated in entity merger test 2021-04-27 09:47:49 +02:00
Claudio Atzori c2bb03c8b5 depending on external dhp-schemas module 2021-04-23 17:57:35 +02:00
Claudio Atzori c25238480c making ODF record parsing namespace unaware (#6629) 2021-04-23 17:34:57 +02:00
miconis d0e3366c34 Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids 2021-04-22 11:45:19 +02:00
miconis 3c12eeadce bug fix in propagation of relations 2021-04-22 11:44:33 +02:00
Claudio Atzori e5abbec2ba [orcid] download of the lambda file defined in a script 2021-04-22 11:22:10 +02:00
Claudio Atzori 55964cbd81 [orcid] large oozie workflow cleanup; updated workflow for the orcidnodoi actionset creation 2021-04-22 10:18:09 +02:00
Claudio Atzori 8f309b72ff [dedup] using node names consistently across the workflow 2021-04-21 17:54:51 +02:00
Claudio Atzori 52244f813a merging from enrico.ottonello/dnet-hadoop:orcid-no-doi 2021-04-21 12:24:09 +02:00
Sandro La Bruzzo fd29307b84 updated workflow name 2021-04-21 09:21:41 +02:00
Claudio Atzori 815b9f4d56 [openorgs dedup] fixed workflow parameter declarations. Introduced support for resuming the execution from intermediate steps 2021-04-20 17:24:45 +02:00
Claudio Atzori d0d477cca3 code formatting 2021-04-20 12:50:34 +02:00
miconis 0393cdce42 addition of alternative names in export queries 2021-04-20 12:45:21 +02:00
miconis cadd0a5de8 modification of the queries for openorgs: they now consider also pending orgs 2021-04-20 12:06:56 +02:00
Sandro La Bruzzo e06c7f32f6 updated id figshare as described in #6377 2021-04-20 10:18:07 +02:00
Sandro La Bruzzo dbe0d0378e resolved ticket #6377 2021-04-20 09:44:44 +02:00
Sandro La Bruzzo 524e5f3092 Improved parallelization on transformation wf on hadoop 2021-04-19 15:17:25 +02:00
Sandro La Bruzzo cdfe01bbae improved parallelization on transformation job 2021-04-19 15:14:52 +02:00
Sandro La Bruzzo 3ae67b7a1d Merge remote-tracking branch 'origin/stable_ids' into stable_ids 2021-04-16 17:36:57 +02:00
Sandro La Bruzzo a16e5299f9 applied unique function on the final dataset 2021-04-16 17:36:48 +02:00
Claudio Atzori 45057440c1 code formatting 2021-04-16 17:28:25 +02:00
Enrico Ottonello 34ca792a55 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi 2021-04-16 17:18:46 +02:00
Enrico Ottonello 27068aacd1 wf to move orcid-no-doi dataset on the folder ready the import 2021-04-16 17:17:47 +02:00
miconis 7ad573d023 bug fix: changed join in propagaterelations without applying filter on the id 2021-04-16 16:40:42 +02:00
Sandro La Bruzzo 67085da305 fixed NPE 2021-04-16 11:05:58 +02:00
Sandro La Bruzzo 644aa8f40c Merge remote-tracking branch 'origin/stable_ids' into stable_ids 2021-04-16 09:14:26 +02:00
Sandro La Bruzzo 7d6a80e2f2 added new type on MAG mapping 2021-04-16 09:14:15 +02:00
Claudio Atzori 8704d32780 code formatting 2021-04-15 16:52:58 +02:00
Claudio Atzori ba4b4c74d8 do not make the identifier prefix depend on the Handle 2021-04-15 16:48:26 +02:00
Claudio Atzori 906d50563c Merge pull request 'properly invalidating impala metadata' (#105) from antonis.lempesis/dnet-hadoop:master into master
Reviewed-on: #105
2021-04-15 15:06:22 +02:00