Claudio Atzori
|
c4a23c2f4d
|
fix: preserving the old identifier among the originalIds in the doiboost construction process, trying to avoid UnsupportedOperationException while adding elements to the originalIds
|
2021-05-19 16:01:52 +02:00 |
Claudio Atzori
|
ba03f549d7
|
fix: preserving the old identifier among the originalIds in the doiboost construction process
|
2021-05-19 15:43:26 +02:00 |
Claudio Atzori
|
239d0f0a9a
|
ROR actionset import workflow backported from branch stable_ids
|
2021-05-18 16:12:11 +02:00 |
Michele Artini
|
e56ccec536
|
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
|
2021-05-18 14:00:28 +02:00 |
Michele Artini
|
c1e20de7cf
|
fixed the deserialization of a json property
|
2021-05-18 14:00:14 +02:00 |
Claudio Atzori
|
a9f512103b
|
using constants from ModelConstants
|
2021-05-18 11:19:07 +02:00 |
Claudio Atzori
|
eeb8bcf075
|
using constants from ModelConstants
|
2021-05-18 11:10:07 +02:00 |
Claudio Atzori
|
2cbf15f4fb
|
using ModelConstants
|
2021-05-17 09:54:45 +02:00 |
Claudio Atzori
|
f19feceaf0
|
set the old identifier before switching to the new one
|
2021-05-14 12:53:40 +02:00 |
Claudio Atzori
|
1bd70fa2c6
|
preserving the old identifier among the originalIds in the doiboost construction process
|
2021-05-14 11:30:41 +02:00 |
Claudio Atzori
|
ca3f3a7687
|
using ModelConstants
|
2021-05-14 11:29:49 +02:00 |
Claudio Atzori
|
23b8883ab1
|
applied intellij code cleanup
|
2021-05-14 10:58:12 +02:00 |
Claudio Atzori
|
609eb711b3
|
IndexRecordTransformerTest for producing a record that can be manually submitted to solr
|
2021-05-13 16:13:28 +02:00 |
Claudio Atzori
|
1517bf7c92
|
IndexRecordTransformerTest for producing a record that can be manually submitted to solr
|
2021-05-13 16:11:22 +02:00 |
Sandro La Bruzzo
|
d9a0bbda7b
|
implemented new phase in doiboost to make the dataset Distinct by ID
|
2021-05-13 12:25:14 +02:00 |
Sandro La Bruzzo
|
6424cd9062
|
Added passing of the following parameters:
-varDataSourceId
-varOfficialName
in Each transformation Rule
|
2021-05-11 15:17:38 +02:00 |
Sandro La Bruzzo
|
073dcea2aa
|
Added passing of the following parameters:
-varDataSourceId
-varOfficialName
in Each transformation Rule
|
2021-05-11 15:05:58 +02:00 |
Claudio Atzori
|
d4c3476152
|
mapping datasource.journal only when an issn is available, null otherwise
|
2021-05-11 11:08:54 +02:00 |
Claudio Atzori
|
da9d6f3887
|
mapping datasource.journal only when an issn is available, null otherwise
|
2021-05-11 10:45:30 +02:00 |
Sandro La Bruzzo
|
54217d73ff
|
removed old parameters from oozie workflow
|
2021-05-11 09:59:02 +02:00 |
Claudio Atzori
|
d1cbee8413
|
imported methods from CleaningFunctions, defined in GraphCleaningFunctions
|
2021-05-10 16:43:39 +02:00 |
Claudio Atzori
|
3797543600
|
MDStoreManager model classes moved in dhp-schemas
|
2021-05-10 14:32:05 +02:00 |
Claudio Atzori
|
25254885b9
|
[ActionManagement] reduced number of xqueries used to access ActionSet info
|
2021-05-07 17:32:03 +02:00 |
Claudio Atzori
|
8a0de2fc18
|
[ActionManagement] reduced number of xqueries used to access ActionSet info
|
2021-05-07 17:31:32 +02:00 |
Sandro La Bruzzo
|
7dc824fc23
|
imported changes in stable_id into master
|
2021-05-07 12:53:50 +02:00 |
Michele Artini
|
d82071ba6c
|
originalId with prefix
|
2021-05-06 15:34:48 +02:00 |
Claudio Atzori
|
d4a30fabe3
|
clean up tests
|
2021-05-05 17:28:15 +02:00 |
Claudio Atzori
|
dccaf173cf
|
fixed mapping applied to ODF records. Added unit test to verify the mapping for OpenTrials
|
2021-05-05 16:36:15 +02:00 |
Claudio Atzori
|
8c96a82a03
|
fixed mapping applied to ODF records. Added unit test to verify the mapping for OpenTrials
|
2021-05-05 15:30:06 +02:00 |
Claudio Atzori
|
2e1eb96f9a
|
code formatting
|
2021-05-05 11:23:57 +02:00 |
Sandro La Bruzzo
|
1adfc41d23
|
merged manually changes on stable_id for doiboost into master
|
2021-05-05 10:23:32 +02:00 |
Claudio Atzori
|
fb930b84d3
|
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
|
2021-05-04 18:06:30 +02:00 |
Claudio Atzori
|
923d19ea8e
|
mdstore read lock/unlock when bulk copying records from mongodb to hdfs
|
2021-05-04 18:06:21 +02:00 |
Sandro La Bruzzo
|
714b71bd21
|
updated pubmed
|
2021-05-04 14:54:12 +02:00 |
Claudio Atzori
|
ba86835951
|
using common constants from ModelConstants
|
2021-05-04 11:51:52 +02:00 |
Michele Artini
|
f4bd2b5619
|
revert file SparkDedupTest.java
|
2021-05-04 10:26:14 +02:00 |
Michele Artini
|
b4877da363
|
Merge branch 'stable_ids' into prepare_ror_actionset
|
2021-05-03 08:13:55 +02:00 |
Alessia Bardi
|
9a20057615
|
fixed query for organisations' pids
|
2021-04-29 15:23:39 +02:00 |
Michele Artini
|
6692128234
|
Merge branch 'stable_ids' into prepare_ror_actionset
|
2021-04-29 13:24:08 +02:00 |
Alessia Bardi
|
a801999e75
|
fixed query for organisations' pids
|
2021-04-29 12:18:42 +02:00 |
Michele Artini
|
a278d67175
|
parse input file
|
2021-04-29 11:34:47 +02:00 |
Claudio Atzori
|
f6ccd54d87
|
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
|
2021-04-29 10:10:01 +02:00 |
Claudio Atzori
|
91e7220f20
|
cleaned up workflow for actionset migration, adjusted dnet|cnr* dependency versions
|
2021-04-29 10:09:52 +02:00 |
Michele Artini
|
f77ba34126
|
pid types
|
2021-04-29 09:50:05 +02:00 |
Michele Artini
|
7c5cd86927
|
annotations and tests
|
2021-04-29 09:29:19 +02:00 |
Michele Artini
|
b5cf505cc6
|
partial implementation of the ROR->actionset workflow
|
2021-04-28 16:00:24 +02:00 |
Enrico Ottonello
|
c537986b7c
|
deleted folders with merged data immediately before merge phases
|
2021-04-28 11:25:25 +02:00 |
Sandro La Bruzzo
|
2129e9caa7
|
updated pangaea transformation to parse directly the xml
|
2021-04-28 10:21:03 +02:00 |
Claudio Atzori
|
5afa7d3e0c
|
core utilities in dhp-common moved in external module dhp-schemas
|
2021-04-27 15:44:01 +02:00 |
Alessia Bardi
|
e6075bb917
|
updated json schema for results - added instances and accessright definition
|
2021-04-27 15:15:08 +02:00 |
Sandro La Bruzzo
|
63c0303137
|
removed unused import, add log
|
2021-04-27 12:17:23 +02:00 |
Sandro La Bruzzo
|
74484d2823
|
bug fixing
|
2021-04-27 12:13:44 +02:00 |
Sandro La Bruzzo
|
c74b03d59c
|
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
|
2021-04-27 11:31:07 +02:00 |
Sandro La Bruzzo
|
7f8848ecdd
|
added first implementation of Pangaea Mapping
|
2021-04-27 11:30:37 +02:00 |
Claudio Atzori
|
27ab8a704d
|
adjusted poms to align with the external dhp-schema module
|
2021-04-27 10:12:27 +02:00 |
Claudio Atzori
|
a7cf449b36
|
cleanup
|
2021-04-27 10:11:26 +02:00 |
Claudio Atzori
|
fa42026590
|
fixed PersonCleaner extension functions
|
2021-04-27 10:10:06 +02:00 |
Claudio Atzori
|
ef4bfd82e2
|
code formatting
|
2021-04-27 10:09:31 +02:00 |
Claudio Atzori
|
faa8f6f4e2
|
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
|
2021-04-27 09:57:03 +02:00 |
miconis
|
6d5c14e030
|
assertions updated in entity merger test
|
2021-04-27 09:47:49 +02:00 |
Claudio Atzori
|
c2bb03c8b5
|
depending on external dhp-schemas module
|
2021-04-23 17:57:35 +02:00 |
Claudio Atzori
|
7ed107be53
|
depending on external dhp-schemas module
|
2021-04-23 17:52:36 +02:00 |
Claudio Atzori
|
c25238480c
|
making ODF record parsing namespace unaware (#6629)
|
2021-04-23 17:34:57 +02:00 |
Claudio Atzori
|
99cfb027fa
|
making ODF record parsing namespace unaware (#6629)
|
2021-04-23 17:09:36 +02:00 |
Miriam Baglioni
|
72e5aa3b42
|
refactoring
|
2021-04-23 12:10:30 +02:00 |
Miriam Baglioni
|
7d1b8b7f64
|
merge upstream
|
2021-04-23 11:55:49 +02:00 |
miconis
|
d0e3366c34
|
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
|
2021-04-22 11:45:19 +02:00 |
miconis
|
3c12eeadce
|
bug fix in propagation of relations
|
2021-04-22 11:44:33 +02:00 |
Claudio Atzori
|
e5abbec2ba
|
[orcid] download of the lambda file defined in a script
|
2021-04-22 11:22:10 +02:00 |
Claudio Atzori
|
55964cbd81
|
[orcid] large oozie workflow cleanup; updated workflow for the orcidnodoi actionset creation
|
2021-04-22 10:18:09 +02:00 |
Claudio Atzori
|
8f309b72ff
|
[dedup] using node names consistently across the workflow
|
2021-04-21 17:54:51 +02:00 |
Claudio Atzori
|
52244f813a
|
merging from enrico.ottonello/dnet-hadoop:orcid-no-doi
|
2021-04-21 12:24:09 +02:00 |
Sandro La Bruzzo
|
fd29307b84
|
updated workflow name
|
2021-04-21 09:21:41 +02:00 |
Claudio Atzori
|
815b9f4d56
|
[openorgs dedup] fixed workflow parameter declarations. Introduced support for resuming the execution from intermediate steps
|
2021-04-20 17:24:45 +02:00 |
Claudio Atzori
|
d0d477cca3
|
code formatting
|
2021-04-20 12:50:34 +02:00 |
miconis
|
0393cdce42
|
addition of alternative names in export queries
|
2021-04-20 12:45:21 +02:00 |
miconis
|
cadd0a5de8
|
modification of the queries for openorgs: they now consider also pending orgs
|
2021-04-20 12:06:56 +02:00 |
Sandro La Bruzzo
|
e06c7f32f6
|
updated id figshare as described in #6377
|
2021-04-20 10:18:07 +02:00 |
Sandro La Bruzzo
|
dbe0d0378e
|
resolved ticket #6377
|
2021-04-20 09:44:44 +02:00 |
Sandro La Bruzzo
|
524e5f3092
|
Improved parallelization on transformation wf on hadoop
|
2021-04-19 15:17:25 +02:00 |
Sandro La Bruzzo
|
cdfe01bbae
|
improved parallelization on transformation job
|
2021-04-19 15:14:52 +02:00 |
Sandro La Bruzzo
|
3ae67b7a1d
|
Merge remote-tracking branch 'origin/stable_ids' into stable_ids
|
2021-04-16 17:36:57 +02:00 |
Sandro La Bruzzo
|
a16e5299f9
|
applied unique function on the final dataset
|
2021-04-16 17:36:48 +02:00 |
Claudio Atzori
|
45057440c1
|
code formatting
|
2021-04-16 17:28:25 +02:00 |
Enrico Ottonello
|
34ca792a55
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi
|
2021-04-16 17:18:46 +02:00 |
Enrico Ottonello
|
27068aacd1
|
wf to move orcid-no-doi dataset on the folder ready the import
|
2021-04-16 17:17:47 +02:00 |
miconis
|
7ad573d023
|
bug fix: changed join in propagaterelations without applying filter on the id
|
2021-04-16 16:40:42 +02:00 |
Sandro La Bruzzo
|
67085da305
|
fixed NPE
|
2021-04-16 11:05:58 +02:00 |
Sandro La Bruzzo
|
644aa8f40c
|
Merge remote-tracking branch 'origin/stable_ids' into stable_ids
|
2021-04-16 09:14:26 +02:00 |
Sandro La Bruzzo
|
7d6a80e2f2
|
added new type on MAG mapping
|
2021-04-16 09:14:15 +02:00 |
Claudio Atzori
|
906d50563c
|
Merge pull request 'properly invalidating impala metadata' (#105) from antonis.lempesis/dnet-hadoop:master into master
Reviewed-on: D-Net/dnet-hadoop#105
|
2021-04-15 15:06:22 +02:00 |
Claudio Atzori
|
3d58f95522
|
[stats update] properly invalidating impala metadata
|
2021-04-15 15:03:05 +02:00 |
Antonis Lempesis
|
03d36fadea
|
properly invalidating impala metadata
|
2021-04-15 13:34:22 +03:00 |
miconis
|
f64e57c112
|
refactoring of the id generation, sparkcreatemergerels collects entities to create root id after a join
|
2021-04-15 10:59:24 +02:00 |
miconis
|
176a5e493d
|
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
|
2021-04-14 18:06:34 +02:00 |
miconis
|
3525a8f504
|
id generation of representative record moved to the SparkCreateMergeRel job
|
2021-04-14 18:06:07 +02:00 |
Sandro La Bruzzo
|
3f77bfceb0
|
fixed test failure on jenkins
|
2021-04-14 10:03:01 +02:00 |
Claudio Atzori
|
3125cef545
|
code formatting
|
2021-04-14 09:11:54 +02:00 |
Sandro La Bruzzo
|
44a0064df6
|
Merge remote-tracking branch 'origin/stable_ids' into stable_ids
|
2021-04-13 17:48:12 +02:00 |
Sandro La Bruzzo
|
479abd10cb
|
Add into ORCID workflow a method that extracts orcid directly to the dump generated by Enrico
|
2021-04-13 17:47:43 +02:00 |