Miriam Baglioni
cf758f4f91
added normalization step for the doi
2021-06-30 10:03:15 +02:00
Miriam Baglioni
801763a0fa
there is no longer any need to lowercase the doi since it is done in the first step. Also changed the creation of the id to use the factory
2021-06-29 19:07:23 +02:00
Miriam Baglioni
a74de1cda2
added normalization step to the doi
2021-06-29 18:51:11 +02:00
Miriam Baglioni
06074ea7d3
added normalization step to the doi
2021-06-29 18:46:08 +02:00
Miriam Baglioni
8b8ffe82dc
added step of normalization for the doi
2021-06-29 18:41:39 +02:00
Miriam Baglioni
50cc21d92e
Added method to normalize doi values (lower case, remove all preceding 10., filter out dois not starting with 10.)
2021-06-29 18:35:28 +02:00
Antonis Lempesis
87f14a3899
added the missing indicators files
2021-06-29 16:31:51 +03:00
Sandro La Bruzzo
db933ebd21
Merge remote-tracking branch 'origin/stable_ids' into stable_id_scholexplorer
2021-06-29 14:16:12 +02:00
Sandro La Bruzzo
7e08655e5f
added relation dates in all scholexplorer Datasources
2021-06-29 12:02:03 +02:00
Sandro La Bruzzo
075055eaca
added relation dates in bio mapping
2021-06-29 10:33:09 +02:00
Sandro La Bruzzo
f36f92287d
implemented mapping from Crossref Event Data to Oaf
2021-06-29 10:21:23 +02:00
Antonis Lempesis
018c4eb52c
copied latest changes from old fork: indicators+monitor institutions
2021-06-28 23:46:52 +03:00
Sandro La Bruzzo
511ec14c63
implemented mapping from EBI and Scholix Resolved to OAF
2021-06-28 22:04:22 +02:00
Claudio Atzori
af42377d0e
HttpClient used in metadata collection retries on 502, 503, 504
2021-06-28 09:34:30 +02:00
Sandro La Bruzzo
ad50415167
Merge remote-tracking branch 'origin/stable_ids' into stable_id_scholexplorer
2021-06-24 17:20:50 +02:00
Sandro La Bruzzo
80e15cc455
implemented mapping from uniprot, pdb and ebi links
2021-06-24 17:20:00 +02:00
Claudio Atzori
2e8fd2c531
cleanup
2021-06-23 14:38:24 +02:00
Claudio Atzori
4dc9ebf217
[raw_all] fixed unit test
2021-06-23 14:38:07 +02:00
Claudio Atzori
50fc5a64a0
[raw_all] Aggregator graph creation merges claims (updates) with the corresponding entity
2021-06-23 11:49:42 +02:00
Claudio Atzori
5edcc6832a
applying sonarLint suggestions
2021-06-23 09:53:29 +02:00
Sandro La Bruzzo
080a280bea
added pdb to Oaf Transformation
2021-06-21 16:23:59 +02:00
Sandro La Bruzzo
1dc0c59e20
merged fix thai dates from stable_ids
2021-06-21 10:39:46 +02:00
Sandro La Bruzzo
dc66cf615b
Merge branch 'stable_id_scholexplorer' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_id_scholexplorer
2021-06-21 09:38:33 +02:00
Sandro La Bruzzo
507e42102a
added pdb to oaf class
2021-06-21 09:36:40 +02:00
Sandro La Bruzzo
a167543637
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_id_scholexplorer
2021-06-21 09:14:11 +02:00
Sandro La Bruzzo
4fe7b75644
renamed packages
2021-06-18 16:41:24 +02:00
Sandro La Bruzzo
3990165d05
changed typologies of unresolved relation
2021-06-18 11:43:59 +02:00
Miriam Baglioni
180d671127
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
2021-06-18 09:46:18 +02:00
Miriam Baglioni
13c96622c9
-
2021-06-18 09:45:16 +02:00
Miriam Baglioni
b486ae498f
added test and test resource to verify the generation of the date of acceptance from the input extracted from the dump
2021-06-18 09:43:32 +02:00
Miriam Baglioni
464c2ddde3
split the generation of the crossref dataset into two steps
2021-06-18 09:42:31 +02:00
Miriam Baglioni
6aca0d8ebb
added kryo encoding for input files
2021-06-18 09:42:07 +02:00
Miriam Baglioni
3585e53da3
split the generation of the crossref dataset into two steps
2021-06-18 09:41:23 +02:00
Claudio Atzori
41b551562e
applying PR#115 (DatePicker) on stable_ids
2021-06-17 09:33:50 +02:00
Sandro La Bruzzo
3100166d29
Merge remote-tracking branch 'origin/stable_ids' into stable_id_scholexplorer
2021-06-16 16:22:16 +02:00
Claudio Atzori
74833d04f1
Merge branch 'pids_beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into stable_ids
2021-06-16 15:54:18 +02:00
Claudio Atzori
7243a40c88
code formatting
2021-06-16 15:03:03 +02:00
Sandro La Bruzzo
dfcf78cf24
removed wrong code
2021-06-16 14:57:42 +02:00
Sandro La Bruzzo
cc0f2b11fb
Implemented mapping from pubmed baseline to OAF
2021-06-16 14:56:24 +02:00
Miriam Baglioni
95885bcf12
forces executor memory and driver memory to be 7G (trying to avoid OOM)
2021-06-16 10:17:52 +02:00
Miriam Baglioni
2550a73981
-
2021-06-16 10:04:41 +02:00
Miriam Baglioni
1c47c0d786
modified the number of executors trying to avoid OOM exception
2021-06-15 21:05:39 +02:00
Miriam Baglioni
7deac55138
added one option for resume from in the wf
2021-06-15 18:38:20 +02:00
Antonis Lempesis
f7c0b80e35
storing result_instance as parquet
2021-06-15 14:45:48 +03:00
Miriam Baglioni
66e7ef892f
changed the parameter name
2021-06-15 11:08:54 +02:00
Miriam Baglioni
4f47ad0891
no need to rename the folders, just write in overwrite mode, so I changed the name of the output folder
2021-06-15 09:28:31 +02:00
Miriam Baglioni
9f9dd00b94
refactoring
2021-06-15 09:24:46 +02:00
Miriam Baglioni
63d74ee379
refactoring
2021-06-15 09:24:11 +02:00
Miriam Baglioni
6ebc236657
added needed property: outputPath
2021-06-15 09:23:24 +02:00
Miriam Baglioni
f7379255b6
changed the workflow to extract info from the dump
2021-06-15 09:22:54 +02:00
Miriam Baglioni
d6e21bb6ea
creates the crossref dataset used for doiboost together with unpacking part from tar
2021-06-14 17:27:19 +02:00
Miriam Baglioni
4da141bd7c
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
2021-06-14 13:41:02 +02:00
Miriam Baglioni
ce0cfd79e0
creates the crossref dataset used for doiboost
2021-06-14 13:40:19 +02:00
Miriam Baglioni
93efe4de82
split the construction of crossref dataset in two parts. This one just unpacks the tar entries
2021-06-14 13:39:40 +02:00
Michele Artini
ada063ce70
fixed a problem with empty mdstore list (2)
2021-06-14 12:04:47 +02:00
Michele Artini
83132ee99a
fixed a problem with empty mdstore list
2021-06-14 11:57:00 +02:00
Miriam Baglioni
cf360d7c97
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
2021-06-14 10:19:49 +02:00
Miriam Baglioni
8873e6b6d1
workflow and parameter
2021-06-14 10:15:57 +02:00
Miriam Baglioni
0f1acdf6b6
workflow and parameter
2021-06-14 10:08:55 +02:00
Sandro La Bruzzo
aeb8132627
Merged branch stable_ids
2021-06-14 10:07:29 +02:00
Sandro La Bruzzo
efbea1e01a
minor fix
2021-06-14 09:45:14 +02:00
Miriam Baglioni
75780fc636
extraction of the tar for the dump of crossref, and creation of the dataset
2021-06-14 09:45:07 +02:00
Claudio Atzori
2039bb9f5f
orcid / orcid_pending cleaning backported from master branch
2021-06-14 09:40:50 +02:00
Claudio Atzori
dd19c4ac5a
Merge pull request 'import_new_mdstores' ( #112 ) from import_new_mdstores into stable_ids
Reviewed-on: #112
2021-06-14 09:23:55 +02:00
Claudio Atzori
e9e86a237d
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
2021-06-11 17:00:02 +02:00
Claudio Atzori
a900bfb874
delegating the date parsing to https://github.com/sisyphsu/dateparser
2021-06-11 16:53:01 +02:00
Sandro La Bruzzo
dd997c49e0
fix wrong relation id
fix date thai ticket #6791
2021-06-10 14:47:18 +02:00
Antonis Lempesis
d413b24611
added instances, orgs for monitor, totalcost for projects, apcs
2021-06-10 02:35:46 +03:00
Claudio Atzori
741077dbca
Merge pull request 'Fix in Affiliation Propagation' ( #113 ) from miriam.baglioni/dnet-hadoop:master into stable_ids
Reviewed-on: #113
2021-06-09 18:42:42 +02:00
Miriam Baglioni
32b0c27217
Update 'dhp-workflows/dhp-enrichment/src/main/java/eu/dnetlib/dhp/resulttoorganizationfrominstrepo/PrepareResultInstRepoAssociation.java'
fix in SQL query: the blacklist constraint used d.id to indicate the datasource id, but no alias for the datasource was defined, so I removed the alias
2021-06-09 18:36:11 +02:00
Sandro La Bruzzo
0d1f37302f
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_id_scholexplorer
2021-06-09 09:35:16 +02:00
Miriam Baglioni
dc07f1079b
added check in case the author set to be enriched is null
2021-06-08 12:06:10 +02:00
Miriam Baglioni
8d2e086e48
changes to avoid reassignment to val
2021-06-07 17:50:37 +02:00
Miriam Baglioni
f33521d338
Update 'dhp-workflows/dhp-doiboost/src/main/java/eu/dnetlib/doiboost/orcid/SparkConvertORCIDToOAF.scala'
to be able to replace the object assigned to author, val has been replaced by var
2021-06-07 17:27:07 +02:00
Miriam Baglioni
bc12e9819e
Update 'dhp-workflows/dhp-doiboost/src/main/java/eu/dnetlib/doiboost/orcid/SparkConvertORCIDToOAF.scala'
The change fixes the issue that arises when the same work appears more than once on the same ORCID profile. It avoids replicating the doi -> author association when the orcid id is already associated with the doi.
2021-06-07 16:37:01 +02:00
Sandro La Bruzzo
0cdb7ccdaa
added inverse relations to datacite mapping
2021-06-04 15:10:20 +02:00
Sandro La Bruzzo
5b724d9972
added relations to datacite mapping
2021-06-04 10:14:22 +02:00
Sandro La Bruzzo
e57294ac99
implemented changes on PUBMed dataflow
2021-06-03 10:52:09 +02:00
Michele Artini
ede2749822
orcid pid type
2021-06-01 12:42:43 +02:00
Michele Artini
f0fbfdcfae
Merge branch 'stable_ids' into import_new_mdstores
2021-06-01 12:03:00 +02:00
Michele Artini
e950750262
add nodes to import hdfs mdstores
2021-06-01 10:48:50 +02:00
Michele Artini
03a510859a
removed coalesce(1)
2021-05-31 14:10:51 +02:00
Michele Artini
e9f2b6037c
patch of mdstore records
2021-05-31 11:36:26 +02:00
Sandro La Bruzzo
02ef46535f
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
2021-05-31 09:50:15 +02:00
Sandro La Bruzzo
aeadc5a366
updated wf Datacite Import to retrieve the block size as parameter
2021-05-31 09:49:53 +02:00
Claudio Atzori
96238152cb
added serialization for alternateIdentifiers and pids within each record instance
2021-05-28 16:57:30 +02:00
Michele Artini
ad56a44fda
save as gzipped sequence file
2021-05-28 14:45:39 +02:00
Claudio Atzori
83722ebc47
pull #111 replied on stable_ids
2021-05-28 14:11:46 +02:00
Claudio Atzori
6e3a4e9237
updated test expectations
2021-05-28 09:37:50 +02:00
Michele Artini
4fa5671d16
first implementation of Hdfs Mdstores Importer
2021-05-27 16:22:07 +02:00
Claudio Atzori
d512062b58
integrating pull #109 , H2020Classification
2021-05-27 12:22:47 +02:00
Claudio Atzori
5e4b91d9ef
more pervasive use of constants from ModelConstants, especially for ORCID
2021-05-26 18:20:23 +02:00
Sandro La Bruzzo
bced804151
updated wf Datacite Import to retrieve the block size as parameter
2021-05-26 17:06:50 +02:00
Miriam Baglioni
abd88f663d
changed test resource to mirror change in the input file
2021-05-21 15:20:47 +02:00
Miriam Baglioni
c844877de2
changed the workflow so that the programme and project preparation steps can also run in parallel
2021-05-21 14:41:57 +02:00
Miriam Baglioni
073d76864d
refactoring
2021-05-21 14:41:03 +02:00
Miriam Baglioni
4c8b4a774c
removed unneeded code
2021-05-21 14:40:07 +02:00
Miriam Baglioni
53b9d87fec
new prepareProgramme according to the new file
2021-05-21 11:49:31 +02:00
Miriam Baglioni
1ee8f13580
refactoring and added "left" as join type to be 100% sure to get the whole set of projects
2021-05-21 11:49:05 +02:00
Miriam Baglioni
e07c3ba089
due to a change in the input file the filtering step is no longer needed
2021-05-21 11:47:43 +02:00
Miriam Baglioni
54f6e2f693
changed to get the needed information to build the action set as parallel jobs
2021-05-21 11:47:00 +02:00
Miriam Baglioni
7180505519
removed unneeded variable
2021-05-21 11:46:13 +02:00
Miriam Baglioni
2eb1a8b344
changed because the input file changed
2021-05-21 11:40:20 +02:00
Claudio Atzori
9d725efdc1
reverted implementation of the mdstore client
2021-05-20 18:26:09 +02:00
Miriam Baglioni
9610224671
added param to workflow property
2021-05-20 18:21:12 +02:00
Claudio Atzori
863b56b6ce
using constants from ModelConstants
2021-05-20 16:23:58 +02:00
Claudio Atzori
ae5c28e54f
code formatting
2021-05-20 16:13:06 +02:00
Miriam Baglioni
aa45b4df9b
-
2021-05-20 15:57:40 +02:00
Miriam Baglioni
052c837843
-
2021-05-20 15:54:44 +02:00
Claudio Atzori
b695932ae4
integrated pull#108
2021-05-20 15:34:04 +02:00
Claudio Atzori
b572f56763
Merge branch 'master' into master
2021-05-20 15:22:35 +02:00
Claudio Atzori
2578b7fbb3
code formatting
2021-05-20 14:59:02 +02:00
Miriam Baglioni
dc0ad8d2e0
fixed issue related to the change in the name of the downloaded file. Added the sheet name as a parameter and also a check in case the name changes
2021-05-20 14:53:53 +02:00
Claudio Atzori
232dce83db
fixes #6701 : xpath for titles to support both datacite and Guidelines v4 mapping
2021-05-20 14:41:15 +02:00
Claudio Atzori
aef2977ad0
fixes #6701 : xpath for titles to support both datacite and Guidelines v4 mapping
2021-05-20 14:40:22 +02:00
Miriam Baglioni
02b80cf24f
resolved conflicts
2021-05-20 10:59:39 +02:00
Claudio Atzori
c4a23c2f4d
fix: preserving the old identifier among the originalIds in the doiboost construction process, trying to avoid UnsupportedOperationException while adding elements to the originalIds
2021-05-19 16:01:52 +02:00
Claudio Atzori
ba03f549d7
fix: preserving the old identifier among the originalIds in the doiboost construction process
2021-05-19 15:43:26 +02:00
Claudio Atzori
239d0f0a9a
ROR actionset import workflow backported from branch stable_ids
2021-05-18 16:12:11 +02:00
Antonis Lempesis
168edcbde3
added the final steps for the observatory promote wf and some cleanup
2021-05-18 15:23:20 +03:00
Michele Artini
e56ccec536
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
2021-05-18 14:00:28 +02:00
Michele Artini
c1e20de7cf
fixed the deserialization of a json property
2021-05-18 14:00:14 +02:00
Claudio Atzori
a9f512103b
using constants from ModelConstants
2021-05-18 11:19:07 +02:00
Claudio Atzori
eeb8bcf075
using constants from ModelConstants
2021-05-18 11:10:07 +02:00
Claudio Atzori
2cbf15f4fb
using ModelConstants
2021-05-17 09:54:45 +02:00
Claudio Atzori
f19feceaf0
set the old identifier before switching to the new one
2021-05-14 12:53:40 +02:00
Claudio Atzori
1bd70fa2c6
preserving the old identifier among the originalIds in the doiboost construction process
2021-05-14 11:30:41 +02:00
Claudio Atzori
ca3f3a7687
using ModelConstants
2021-05-14 11:29:49 +02:00
Claudio Atzori
23b8883ab1
applied intellij code cleanup
2021-05-14 10:58:12 +02:00
Claudio Atzori
609eb711b3
IndexRecordTransformerTest for producing a record that can be manually submitted to solr
2021-05-13 16:13:28 +02:00
Claudio Atzori
1517bf7c92
IndexRecordTransformerTest for producing a record that can be manually submitted to solr
2021-05-13 16:11:22 +02:00
Sandro La Bruzzo
d9a0bbda7b
implemented new phase in doiboost to make the dataset Distinct by ID
2021-05-13 12:25:14 +02:00
Sandro La Bruzzo
6424cd9062
Added passing of the following parameters:
-varDataSourceId
-varOfficialName
in Each transformation Rule
2021-05-11 15:17:38 +02:00
Sandro La Bruzzo
073dcea2aa
Added passing of the following parameters:
-varDataSourceId
-varOfficialName
in Each transformation Rule
2021-05-11 15:05:58 +02:00
Claudio Atzori
d4c3476152
mapping datasource.journal only when an issn is available, null otherwise
2021-05-11 11:08:54 +02:00
Claudio Atzori
da9d6f3887
mapping datasource.journal only when an issn is available, null otherwise
2021-05-11 10:45:30 +02:00
Sandro La Bruzzo
54217d73ff
removed old parameters from oozie workflow
2021-05-11 09:59:02 +02:00
Claudio Atzori
d1cbee8413
imported methods from CleaningFunctions, defined in GraphCleaningFunctions
2021-05-10 16:43:39 +02:00
Claudio Atzori
3797543600
MDStoreManager model classes moved in dhp-schemas
2021-05-10 14:32:05 +02:00
Claudio Atzori
25254885b9
[ActionManagement] reduced number of xqueries used to access ActionSet info
2021-05-07 17:32:03 +02:00
Claudio Atzori
8a0de2fc18
[ActionManagement] reduced number of xqueries used to access ActionSet info
2021-05-07 17:31:32 +02:00
Sandro La Bruzzo
7dc824fc23
imported changes in stable_id into master
2021-05-07 12:53:50 +02:00
Michele Artini
d82071ba6c
originalId with prefix
2021-05-06 15:34:48 +02:00
Claudio Atzori
d4a30fabe3
clean up tests
2021-05-05 17:28:15 +02:00
Claudio Atzori
dccaf173cf
fixed mapping applied to ODF records. Added unit test to verify the mapping for OpenTrials
2021-05-05 16:36:15 +02:00
Claudio Atzori
8c96a82a03
fixed mapping applied to ODF records. Added unit test to verify the mapping for OpenTrials
2021-05-05 15:30:06 +02:00
Claudio Atzori
2e1eb96f9a
code formatting
2021-05-05 11:23:57 +02:00
Sandro La Bruzzo
1adfc41d23
merged manually changes on stable_id for doiboost into master
2021-05-05 10:23:32 +02:00
Claudio Atzori
fb930b84d3
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
2021-05-04 18:06:30 +02:00
Claudio Atzori
923d19ea8e
mdstore read lock/unlock when bulk copying records from mongodb to hdfs
2021-05-04 18:06:21 +02:00
Sandro La Bruzzo
714b71bd21
updated pubmed
2021-05-04 14:54:12 +02:00
Claudio Atzori
ba86835951
using common constants from ModelConstants
2021-05-04 11:51:52 +02:00
Michele Artini
f4bd2b5619
revert file SparkDedupTest.java
2021-05-04 10:26:14 +02:00
Michele Artini
b4877da363
Merge branch 'stable_ids' into prepare_ror_actionset
2021-05-03 08:13:55 +02:00
Alessia Bardi
9a20057615
fixed query for organisations' pids
2021-04-29 15:23:39 +02:00
Michele Artini
6692128234
Merge branch 'stable_ids' into prepare_ror_actionset
2021-04-29 13:24:08 +02:00
Alessia Bardi
a801999e75
fixed query for organisations' pids
2021-04-29 12:18:42 +02:00
Michele Artini
a278d67175
parse input file
2021-04-29 11:34:47 +02:00
Claudio Atzori
f6ccd54d87
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
2021-04-29 10:10:01 +02:00
Claudio Atzori
91e7220f20
cleaned up workflow for actionset migration, adjusted dnet|cnr* dependency versions
2021-04-29 10:09:52 +02:00
Michele Artini
f77ba34126
pid types
2021-04-29 09:50:05 +02:00
Michele Artini
7c5cd86927
annotations and tests
2021-04-29 09:29:19 +02:00
Michele Artini
b5cf505cc6
partial implementation of the ROR->actionset workflow
2021-04-28 16:00:24 +02:00
Enrico Ottonello
c537986b7c
deleted folders with merged data immediately before merge phases
2021-04-28 11:25:25 +02:00
Sandro La Bruzzo
2129e9caa7
updated pangaea transformation to parse directly the xml
2021-04-28 10:21:03 +02:00
Claudio Atzori
5afa7d3e0c
core utilities in dhp-common moved in external module dhp-schemas
2021-04-27 15:44:01 +02:00
Alessia Bardi
e6075bb917
updated json schema for results - added instances and accessright definition
2021-04-27 15:15:08 +02:00
Sandro La Bruzzo
63c0303137
removed unused import, add log
2021-04-27 12:17:23 +02:00
Sandro La Bruzzo
74484d2823
bug fixing
2021-04-27 12:13:44 +02:00
Sandro La Bruzzo
c74b03d59c
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
2021-04-27 11:31:07 +02:00
Sandro La Bruzzo
7f8848ecdd
added first implementation of Pangaea Mapping
2021-04-27 11:30:37 +02:00
Claudio Atzori
27ab8a704d
adjusted poms to align with the external dhp-schema module
2021-04-27 10:12:27 +02:00
Claudio Atzori
a7cf449b36
cleanup
2021-04-27 10:11:26 +02:00
Claudio Atzori
fa42026590
fixed PersonCleaner extension functions
2021-04-27 10:10:06 +02:00
Claudio Atzori
ef4bfd82e2
code formatting
2021-04-27 10:09:31 +02:00
Claudio Atzori
faa8f6f4e2
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
2021-04-27 09:57:03 +02:00
miconis
6d5c14e030
assertions updated in entity merger test
2021-04-27 09:47:49 +02:00
Claudio Atzori
c2bb03c8b5
depending on external dhp-schemas module
2021-04-23 17:57:35 +02:00
Claudio Atzori
7ed107be53
depending on external dhp-schemas module
2021-04-23 17:52:36 +02:00
Claudio Atzori
c25238480c
making ODF record parsing namespace unaware ( #6629 )
2021-04-23 17:34:57 +02:00
Claudio Atzori
99cfb027fa
making ODF record parsing namespace unaware ( #6629 )
2021-04-23 17:09:36 +02:00
Miriam Baglioni
72e5aa3b42
refactoring
2021-04-23 12:10:30 +02:00
Miriam Baglioni
7d1b8b7f64
merge upstream
2021-04-23 11:55:49 +02:00
miconis
d0e3366c34
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
2021-04-22 11:45:19 +02:00
miconis
3c12eeadce
bug fix in propagation of relations
2021-04-22 11:44:33 +02:00
Claudio Atzori
e5abbec2ba
[orcid] download of the lambda file defined in a script
2021-04-22 11:22:10 +02:00
Claudio Atzori
55964cbd81
[orcid] large oozie workflow cleanup; updated workflow for the orcidnodoi actionset creation
2021-04-22 10:18:09 +02:00
Claudio Atzori
8f309b72ff
[dedup] using node names consistently across the workflow
2021-04-21 17:54:51 +02:00
Claudio Atzori
52244f813a
merging from enrico.ottonello/dnet-hadoop:orcid-no-doi
2021-04-21 12:24:09 +02:00
Sandro La Bruzzo
fd29307b84
updated workflow name
2021-04-21 09:21:41 +02:00
Claudio Atzori
815b9f4d56
[openorgs dedup] fixed workflow parameter declarations. Introduced support for resuming the execution from intermediate steps
2021-04-20 17:24:45 +02:00
Claudio Atzori
d0d477cca3
code formatting
2021-04-20 12:50:34 +02:00
miconis
0393cdce42
addition of alternative names in export queries
2021-04-20 12:45:21 +02:00
miconis
cadd0a5de8
modification of the queries for openorgs: they now also consider pending orgs
2021-04-20 12:06:56 +02:00
Sandro La Bruzzo
e06c7f32f6
updated id figshare as described in #6377
2021-04-20 10:18:07 +02:00
Sandro La Bruzzo
dbe0d0378e
resolved ticket #6377
2021-04-20 09:44:44 +02:00
Antonis Lempesis
625d993cd9
added step for observatory db
2021-04-20 02:31:06 +03:00
Antonis Lempesis
25d0512fbd
code cleanup
2021-04-20 01:43:23 +03:00
Sandro La Bruzzo
524e5f3092
Improved parallelization on transformation wf on hadoop
2021-04-19 15:17:25 +02:00
Sandro La Bruzzo
cdfe01bbae
improved parallelization on transformation job
2021-04-19 15:14:52 +02:00