Commit Graph

2321 Commits

Author SHA1 Message Date
Claudio Atzori ef612105ba [broker] added coalesce(1) on the stats dataset before storing it on postgres 2021-07-09 15:47:06 +02:00
Miriam Baglioni c64c0a0743 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2021-06-29 17:59:40 +02:00
Miriam Baglioni 8abdd9bad2 added step of normalization for the doi 2021-06-29 17:59:37 +02:00
Miriam Baglioni e1cd2e406e added class to test the normalization of dois 2021-06-29 17:55:03 +02:00
Miriam Baglioni 29828bc273 added resource to test the normalization of doi during the import of MAG 2021-06-29 17:54:14 +02:00
Miriam Baglioni 0f402a44fb slight modification of the resource to accommodate also doi normalization tests 2021-06-29 17:53:49 +02:00
Miriam Baglioni 2aa565ee6c added close command for the SparkContext 2021-06-29 17:53:10 +02:00
Miriam Baglioni d5d21254a2 added tests for the normalization of the dois 2021-06-29 17:50:07 +02:00
Miriam Baglioni 5e2f330239 added tests for the normalization of the dois 2021-06-29 17:49:39 +02:00
Miriam Baglioni 8320ad2248 added tests for the normalization of the dois 2021-06-29 17:49:11 +02:00
Miriam Baglioni 011e629df5 there is no more the need to lowerCase the doi since the normalized doi is saved in the first phase 2021-06-29 17:48:42 +02:00
Miriam Baglioni 22ae8a81c2 added the normalization step to the doi 2021-06-29 17:45:24 +02:00
Miriam Baglioni 779015e4a9 added the normalization step to the doi from crossref 2021-06-29 14:56:58 +02:00
Miriam Baglioni 3dd5701948 added the normalization step to the doi from crossref 2021-06-29 12:10:27 +02:00
Miriam Baglioni a5c1c0e90a added the normalization step to the doi from orcid before returning it 2021-06-29 12:03:54 +02:00
Miriam Baglioni dc5ed6f563 Added method to normalize doi values (lower case, remove all preceding 10., filtering out doi not starting with 10.) 2021-06-29 12:03:13 +02:00
Miriam Baglioni 7e13817262 Update 'dhp-workflows/dhp-enrichment/src/main/java/eu/dnetlib/dhp/resulttoorganizationfrominstrepo/PrepareResultInstRepoAssociation.java'
fixed a problem in sql query
2021-06-24 15:32:21 +02:00
Miriam Baglioni 59c36eb185 check if pid is null (to avoid NullPointerException) 2021-06-21 10:41:47 +02:00
Claudio Atzori 6b8c357381 removed extra whitespace at the end of the file 2021-06-18 16:08:45 +02:00
Claudio Atzori c0d2b62e46 [doiboost] added missing implicit Encoder 2021-06-18 15:57:41 +02:00
Claudio Atzori a3948c1f6e cleanup old doiboost workflows 2021-06-18 15:14:08 +02:00
Claudio Atzori fddbc8364e Merge branch 'alessia.bardi-datepicker' 2021-06-17 09:24:46 +02:00
Alessia Bardi 6208b04f1d smarter DatePicker for ISO dates on dateofacceptance 2021-06-16 14:56:26 +02:00
Sandro La Bruzzo 9ca438d9b1 imported from branch stable_ids generation of Actionset datacite 2021-06-10 14:59:45 +02:00
Sandro La Bruzzo 42ff7a5665 some fix to the pom to compile scala 2021-06-10 14:31:06 +02:00
Sandro La Bruzzo ebe6aa6d38 implemented datacite transformation also on master 2021-06-10 10:52:36 +02:00
Claudio Atzori a4cfabdbc6 Merge pull request 'master' (#111) from antonis.lempesis/dnet-hadoop:master into master
Reviewed-on: #111
2021-05-28 14:09:12 +02:00
Claudio Atzori 338327171d integrating pull #109, H2020Classification 2021-05-27 11:57:01 +02:00
Claudio Atzori 6cbda49112 more pervasive use of constants from ModelConstants, especially for ORCID 2021-05-26 18:13:04 +02:00
Miriam Baglioni abd88f663d changed test resource to mirror change in the input file 2021-05-21 15:20:47 +02:00
Miriam Baglioni c844877de2 changed workflow flow to possibly parallelize also the programme and project preparation steps 2021-05-21 14:41:57 +02:00
Miriam Baglioni 073d76864d refactoring 2021-05-21 14:41:03 +02:00
Miriam Baglioni 4c8b4a774c removed not needed code 2021-05-21 14:40:07 +02:00
Miriam Baglioni 53b9d87fec new prepareProgramme according to the new file 2021-05-21 11:49:31 +02:00
Miriam Baglioni 1ee8f13580 refactoring and added "left" as join type to be 100% sure to get the whole set of projects 2021-05-21 11:49:05 +02:00
Miriam Baglioni e07c3ba089 due to change in the input file the filtering step is no more needed 2021-05-21 11:47:43 +02:00
Miriam Baglioni 54f6e2f693 changed to get the needed information to build the action set as parallel jobs 2021-05-21 11:47:00 +02:00
Miriam Baglioni 7180505519 removed non needed variable 2021-05-21 11:46:13 +02:00
Miriam Baglioni 2eb1a8b344 changed because the input file changed 2021-05-21 11:40:20 +02:00
Miriam Baglioni 9610224671 added param to workflow property 2021-05-20 18:21:12 +02:00
Miriam Baglioni aa45b4df9b - 2021-05-20 15:57:40 +02:00
Miriam Baglioni 052c837843 - 2021-05-20 15:54:44 +02:00
Claudio Atzori ea9b00ce56 adjusted test 2021-05-20 15:31:42 +02:00
Claudio Atzori 2e70aa43f0 Merge pull request 'H2020Classification fix and possibility to add datasources in blacklist for propagation of result to organization' (#108) from miriam.baglioni/dnet-hadoop:master into master
Reviewed-on: #108

The changes look ok, but please drop a comment to describe how the parameters should be changed from the workflow caller for both workflows
* H2020Classification
* propagation of result to organization
2021-05-20 15:25:05 +02:00
Claudio Atzori b572f56763 Merge branch 'master' into master 2021-05-20 15:22:35 +02:00
Claudio Atzori 2578b7fbb3 code formatting 2021-05-20 14:59:02 +02:00
Miriam Baglioni dc0ad8d2e0 fixed issue related to change in the file name downloaded. Added sheet name as parameter and also a check if the name should change 2021-05-20 14:53:53 +02:00
Claudio Atzori aef2977ad0 fixes #6701: xpath for titles to support both datacite and Guidelines v4 mapping 2021-05-20 14:40:22 +02:00
Miriam Baglioni 02b80cf24f resolved conflicts 2021-05-20 10:59:39 +02:00
Claudio Atzori 239d0f0a9a ROR actionset import workflow backported from branch stable_ids 2021-05-18 16:12:11 +02:00