Claudio Atzori
ef612105ba
[broker] added coalesce(1) on the stats dataset before storing it on postgres
2021-07-09 15:47:06 +02:00
Miriam Baglioni
c64c0a0743
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
2021-06-29 17:59:40 +02:00
Miriam Baglioni
8abdd9bad2
added step of normalization for the doi
2021-06-29 17:59:37 +02:00
Miriam Baglioni
e1cd2e406e
added class to test the normalization of dois
2021-06-29 17:55:03 +02:00
Miriam Baglioni
29828bc273
added resource to test the normalization of doi during the import of MAG
2021-06-29 17:54:14 +02:00
Miriam Baglioni
0f402a44fb
slight modification of the resource to also accommodate doi normalization tests
2021-06-29 17:53:49 +02:00
Miriam Baglioni
2aa565ee6c
added close command for the SparkContext
2021-06-29 17:53:10 +02:00
Miriam Baglioni
d5d21254a2
added tests for the normalization of the dois
2021-06-29 17:50:07 +02:00
Miriam Baglioni
5e2f330239
added tests for the normalization of the dois
2021-06-29 17:49:39 +02:00
Miriam Baglioni
8320ad2248
added tests for the normalization of the dois
2021-06-29 17:49:11 +02:00
Miriam Baglioni
011e629df5
there is no longer any need to lowerCase the doi since the normalized doi is saved in the first phase
2021-06-29 17:48:42 +02:00
Miriam Baglioni
22ae8a81c2
added the normalization step to the doi
2021-06-29 17:45:24 +02:00
Miriam Baglioni
779015e4a9
added the normalization step to the doi from crossref
2021-06-29 14:56:58 +02:00
Miriam Baglioni
3dd5701948
added the normalization step to the doi from crossref
2021-06-29 12:10:27 +02:00
Miriam Baglioni
a5c1c0e90a
added the normalization step to the doi from orcid before returning it
2021-06-29 12:03:54 +02:00
Miriam Baglioni
dc5ed6f563
Added method to normalize doi values (lower case, remove everything preceding 10., filter out dois not starting with 10.)
2021-06-29 12:03:13 +02:00
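The three normalization steps named in the commit message above (lower case, strip whatever precedes "10.", reject values without a "10." prefix) can be sketched as follows. This is a minimal illustration only, not the code from the commit: the class name DoiNormalizer, the method normalize, the trimming of whitespace, and the use of Optional for rejected values are all assumptions made for the example.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, written only to illustrate the commit message;
// it is not the class that the commit added.
public class DoiNormalizer {

    // Assumption: a usable DOI must contain, and after cleanup start with, "10."
    private static final String DOI_PREFIX = "10.";

    /**
     * Returns the normalized DOI, or Optional.empty() when the input
     * is null or contains no "10." at all.
     */
    public static Optional<String> normalize(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        // Step 1: lower-case (surrounding whitespace is trimmed as well).
        String lowered = raw.trim().toLowerCase(Locale.ROOT);
        // Step 2: locate the first "10." so that everything before it can be dropped.
        int start = lowered.indexOf(DOI_PREFIX);
        if (start < 0) {
            // Step 3: no "10." anywhere, so the value is filtered out.
            return Optional.empty();
        }
        return Optional.of(lowered.substring(start));
    }

    public static void main(String[] args) {
        System.out.println(normalize("https://doi.org/10.1000/ABC.123")); // Optional[10.1000/abc.123]
        System.out.println(normalize("not-a-doi"));                       // Optional.empty
    }
}
```

One limitation of this sketch: it cuts at the first occurrence of "10." anywhere in the string, so a resolver URL that itself contains "10." before the DOI would be truncated at the wrong position.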
Miriam Baglioni
7e13817262
Update 'dhp-workflows/dhp-enrichment/src/main/java/eu/dnetlib/dhp/resulttoorganizationfrominstrepo/PrepareResultInstRepoAssociation.java'
...
fixed a problem in sql query
2021-06-24 15:32:21 +02:00
Miriam Baglioni
59c36eb185
check if pid is null (to avoid NullPointerException)
2021-06-21 10:41:47 +02:00
Claudio Atzori
6b8c357381
removed extra whitespace at the end of the file
2021-06-18 16:08:45 +02:00
Claudio Atzori
c0d2b62e46
[doiboost] added missing implicit Encoder
2021-06-18 15:57:41 +02:00
Claudio Atzori
a3948c1f6e
cleanup old doiboost workflows
2021-06-18 15:14:08 +02:00
Claudio Atzori
fddbc8364e
Merge branch 'alessia.bardi-datepicker'
2021-06-17 09:24:46 +02:00
Alessia Bardi
6208b04f1d
smarter DatePicker for ISO dates on dateofacceptance
2021-06-16 14:56:26 +02:00
Sandro La Bruzzo
9ca438d9b1
imported generation of the datacite Actionset from branch stable_ids
2021-06-10 14:59:45 +02:00
Sandro La Bruzzo
42ff7a5665
some fixes to the pom to compile scala
2021-06-10 14:31:06 +02:00
Sandro La Bruzzo
ebe6aa6d38
implemented datacite transformation also on master
2021-06-10 10:52:36 +02:00
Claudio Atzori
a4cfabdbc6
Merge pull request 'master' (#111) from antonis.lempesis/dnet-hadoop:master into master
...
Reviewed-on: D-Net/dnet-hadoop#111
2021-05-28 14:09:12 +02:00
Claudio Atzori
338327171d
integrating pull #109, H2020Classification
2021-05-27 11:57:01 +02:00
Claudio Atzori
6cbda49112
more pervasive use of constants from ModelConstants, especially for ORCID
2021-05-26 18:13:04 +02:00
Miriam Baglioni
abd88f663d
changed test resource to mirror change in the input file
2021-05-21 15:20:47 +02:00
Miriam Baglioni
c844877de2
changed the workflow so that the programme and project preparation steps can also run in parallel
2021-05-21 14:41:57 +02:00
Miriam Baglioni
073d76864d
refactoring
2021-05-21 14:41:03 +02:00
Miriam Baglioni
4c8b4a774c
removed unneeded code
2021-05-21 14:40:07 +02:00
Miriam Baglioni
53b9d87fec
new prepareProgramme according to the new file
2021-05-21 11:49:31 +02:00
Miriam Baglioni
1ee8f13580
refactoring and added "left" as join type to be 100% sure to get the whole set of projects
2021-05-21 11:49:05 +02:00
Miriam Baglioni
e07c3ba089
due to a change in the input file, the filtering step is no longer needed
2021-05-21 11:47:43 +02:00
Miriam Baglioni
54f6e2f693
changed to get the needed information to build the action set as parallel jobs
2021-05-21 11:47:00 +02:00
Miriam Baglioni
7180505519
removed unneeded variable
2021-05-21 11:46:13 +02:00
Miriam Baglioni
2eb1a8b344
changed because the input file changed
2021-05-21 11:40:20 +02:00
Miriam Baglioni
9610224671
added param to workflow property
2021-05-20 18:21:12 +02:00
Miriam Baglioni
aa45b4df9b
-
2021-05-20 15:57:40 +02:00
Miriam Baglioni
052c837843
-
2021-05-20 15:54:44 +02:00
Claudio Atzori
ea9b00ce56
adjusted test
2021-05-20 15:31:42 +02:00
Claudio Atzori
2e70aa43f0
Merge pull request 'H2020Classification fix and possibility to add datasources in blacklist for propagation of result to organization' (#108) from miriam.baglioni/dnet-hadoop:master into master
...
Reviewed-on: D-Net/dnet-hadoop#108
The changes look ok, but please drop a comment to describe how the parameters should be changed from the workflow caller for both workflows
* H2020Classification
* propagation of result to organization
2021-05-20 15:25:05 +02:00
Claudio Atzori
b572f56763
Merge branch 'master' into master
2021-05-20 15:22:35 +02:00
Claudio Atzori
2578b7fbb3
code formatting
2021-05-20 14:59:02 +02:00
Miriam Baglioni
dc0ad8d2e0
fixed issue related to the change in the name of the downloaded file. Added sheet name as parameter and also a check for whether the name should change
2021-05-20 14:53:53 +02:00
Claudio Atzori
aef2977ad0
fixes #6701: xpath for titles to support both datacite and Guidelines v4 mapping
2021-05-20 14:40:22 +02:00
Miriam Baglioni
02b80cf24f
resolved conflicts
2021-05-20 10:59:39 +02:00
Claudio Atzori
239d0f0a9a
ROR actionset import workflow backported from branch stable_ids
2021-05-18 16:12:11 +02:00