Commit Graph

1968 Commits

Author SHA1 Message Date
Enrico Ottonello 70cb100647 added updating last orcid dataset folders after completion 2021-03-01 10:17:04 +01:00
Enrico Ottonello bd3b16402b added result typologies 2021-03-01 10:16:02 +01:00
Enrico Ottonello ca1800510a Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi 2021-02-25 18:45:02 +01:00
Enrico Ottonello 53d7023460 dateOfCollection taken from orcid last_update.txt on hdfs; cleaned wf parameters 2021-02-25 18:43:29 +01:00
Enrico Ottonello d43ea88caf aligned orcid result typologies with openaire vocabulary 2021-02-25 15:02:10 +01:00
Enrico Ottonello 975823b968 data from last updated orcid 2021-02-23 15:35:04 +01:00
Alessia Bardi 32e81c2d89 non validated rel has null value in validated field 2021-02-16 11:01:42 +01:00
Michele Artini 83d815d0bc only stats 2021-02-11 10:57:23 +01:00
Michele Artini 8c836bf930 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2021-02-11 10:54:41 +01:00
Michele Artini 8c1600398a added resumeFrom parameter 2021-02-11 10:54:16 +01:00
Claudio Atzori 3f8f78cbfb Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2021-02-11 09:36:10 +01:00
Claudio Atzori b34b5a39ca index field authoridtypevalue mixes up different author id-type value pairs, dropped in favour of orcidtypevalue 2021-02-11 09:36:04 +01:00
Michele Artini 7249cceb53 switch of 2 nodes 2021-02-11 09:27:08 +01:00
Alessia Bardi 986dd969d3 use the proper import for Lists 2021-02-10 12:03:54 +01:00
Alessia Bardi c4d1feca74 mapper test with validated link to project 2021-02-10 11:22:54 +01:00
Alessia Bardi 09fc7e2f78 serialization of validated flag on relationships 2021-02-10 11:22:09 +01:00
Enrico Ottonello ee4ba7298b fix last update read/write from file on hdfs 2021-02-09 23:24:57 +01:00
Claudio Atzori bc458d1b54 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2021-02-09 16:27:30 +01:00
Claudio Atzori 82e6c50f3f updated solr fields (authoridtypevalue, resultsubject, resultresourcetypename) 2021-02-09 16:27:04 +01:00
Claudio Atzori 62bd3c53ee Merge branch 'master' into provision_indexing 2021-02-09 15:46:26 +01:00
Enrico Ottonello c238561001 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi 2021-02-04 10:44:21 +01:00
Enrico Ottonello 465ce39f75 job execution now based on file last_update.txt on hdfs 2021-02-04 10:44:04 +01:00
Alessia Bardi c67329d3ad updated test for EU Open Data portal datasets 2021-02-03 17:06:48 +01:00
Alessia Bardi fd705404a1 tests for EU Open Data portal dataset mapping 2021-02-03 10:28:17 +01:00
Claudio Atzori f1a852f278 align usage-stats workflow poms with latest snapshot version 2021-01-26 15:42:42 +01:00
Claudio Atzori 9c32119dc2 Merge pull request 'usage-stats-export-wf-v2' (#89) from dimitris.pierrakos/dnet-hadoop:usage-stats-export-wf-v2 into master. Thank you Dimitris! 2021-01-26 15:01:41 +01:00
Claudio Atzori 885e0dd926 [Cleaning] filter authors not providing word characters in the fullname 2021-01-26 09:48:53 +01:00
Claudio Atzori 2890511613 [Cleaning] normalise missing Result.country 2021-01-26 09:41:44 +01:00
Claudio Atzori 4eb9ed35b1 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2021-01-25 18:12:24 +01:00
Claudio Atzori cd379eb5e3 [Cleaning] trying to avoid NPEs, this time by ruling out authors without a defined fullname 2021-01-25 18:11:49 +01:00
Alessia Bardi 505477f36f format code 2021-01-25 18:02:49 +01:00
Alessia Bardi ded6ed8d7d no ',' author, if there are no author in ODF records 2021-01-25 17:57:51 +01:00
Claudio Atzori 3465c8ccee [Cleaning] trying to avoid NPEs 2021-01-25 16:54:53 +01:00
Claudio Atzori 07a0ccfc96 [Cleaning] trying to avoid NPEs 2021-01-25 13:36:01 +01:00
Claudio Atzori 34d653de41 [Cleaning] updated cleaning rule for DOIs 2021-01-22 14:16:33 +01:00
Dimitris 3e8d2a6b2d Clean workflows 2021-01-15 16:19:12 +02:00
Michele Artini cfbcdc95bc fixed a wf param 2021-01-14 14:45:23 +01:00
Michele Artini 69ba3203c0 fixed a conflict 2021-01-14 14:43:25 +01:00
Michele Artini b230d44411 fixed conflict 2021-01-14 14:32:31 +01:00
Michele Artini b9d90e95b8 Added eventId to ShortEventMessage 2021-01-14 14:32:31 +01:00
Michele Artini 64b0b0bfb3 fixed a bug with invalid subject topic 2021-01-14 14:32:31 +01:00
Michele Artini e3e0ab1de1 fixed a problem with join 2021-01-14 14:32:31 +01:00
Michele Artini 26a941315a openaireId 2021-01-14 14:32:31 +01:00
Michele Artini 6f4d1a37f0 ES wf properties 2021-01-14 14:32:31 +01:00
Michele Artini 1391341d06 mkdir of output dir 2021-01-14 14:32:31 +01:00
Michele Artini 3c9cbd19f3 whitelist of topics 2021-01-14 14:32:31 +01:00
Michele Artini 467aa77279 workingDir and outputDir 2021-01-14 14:32:31 +01:00
Michele Artini 10f3f7eca7 workingDir and outputDir 2021-01-14 14:32:31 +01:00
Michele Artini ff41a7b3a4 gzipped output 2021-01-14 14:32:31 +01:00
Claudio Atzori 80cf55ef2e [Broker] fixed partitionEventsByOpendoarIds workflow parameter names 2021-01-13 16:24:30 +01:00