Commit Graph

2215 Commits

Author SHA1 Message Date
Claudio Atzori 73d772a4b4 added method to list the known vocabulary names 2021-02-02 10:39:47 +01:00
Claudio Atzori 8eaa1fd4b4 WIP: metadata collection in INCREMENTAL mode and relative test 2021-02-01 19:29:10 +01:00
Sandro La Bruzzo bead34d11a code refactor 2021-02-01 14:58:06 +01:00
Sandro La Bruzzo 6ff234d81b Implemented a first prototype of incremental harvesting and trasformation using readlock 2021-02-01 13:56:05 +01:00
Sandro La Bruzzo b6b835ef49 update transformation Factory to get Transformation Rule by Id and not by Title 2021-02-01 08:49:42 +01:00
Sandro La Bruzzo e423634cb6 RollBack in case of error WORKS!!! 2021-01-29 17:21:42 +01:00
Sandro La Bruzzo 8ee82576c6 Collection on Refresh WORKS!!! 2021-01-29 17:02:46 +01:00
Sandro La Bruzzo 0276180039 WIP mdstore
transaction implemented on hadoop side
2021-01-29 16:42:41 +01:00
Michele Artini d942d0c77d methods toString(), hashCode() and equals() 2021-01-29 13:16:48 +01:00
Sandro La Bruzzo 0f8e2ecce6 Merged Datacite transfrom into this branch 2021-01-29 10:45:07 +01:00
Sandro La Bruzzo 99cf3a8ea4 Merged Datacite transfrom into this branch 2021-01-28 16:34:46 +01:00
Sandro La Bruzzo 2da8bf7429 Merge pull request 'aggregation_on_hadoop' (#91) from sandro.labruzzo/dnet-hadoop:aggregation_on_hadoop into hadoop_aggregator
ok
2021-01-28 10:06:49 +01:00
Sandro La Bruzzo 686e7b507c Merge branch 'hadoop_aggregator' of code-repo.d4science.org:D-Net/dnet-hadoop into aggregation_on_hadoop 2021-01-28 10:02:13 +01:00
Sandro La Bruzzo 98b9498b57 Removed old messaging system not quite used from collection and Transformation workflow
code refactor
2021-01-28 09:51:17 +01:00
Michele Artini 38f2508c87 new fields in mdstore beans 2021-01-28 08:24:45 +01:00
Sandro La Bruzzo 184e7b3856 Implemented new Transformation using spark 2021-01-27 15:43:08 +01:00
Sandro La Bruzzo 150a617bd1 Merge pull request 'aggregation_on_hadoop' (#90) from sandro.labruzzo/dnet-hadoop:aggregation_on_hadoop into hadoop_aggregator
Wonderfull code... You're the Best Sandro
2021-01-26 16:00:47 +01:00
Claudio Atzori f1a852f278 align usage-stats workflow poms with latest snapshot version 2021-01-26 15:42:42 +01:00
Claudio Atzori 9c32119dc2 Merge pull request 'usage-stats-export-wf-v2' (#89) from dimitris.pierrakos/dnet-hadoop:usage-stats-export-wf-v2 into master
Thank you Dimitris!
2021-01-26 15:01:41 +01:00
Claudio Atzori 885e0dd926 [Cleaning] filter authors not providing word characters in the fullname 2021-01-26 09:48:53 +01:00
Claudio Atzori 2890511613 [Cleaning] normalise missing Result.country 2021-01-26 09:41:44 +01:00
Claudio Atzori 4eb9ed35b1 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2021-01-25 18:12:24 +01:00
Claudio Atzori cd379eb5e3 [Cleaning] trying to avoid NPEs, this time by ruling out authors without a defined fullname 2021-01-25 18:11:49 +01:00
Alessia Bardi 505477f36f format code 2021-01-25 18:02:49 +01:00
Alessia Bardi ded6ed8d7d no ',' author, if there are no author in ODF records 2021-01-25 17:57:51 +01:00
Claudio Atzori 3465c8ccee [Cleaning] trying to avoid NPEs 2021-01-25 16:54:53 +01:00
Sandro La Bruzzo a54848a59c Moved Vocabulary stuff to common module 2021-01-25 15:43:04 +01:00
Sandro La Bruzzo ffb092b8d3 removed duplicate code HttpConnector.java 2021-01-25 15:05:37 +01:00
Sandro La Bruzzo cda210a2ca changed documentation since it didn't reflect the current status 2021-01-25 14:17:42 +01:00
Claudio Atzori 07a0ccfc96 [Cleaning] trying to avoid NPEs 2021-01-25 13:36:01 +01:00
Claudio Atzori 646dab7f68 trying to avoid NPEs 2021-01-22 18:24:34 +01:00
Claudio Atzori 34d653de41 [Cleaning] updated cleaning rule for DOIs 2021-01-22 14:16:33 +01:00
Dimitris 3e8d2a6b2d Clean workflows 2021-01-15 16:19:12 +02:00
Michele Artini f667e94a31 Merge pull request 'broker' (#88) from broker into master 2021-01-14 14:48:13 +01:00
Michele Artini cfbcdc95bc fixed a wf param 2021-01-14 14:45:23 +01:00
Michele Artini 69ba3203c0 fixed a conflict 2021-01-14 14:43:25 +01:00
Michele Artini fafb5b2e08 Merge branch 'broker' of code-repo.d4science.org:D-Net/dnet-hadoop into broker 2021-01-14 14:32:42 +01:00
Michele Artini b230d44411 fixed conflict 2021-01-14 14:32:31 +01:00
Michele Artini b9d90e95b8 Added eventId to ShortEventMessage 2021-01-14 14:32:31 +01:00
Michele Artini 64b0b0bfb3 fixed a bug with invalid subject topic 2021-01-14 14:32:31 +01:00
Michele Artini e3e0ab1de1 fixed a problem with join 2021-01-14 14:32:31 +01:00
Michele Artini 26a941315a openaireId 2021-01-14 14:32:31 +01:00
Michele Artini 6f4d1a37f0 ES wf properties 2021-01-14 14:32:31 +01:00
Michele Artini 1391341d06 mkdir of output dir 2021-01-14 14:32:31 +01:00
Michele Artini 3c9cbd19f3 whitelist of topics 2021-01-14 14:32:31 +01:00
Michele Artini 467aa77279 workingDir and outputDir 2021-01-14 14:32:31 +01:00
Michele Artini 10f3f7eca7 workingDir and outputDir 2021-01-14 14:32:31 +01:00
Michele Artini ff41a7b3a4 gzipped output 2021-01-14 14:32:31 +01:00
Michele Artini 223fa660cb fixed conflict 2021-01-14 14:23:44 +01:00
Michele Artini ac91e495fc Added eventId to ShortEventMessage 2021-01-14 13:20:35 +01:00