Commit Graph

195 Commits

Author SHA1 Message Date
Sandro La Bruzzo b3f5c2351d Merge branch 'hadoop_aggregator' of code-repo.d4science.org:D-Net/dnet-hadoop into hadoop_aggregator
 Conflicts:
	dhp-workflows/dhp-aggregation/src/test/java/eu/dnetlib/dhp/transformation/TransformationJobTest.java
2021-02-12 16:37:14 +01:00
Sandro La Bruzzo f216277219 Implemented cleaning date 2021-02-12 16:34:52 +01:00
Andreas Czerniak 5a9017cf18 clone, min. changes, test, run 2021-02-12 14:32:36 +01:00
Claudio Atzori aa55dedb8a Merge branch 'hadoop_aggregator' of https://code-repo.d4science.org/D-Net/dnet-hadoop into hadoop_aggregator 2021-02-12 12:31:05 +01:00
Claudio Atzori 29c6f7e255 classes related to the collection workflow moved into common package; implemented MongoDB collection plugins 2021-02-12 12:31:02 +01:00
Sandro La Bruzzo 17e6f1934e fixed NPE on cleaner 2021-02-12 11:48:11 +01:00
Claudio Atzori 50add4c61b added requestDelay to HttpConnector2 configuration; Aggregation workflow constants moved in dhp-common 2021-02-08 12:19:38 +01:00
Claudio Atzori 40df0f987d better logging, WIP: collectorWorker error reporting; common functions moved in DHPUtils 2021-02-06 20:12:00 +01:00
Claudio Atzori a8a758925e better logging, WIP: collectorWorker error reporting 2021-02-05 19:18:05 +01:00
Claudio Atzori 730973679a Merge branch 'hadoop_aggregator' of https://code-repo.d4science.org/D-Net/dnet-hadoop into hadoop_aggregator 2021-02-04 17:25:00 +01:00
Claudio Atzori deb85706db imported HttpConnector from https://svn.driver.research-infrastructures.eu/driver/dnet45/modules/dnet-modular-collector-service/trunk/src/main/java/eu/dnetlib/data/collector/plugins/HttpConnector.java as HttpConnector2 2021-02-04 17:24:52 +01:00
Sandro La Bruzzo 4dae5e605d implemented messaging between collection worker and Dnet 2021-02-04 15:51:15 +01:00
Claudio Atzori 40764cf626 better logging, WIP: collectorWorker error reporting 2021-02-04 14:06:02 +01:00
Sandro La Bruzzo 69c253710b fixed test 2021-02-04 10:30:49 +01:00
Claudio Atzori 0e8a4f9f1a better logging, WIP: collectorWorker error reporting 2021-02-03 12:33:41 +01:00
Claudio Atzori bb89b99b24 code formatting 2021-02-02 12:34:14 +01:00
Claudio Atzori 75807ea5ae factored out constants 2021-02-02 12:28:21 +01:00
Sandro La Bruzzo 0634674add implemented transformation test 2021-02-02 12:12:14 +01:00
Claudio Atzori 8eaa1fd4b4 WIP: metadata collection in INCREMENTAL mode and relative test 2021-02-01 19:29:10 +01:00
Sandro La Bruzzo 6ff234d81b Implemented a first prototype of incremental harvesting and transformation using readlock 2021-02-01 13:56:05 +01:00
Sandro La Bruzzo 8ee82576c6 Collection on Refresh WORKS!!! 2021-01-29 17:02:46 +01:00
Sandro La Bruzzo 0276180039 WIP mdstore
transaction implemented on hadoop side
2021-01-29 16:42:41 +01:00
Sandro La Bruzzo 98b9498b57 Removed old messaging system not quite used from collection and Transformation workflow
code refactor
2021-01-28 09:51:17 +01:00
Sandro La Bruzzo 184e7b3856 Implemented new Transformation using spark 2021-01-27 15:43:08 +01:00
Sandro La Bruzzo ffb092b8d3 removed duplicate code HttpConnector.java 2021-01-25 15:05:37 +01:00
Claudio Atzori 41500669e2 [BIP! Scores integration] merged missing classes from bipFinder branch 2021-01-11 14:39:47 +01:00
Claudio Atzori 03319d3bd9 Revert "Merge pull request 'Creation of the action set to include the bipFinder! score' (#62) from miriam.baglioni/dnet-hadoop:bipFinder into master"
This reverts commit add7e1693b, reversing
changes made to f9a8fd8bbd.
2020-12-17 12:23:58 +01:00
Miriam Baglioni 5b3ed70808 refactoring 2020-12-01 14:31:34 +01:00
Miriam Baglioni 0051ebede5 extending test 2020-12-01 12:43:03 +01:00
Miriam Baglioni 719da15f04 added test resources 2020-12-01 12:42:30 +01:00
Miriam Baglioni db36e11912 classes test classes and resources for production of the actionset to include bipFinder score in results 2020-11-30 20:14:23 +01:00
Sandro La Bruzzo 66efb39634 implemented merge scholix 2020-11-04 09:04:01 +01:00
Miriam Baglioni 4905739be6 changed resource file to mirror change in business logic 2020-10-30 17:02:57 +01:00
Miriam Baglioni b40360ebfb changed the code to mirror the changed decision in the classification level and programme description labels 2020-10-30 17:02:30 +01:00
Miriam Baglioni 696409fb9f disabled tests because needing remote resource 2020-10-30 17:01:48 +01:00
Miriam Baglioni a2ce527fae changed to match the requirements for short titles in level and long titles in classification 2020-10-20 17:03:25 +02:00
Miriam Baglioni 61946b4092 refactoring 2020-10-01 16:22:48 +02:00
Miriam Baglioni 7e6d35e56c added the link to the excel file related to topic 2020-10-01 15:53:31 +02:00
Miriam Baglioni 632351c0da modified test resources to mirror the changes in the code 2020-10-01 15:43:02 +02:00
Miriam Baglioni ebc1c5513f modified test resources to mirror the changes in the code 2020-10-01 15:42:29 +02:00
Miriam Baglioni 83ea746163 added check to the test 2020-10-01 15:40:28 +02:00
Miriam Baglioni a46179f61c refactoring 2020-10-01 11:22:01 +02:00
Miriam Baglioni c107f193c9 refactoring 2020-10-01 11:16:22 +02:00
Miriam Baglioni 706a80a29a added test to check that separator '-' (not hyphen) will be recognized 2020-10-01 10:38:31 +02:00
Miriam Baglioni 3dca586b3b refactoring 2020-10-01 10:34:48 +02:00
Miriam Baglioni f4739a371a code to get the information related to the topic association between code and description. 2020-09-28 12:02:48 +02:00
Miriam Baglioni 12c2dfc268 modified the resource to consider the information added to the model 2020-09-25 14:17:23 +02:00
Miriam Baglioni 969fa8d96e fixed issue and changed the transformation of the programme file to consider the new model 2020-09-25 13:32:34 +02:00
Miriam Baglioni e917281822 - 2020-09-24 15:24:05 +02:00
Miriam Baglioni 1d84cf19a6 added new line to resource file 2020-09-23 17:32:22 +02:00
Miriam Baglioni f0c476b6c9 modification to the test classes to consider h2020classification 2020-09-23 17:31:49 +02:00
Miriam Baglioni 1069cf243a modification to the schema to consider the H2020classification of the programme. The field Programme has been moved inside the H2020classification that is now associated to the Project. Programme is no longer associated directly to the Project but via H2020CLassification 2020-09-22 14:38:00 +02:00
Claudio Atzori 306669209f code formatting 2020-06-16 16:54:44 +02:00
Claudio Atzori 603b1bd0bb Merge branch 'master' into dhp_oaf_model 2020-06-16 15:43:59 +02:00
Claudio Atzori a2fdf85ba1 WIP: graph cleaner implementation 2020-06-09 19:52:53 +02:00
Miriam Baglioni dfa4997a4f removed commented code 2020-05-29 10:45:18 +02:00
Miriam Baglioni 8b6e886fb6 added new resource for testing 2020-05-28 23:54:31 +02:00
Miriam Baglioni 6989fb9c8a changed the project test according to the newly introduced join with the db project codes 2020-05-28 23:53:24 +02:00
Miriam Baglioni 01f7876595 fix issue with flatMap - the return type must not be null 2020-05-28 23:50:32 +02:00
Miriam Baglioni 35b7279147 changed test because data are saved as SequenceFile now, and because of the group by the number of produced updates decreases 2020-05-28 10:26:12 +02:00
Miriam Baglioni ac8025f469 - 2020-05-22 15:29:41 +02:00
Miriam Baglioni 50ad83b97f - 2020-05-22 15:27:19 +02:00
Miriam Baglioni 055eec5a77 added resource for prepare project test 2020-05-20 13:54:10 +02:00
Miriam Baglioni 9079bc1f61 - 2020-05-20 13:53:32 +02:00
Miriam Baglioni 67ba4fde57 added test for prepare projects step 2020-05-20 13:53:08 +02:00
Miriam Baglioni 3c0eb12d3e removed the not zipped files 2020-05-20 10:31:05 +02:00
Miriam Baglioni c0d9e02340 zipped test resources that are too big 2020-05-20 10:30:25 +02:00
Miriam Baglioni 5e9c9fa87c tests 2020-05-20 10:29:57 +02:00
Miriam Baglioni faed7521bf added resources for testing 2020-05-20 10:29:29 +02:00
Miriam Baglioni 457293ccc0 test for the various steps of project update with programme 2020-05-19 18:43:42 +02:00
Miriam Baglioni 9447d78ef3 added preparation classes 2020-05-19 18:42:50 +02:00
Claudio Atzori 0825321d0b improved unit tests in dhp-aggregation 2020-05-05 12:39:04 +02:00
Claudio Atzori 439c6255a2 cleanup 2020-04-29 19:09:07 +02:00
Claudio Atzori 6f5b899038 reformatted code according to the updated style descriptor 2020-04-28 11:23:29 +02:00
Claudio Atzori a0bdbacdae switched automatic code formatting plugin to net.revelc.code.formatter:formatter-maven-plugin 2020-04-27 14:52:31 +02:00
Claudio Atzori 7a3f8085f7 switched automatic code formatting plugin to net.revelc.code.formatter:formatter-maven-plugin 2020-04-27 14:45:40 +02:00
Claudio Atzori ad7a131b18 introduced common project code formatting plugin, works on the commit hook, based on https://github.com/Cosium/git-code-format-maven-plugin, applied to each java class in the project 2020-04-18 12:42:58 +02:00
Claudio Atzori 6b5f9ca9cb raw graph creation workflow moved under dhp-graph-mapper, claims integration is included 2020-04-10 17:53:07 +02:00
Michele Artini f6e86b44a6 tests 2020-03-27 11:46:37 +01:00
Claudio Atzori c0e825e713 dhp-aggregation workflow tests upgraded to junit5 2020-03-25 17:59:45 +01:00
Michele Artini ebe45003d9 fixed some junit packages 2020-03-25 16:45:03 +01:00
Michele Artini 2559299da4 tests 2020-03-25 12:25:00 +01:00
Michele Artini 0fda2c3a30 some tests on db records 2020-03-25 09:43:58 +01:00
Michele Artini b6efa9d6ab Configuration of the SequenceFile Writer 2020-03-05 15:49:14 +01:00
Sandro La Bruzzo abd9034da0 implemented DedupRecord factory with the merge of publications 2019-12-11 15:43:24 +01:00
miconis 4b66b471a4 implementation of the sorting by trust mechanism and the merge of oaf entities 2019-12-10 14:57:16 +01:00
Sandro La Bruzzo cc63706347 Implemented deduplication on spark 2019-12-06 13:38:00 +01:00
Claudio Atzori 1e7a2ac41d align parameter names, graph import procedure WIP 2019-11-04 17:41:01 +01:00
Claudio Atzori c8bb81cd9a align dependencies with IIS cluster 2019-10-29 18:10:20 +01:00
Sandro La Bruzzo 5a8a323f2a dhp-collection-worker integrated in dhp-workflows 2019-10-24 11:36:59 +02:00
Sandro La Bruzzo bbb87d0e3d implemented saxonHE on transformation spark job 2019-10-10 11:33:51 +02:00
Sandro La Bruzzo 403c13eebf Implemented message manager, Fixed bug on collection worker, implemented Collection and Transform spark job 2019-04-11 15:39:29 +02:00
Sandro La Bruzzo ded6aef5e1 moved collector worker 2019-04-03 16:05:16 +02:00
Sandro La Bruzzo 6156562893 Added test 2019-03-18 10:47:28 +01:00
Sandro La Bruzzo e67d9ee1a9 added first implementation of dnet-workflows 2019-03-18 10:44:35 +01:00