Commit Graph

3663 Commits

Author SHA1 Message Date
Sandro La Bruzzo 8920932dd8 Code formatted 2023-02-08 11:34:18 +01:00
Sandro La Bruzzo 0b9819f1ab Code formatted 2023-02-08 10:32:33 +01:00
Sandro La Bruzzo 6c81a161d2 Merge remote-tracking branch 'origin/beta' into 8231-mdstore-synch-improve 2023-02-08 10:29:09 +01:00
dimitrispie 98c34263ed Update step20-createMonitorDB.sql
Add University of Cape Town organization
2023-02-07 08:14:48 +02:00
dimitrispie 973d78a4d6 Update step15_5.sql
Added unpaywalls open access colors
2023-02-02 08:03:54 +02:00
Claudio Atzori d05ca53a14 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-01-31 14:39:53 +01:00
Miriam Baglioni e82e009b46 added missing close tag for XML produced by the xquery to get information for the community from the IS 2023-01-31 10:19:34 +01:00
Miriam Baglioni b254a0375f [Affiliation from institutionalrepo] changed the field to check to verify the datasource type. Now it is in the field jurisdiction 2023-01-26 16:51:20 +01:00
dimitrispie db7d625ba9 Addedd Arts et Métiers ParisTech organization 2023-01-25 12:22:21 +02:00
Claudio Atzori 505867bce9 [bulk tagging] better node naming 2023-01-20 16:13:16 +01:00
Miriam Baglioni ecd398fe51 refactoring 2023-01-20 14:23:45 +01:00
Miriam Baglioni 0a5c6010b0 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-01-13 16:14:46 +01:00
dimitrispie dd70c32ad7 Bug fixes 2023-01-12 17:18:05 +02:00
dimitrispie 51f7ab5864 Bug fixes 2023-01-12 17:15:06 +02:00
dimitrispie 686580a220 - New Monitor DB workflow
- New Organization added
2023-01-12 11:18:03 +02:00
Claudio Atzori 0a58bc7ba7 [broker] prevent NPEs 2023-01-11 14:44:14 +01:00
Claudio Atzori 04cb96001c [broker] d40e20f437 adapted to the beta graph model 2023-01-11 10:10:12 +01:00
Michele Artini 91b845f611 Considering instance pids and alteternative identifiers 2023-01-11 09:58:54 +01:00
Miriam Baglioni 1f367122e4 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-01-11 09:47:44 +01:00
Michele Artini 7b7520850b fixed an invalid char 2023-01-11 09:22:18 +01:00
Miriam Baglioni d6895f0387 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-01-09 17:28:38 +01:00
dimitrispie becb242c17 Monitor DB only Workflow 2023-01-04 16:50:29 +02:00
dimitrispie 592013d5dd Added more steps in decision node 2022-12-23 09:43:16 +02:00
dimitrispie 6449ff4207 1. Added a decision node to enables the workflow to make a selection on the execution path to follow
2. Added new organization
3. Added 5 new tables from Eurostast
2022-12-22 10:18:21 +02:00
Miriam Baglioni 8893389895 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-12-21 12:42:27 +01:00
Sandro La Bruzzo 3c9826f186 updated lines function to it's implementation linesWithSeparators.map(l => l.stripLineEnd) in this way we force scala plugin compiler to consider this pipeline scala code and not java.string.lines() pipeline 2022-12-21 11:21:17 +01:00
Claudio Atzori 6aa91204a5 [orcid propagation] skip empty directories 2022-12-20 14:15:46 +01:00
Miriam Baglioni 6674cccb94 [BulkTag] description of parameters more comprehensive for those who do not implement it 2022-12-16 15:33:20 +01:00
Miriam Baglioni f37113a941 [BulkTag] moving xquery to get community configuration in dedicated file 2022-12-16 15:32:26 +01:00
Miriam Baglioni 8685eaa706 [Clean Country] added test to verify remove of country 2022-12-16 15:31:25 +01:00
Miriam Baglioni dc0ec88a58 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-12-16 13:18:32 +01:00
Miriam Baglioni d791840b82 [Clean Country] added test to verify remove of country: 2022-12-16 13:18:29 +01:00
Claudio Atzori 7b80b24f82 [cleaning] country cleaning must use both PID and AlternateIdentifier fields 2022-12-15 14:49:04 +01:00
Claudio Atzori b8bafab8a0 [cleaning] improved vocabulary based mapping, specialization for the strict vocab cleaning 2022-12-12 14:43:03 +01:00
Sandro La Bruzzo 5e4866d033 implemented synch for single mdstore 2022-12-12 11:29:46 +01:00
Claudio Atzori c18b8048c3 [cleaning] avoid NPE 2022-12-10 11:41:38 +01:00
Claudio Atzori 8b44afe5e5 [cleaning] avoid NPE 2022-12-09 15:44:57 +01:00
Claudio Atzori 389dd25430 [cleaning] avoid NPE 2022-12-08 18:40:48 +01:00
Claudio Atzori 730228d73d [cleaning] align wf parameter names in test 2022-12-08 18:40:22 +01:00
Claudio Atzori 2094fa6db0 [cleaning] align wf parameter names 2022-12-08 17:22:26 +01:00
Miriam Baglioni a485a94956 [Cleaning] fixed parameter name in property file 2022-12-08 16:59:34 +01:00
Miriam Baglioni 3d99b78d94 [Cleaning] fixed error in parameter (workingPath to workingDir) 2022-12-08 10:25:02 +01:00
Claudio Atzori 1b8488976b code formatting 2022-12-07 10:45:38 +01:00
Claudio Atzori cd1b58483e [bulk tag] fixed Community configuration parsing to void NPE 2022-12-07 10:39:00 +01:00
Claudio Atzori 062abfd669 fixed NPE, removed unused stuff 2022-12-06 12:04:00 +01:00
dimitrispie 2a52a42169 Added 4 institutions:
-University of Modena and Reggio Emilia
-Bilkent University
-Saints Cyril and Methodius University of Skopje
-University of Milan
2022-12-06 10:10:21 +02:00
Claudio Atzori 8248da40d9 Merge branch 'beta' into graph_cleaning 2022-12-02 14:49:00 +01:00
Claudio Atzori ddf065756f Merge pull request 'Two organizations are added for monitor' (#258) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #258
2022-12-02 14:45:27 +01:00
Sandro La Bruzzo 5a48a2fb18 implemented synch for single mdstore 2022-12-01 11:34:43 +01:00
Claudio Atzori a38116546d Merge branch 'beta' into deduptesting 2022-11-30 11:27:29 +01:00
Miriam Baglioni ce020f2c83 [EOSC FUTURE] added resources and test for review 2022-11-30 09:57:30 +01:00
Miriam Baglioni bb0ddc1c44 [BulkTag] adding verb starts_with 2022-11-30 09:56:24 +01:00
Claudio Atzori 8e3edba318 [graph cleaning] testing the collectedfron and hostedby patch procedure 2022-11-29 16:07:09 +01:00
Claudio Atzori 58c05731f9 [graph cleaning] WIP: testing the collectedfron and hostedby patch procedure 2022-11-29 11:21:51 +01:00
Miriam Baglioni 9c70c5dbd6 [Bulk Tag horizontal] added new path in definition of constraint (to recognize fos subjects) - changed test and resource class to test this new aspect 2022-11-28 14:51:20 +01:00
Miriam Baglioni 0628df7a3a resolving conflicts 2022-11-28 10:44:56 +01:00
Claudio Atzori 11695ba649 [graph cleaning] patch also the result's collectedfrom and hostedby datasource name according to the datasource master-duplicate mapping 2022-11-28 10:18:43 +01:00
Claudio Atzori 6082d235d3 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into graph_cleaning 2022-11-28 09:54:48 +01:00
Claudio Atzori 24ef301cc1 [graph cleaning] patch the result's collectedfrom and hostedby identifiers according to the datasource master-duplicate mapping 2022-11-28 09:54:18 +01:00
Alessia Bardi 90c8f9cb61 tests for EOSC Future 2022-11-23 12:18:44 +01:00
Miriam Baglioni 0e3edc5018 [Bulk Tag] fixed issue in verb name 2022-11-23 11:26:36 +01:00
Claudio Atzori a79c47522d updated ORCID datasource identifier 2022-11-23 10:17:49 +01:00
Alessia Bardi 2832117f23 added eoscifguidelines in test 2022-11-22 18:01:12 +01:00
Alessia Bardi 3c08269a4d Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-11-22 17:31:00 +01:00
Alessia Bardi 2687fc9f73 tests for EOSC Future review - ROhub 2022-11-22 17:30:56 +01:00
Claudio Atzori 1d5143b0b6 Merge branch 'beta' into deduptesting 2022-11-22 10:21:30 +01:00
Claudio Atzori 0aa725083f extended dedup testing 2022-11-17 16:13:43 +01:00
Claudio Atzori 3dbc637d3e code formatting 2022-11-17 09:55:41 +01:00
Claudio Atzori ddff0e8999 merging duplicates using IdentifierComparator 2022-11-11 16:10:25 +01:00
Claudio Atzori 5af5a8ae42 added IdentifierComparator 2022-11-09 14:20:59 +01:00
Claudio Atzori 7c3390ac10 Merge branch 'beta' into eoscifguidelines-from-mdstores 2022-11-07 12:18:40 +01:00
dimitrispie 992fc5b628 Added McMaster University Institution 2022-11-03 11:02:18 +02:00
dimitrispie 7fda05e380 Added Autonomous University of Barcelona 2022-11-01 13:59:40 +02:00
Claudio Atzori 22873c9172 Merge pull request 'Added fields: totalcost, fundedamount, currency, in project table' (#257) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #257
2022-10-31 13:49:27 +01:00
dimitrispie 7861c472e0 Hive memory parameters 2022-10-28 19:00:32 +03:00
dimitrispie 5df9c63963 Added fields: totalcost, fundedamount, currency, in project table 2022-10-27 16:44:26 +03:00
Sandro La Bruzzo 2b9a20a4a3 Changed the way Scholexplorer filter the relationships, I found that filter all relation coming from openCitation is wrong, because we loose a lot of relation than intersect OpenCitation, but they don't come only from there 2022-10-24 12:53:47 +02:00
Alessia Bardi 208ed32315 fixed xpath for semantic relation 2022-10-23 18:18:13 +02:00
Alessia Bardi ee759ac92d file format after mvn compile 2022-10-23 18:09:47 +02:00
Alessia Bardi 31a10f000b Map the field oaf:eoscifguidelines from mdstores. Currently we can find it in ROHub metadata 2022-10-23 18:05:37 +02:00
Claudio Atzori ec39b84898 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-10-19 15:21:02 +02:00
Claudio Atzori bca4a61710 suppressing hyper verbose spark logs during unit test execution 2022-10-19 15:20:58 +02:00
Sandro La Bruzzo 72f0d88d6c formatted code 2022-10-19 14:18:42 +02:00
Claudio Atzori 9b449110c6 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-10-14 15:48:04 +02:00
Claudio Atzori ae7cd0735a [graph2hive] more partitions 2022-10-14 15:47:58 +02:00
Sandro La Bruzzo 135cf81151 Merge remote-tracking branch 'origin/beta' into beta 2022-10-13 11:47:25 +02:00
Sandro La Bruzzo a1f94530a3 added documentation 2022-10-13 11:47:11 +02:00
Claudio Atzori b47aaf4dd1 [cleaning] subjects declared as belonging to specific vocabularies whose values are not found in the vocab are set to type keyword 2022-10-13 11:23:43 +02:00
Claudio Atzori 6163ecbf63 [cleaning] renamed parameters in wf action 2022-10-11 11:20:03 +02:00
Claudio Atzori b301e9fdff [cleaning] renamed action name/description 2022-10-11 11:08:52 +02:00
Claudio Atzori ece40adc09 [cleaning] fixing NPE in the country cleaning phase 2022-10-11 10:10:20 +02:00
Claudio Atzori d51275a965 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-10-07 09:52:49 +02:00
Claudio Atzori 8d97949316 [cleaning] fixed loop in wf nodes 2022-10-07 09:52:45 +02:00
Miriam Baglioni 4d8339614b Revert "[BipFinder] Fixed issue for wrong escaped char in doi"
This reverts commit 188f25eefa.
2022-10-04 14:29:47 +02:00
Miriam Baglioni 7324853a17 Revert "[BipFinder] refactoring"
This reverts commit 28dc317350.
2022-10-04 14:29:39 +02:00
Miriam Baglioni 28dc317350 [BipFinder] refactoring 2022-10-04 09:47:27 +02:00
Miriam Baglioni 188f25eefa [BipFinder] Fixed issue for wrong escaped char in doi 2022-10-03 12:42:52 +02:00
Claudio Atzori 89f7007080 Merge pull request '[stats wf] misc changes' (#254) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #254
2022-10-03 10:32:05 +02:00
dimitrispie 2c0c3f1806 Cast amount to float for table result_apcs 2022-09-28 19:33:24 +03:00
Alessia Bardi 49360770d7 map w3id as instance url 2022-09-28 14:16:39 +02:00