Commit Graph

3672 Commits

Author SHA1 Message Date
Miriam Baglioni 016337a0f9 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-02-16 15:54:59 +01:00
Sandro La Bruzzo 118c1fc3b3 Merge remote-tracking branch 'origin/beta' into beta 2023-02-15 10:29:28 +01:00
Sandro La Bruzzo a8ac79fa25 Added citation relation on crossref Mapping 2023-02-15 10:29:13 +01:00
Claudio Atzori 9a03f71db1 code formatting 2023-02-13 16:25:47 +01:00
Michele Artini 554df257ab null values in date range conditions 2023-02-13 16:15:32 +01:00
Miriam Baglioni 5cf902a2b0 [UsageCount] changed query so that the sum is computed via SQL instead of grouping 2023-02-10 16:16:37 +01:00
Miriam Baglioni f803530df6 [UsageCount] fixed query 2023-02-10 15:50:56 +01:00
Miriam Baglioni bb5bba51b3 [UsageCount] extended test 2023-02-09 19:08:30 +01:00
Miriam Baglioni 85e53fad00 [UsageCount] addition of usagecount for projects and datasources. Extension of the action set created for the results with new entities for projects and datasources. Extension of the resource set and modification of the testing class 2023-02-09 18:59:45 +01:00
Sandro La Bruzzo 8920932dd8 Code formatted 2023-02-08 11:34:18 +01:00
Sandro La Bruzzo 0b9819f1ab Code formatted 2023-02-08 10:32:33 +01:00
Sandro La Bruzzo 6c81a161d2 Merge remote-tracking branch 'origin/beta' into 8231-mdstore-synch-improve 2023-02-08 10:29:09 +01:00
dimitrispie 98c34263ed Update step20-createMonitorDB.sql: Add University of Cape Town organization 2023-02-07 08:14:48 +02:00
dimitrispie 973d78a4d6 Update step15_5.sql: Added unpaywalls open access colors 2023-02-02 08:03:54 +02:00
Claudio Atzori d05ca53a14 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-01-31 14:39:53 +01:00
Miriam Baglioni e82e009b46 added missing close tag for XML produced by the xquery to get information for the community from the IS 2023-01-31 10:19:34 +01:00
Miriam Baglioni b254a0375f [Affiliation from institutionalrepo] changed the field checked to verify the datasource type; it is now the jurisdiction field 2023-01-26 16:51:20 +01:00
dimitrispie db7d625ba9 Added Arts et Métiers ParisTech organization 2023-01-25 12:22:21 +02:00
Claudio Atzori 505867bce9 [bulk tagging] better node naming 2023-01-20 16:13:16 +01:00
Miriam Baglioni ecd398fe51 refactoring 2023-01-20 14:23:45 +01:00
Miriam Baglioni 0a5c6010b0 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-01-13 16:14:46 +01:00
dimitrispie dd70c32ad7 Bug fixes 2023-01-12 17:18:05 +02:00
dimitrispie 51f7ab5864 Bug fixes 2023-01-12 17:15:06 +02:00
dimitrispie 686580a220 New Monitor DB workflow; new organization added 2023-01-12 11:18:03 +02:00
Claudio Atzori 0a58bc7ba7 [broker] prevent NPEs 2023-01-11 14:44:14 +01:00
Claudio Atzori 04cb96001c [broker] d40e20f437 adapted to the beta graph model 2023-01-11 10:10:12 +01:00
Michele Artini 91b845f611 Considering instance pids and alternative identifiers 2023-01-11 09:58:54 +01:00
Miriam Baglioni 1f367122e4 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-01-11 09:47:44 +01:00
Michele Artini 7b7520850b fixed an invalid char 2023-01-11 09:22:18 +01:00
Miriam Baglioni d6895f0387 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-01-09 17:28:38 +01:00
dimitrispie becb242c17 Monitor DB only Workflow 2023-01-04 16:50:29 +02:00
dimitrispie 592013d5dd Added more steps in decision node 2022-12-23 09:43:16 +02:00
dimitrispie 6449ff4207 1. Added a decision node that enables the workflow to select the execution path to follow; 2. Added new organization; 3. Added 5 new tables from Eurostat 2022-12-22 10:18:21 +02:00
Miriam Baglioni 8893389895 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-12-21 12:42:27 +01:00
Sandro La Bruzzo 3c9826f186 replaced the lines function with its implementation, linesWithSeparators.map(l => l.stripLineEnd); this forces the Scala compiler plugin to treat the pipeline as Scala code and not as a java.lang.String.lines() pipeline 2022-12-21 11:21:17 +01:00
Claudio Atzori 6aa91204a5 [orcid propagation] skip empty directories 2022-12-20 14:15:46 +01:00
Miriam Baglioni 6674cccb94 [BulkTag] description of parameters more comprehensive for those who do not implement it 2022-12-16 15:33:20 +01:00
Miriam Baglioni f37113a941 [BulkTag] moving xquery to get community configuration in dedicated file 2022-12-16 15:32:26 +01:00
Miriam Baglioni 8685eaa706 [Clean Country] added test to verify removal of country 2022-12-16 15:31:25 +01:00
Miriam Baglioni dc0ec88a58 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-12-16 13:18:32 +01:00
Miriam Baglioni d791840b82 [Clean Country] added test to verify removal of country 2022-12-16 13:18:29 +01:00
Claudio Atzori 7b80b24f82 [cleaning] country cleaning must use both PID and AlternateIdentifier fields 2022-12-15 14:49:04 +01:00
Claudio Atzori b8bafab8a0 [cleaning] improved vocabulary based mapping, specialization for the strict vocab cleaning 2022-12-12 14:43:03 +01:00
Sandro La Bruzzo 5e4866d033 implemented synch for single mdstore 2022-12-12 11:29:46 +01:00
Claudio Atzori c18b8048c3 [cleaning] avoid NPE 2022-12-10 11:41:38 +01:00
Claudio Atzori 8b44afe5e5 [cleaning] avoid NPE 2022-12-09 15:44:57 +01:00
Claudio Atzori 389dd25430 [cleaning] avoid NPE 2022-12-08 18:40:48 +01:00
Claudio Atzori 730228d73d [cleaning] align wf parameter names in test 2022-12-08 18:40:22 +01:00
Claudio Atzori 2094fa6db0 [cleaning] align wf parameter names 2022-12-08 17:22:26 +01:00
Miriam Baglioni a485a94956 [Cleaning] fixed parameter name in property file 2022-12-08 16:59:34 +01:00
Miriam Baglioni 3d99b78d94 [Cleaning] fixed error in parameter (workingPath to workingDir) 2022-12-08 10:25:02 +01:00
Claudio Atzori 1b8488976b code formatting 2022-12-07 10:45:38 +01:00
Claudio Atzori cd1b58483e [bulk tag] fixed Community configuration parsing to avoid NPE 2022-12-07 10:39:00 +01:00
Claudio Atzori 062abfd669 fixed NPE, removed unused stuff 2022-12-06 12:04:00 +01:00
dimitrispie 2a52a42169 Added 4 institutions: University of Modena and Reggio Emilia, Bilkent University, Saints Cyril and Methodius University of Skopje, University of Milan 2022-12-06 10:10:21 +02:00
Claudio Atzori 8248da40d9 Merge branch 'beta' into graph_cleaning 2022-12-02 14:49:00 +01:00
Claudio Atzori ddf065756f Merge pull request 'Two organizations are added for monitor' (#258) from antonis.lempesis/dnet-hadoop:beta into beta (Reviewed-on: D-Net/dnet-hadoop#258) 2022-12-02 14:45:27 +01:00
Sandro La Bruzzo 5a48a2fb18 implemented synch for single mdstore 2022-12-01 11:34:43 +01:00
Claudio Atzori a38116546d Merge branch 'beta' into deduptesting 2022-11-30 11:27:29 +01:00
Miriam Baglioni ce020f2c83 [EOSC FUTURE] added resources and test for review 2022-11-30 09:57:30 +01:00
Miriam Baglioni bb0ddc1c44 [BulkTag] adding verb starts_with 2022-11-30 09:56:24 +01:00
Claudio Atzori 8e3edba318 [graph cleaning] testing the collectedfrom and hostedby patch procedure 2022-11-29 16:07:09 +01:00
Claudio Atzori 58c05731f9 [graph cleaning] WIP: testing the collectedfrom and hostedby patch procedure 2022-11-29 11:21:51 +01:00
Miriam Baglioni 9c70c5dbd6 [Bulk Tag horizontal] added new path in definition of constraint (to recognize fos subjects) - changed test and resource class to test this new aspect 2022-11-28 14:51:20 +01:00
Miriam Baglioni 0628df7a3a resolving conflicts 2022-11-28 10:44:56 +01:00
Claudio Atzori 11695ba649 [graph cleaning] patch also the result's collectedfrom and hostedby datasource name according to the datasource master-duplicate mapping 2022-11-28 10:18:43 +01:00
Claudio Atzori 6082d235d3 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into graph_cleaning 2022-11-28 09:54:48 +01:00
Claudio Atzori 24ef301cc1 [graph cleaning] patch the result's collectedfrom and hostedby identifiers according to the datasource master-duplicate mapping 2022-11-28 09:54:18 +01:00
Alessia Bardi 90c8f9cb61 tests for EOSC Future 2022-11-23 12:18:44 +01:00
Miriam Baglioni 0e3edc5018 [Bulk Tag] fixed issue in verb name 2022-11-23 11:26:36 +01:00
Claudio Atzori a79c47522d updated ORCID datasource identifier 2022-11-23 10:17:49 +01:00
Alessia Bardi 2832117f23 added eoscifguidelines in test 2022-11-22 18:01:12 +01:00
Alessia Bardi 3c08269a4d Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-11-22 17:31:00 +01:00
Alessia Bardi 2687fc9f73 tests for EOSC Future review - ROhub 2022-11-22 17:30:56 +01:00
Claudio Atzori 1d5143b0b6 Merge branch 'beta' into deduptesting 2022-11-22 10:21:30 +01:00
Claudio Atzori 0aa725083f extended dedup testing 2022-11-17 16:13:43 +01:00
Claudio Atzori 3dbc637d3e code formatting 2022-11-17 09:55:41 +01:00
Claudio Atzori ddff0e8999 merging duplicates using IdentifierComparator 2022-11-11 16:10:25 +01:00
Claudio Atzori 5af5a8ae42 added IdentifierComparator 2022-11-09 14:20:59 +01:00
Claudio Atzori 7c3390ac10 Merge branch 'beta' into eoscifguidelines-from-mdstores 2022-11-07 12:18:40 +01:00
dimitrispie 992fc5b628 Added McMaster University Institution 2022-11-03 11:02:18 +02:00
dimitrispie 7fda05e380 Added Autonomous University of Barcelona 2022-11-01 13:59:40 +02:00
Claudio Atzori 22873c9172 Merge pull request 'Added fields: totalcost, fundedamount, currency, in project table' (#257) from antonis.lempesis/dnet-hadoop:beta into beta (Reviewed-on: D-Net/dnet-hadoop#257) 2022-10-31 13:49:27 +01:00
dimitrispie 7861c472e0 Hive memory parameters 2022-10-28 19:00:32 +03:00
dimitrispie 5df9c63963 Added fields: totalcost, fundedamount, currency, in project table 2022-10-27 16:44:26 +03:00
Sandro La Bruzzo 2b9a20a4a3 Changed the way Scholexplorer filters relationships: filtering out all relations coming from OpenCitation is wrong, because we lose many relations that overlap with OpenCitation but do not come only from there 2022-10-24 12:53:47 +02:00
Alessia Bardi 208ed32315 fixed xpath for semantic relation 2022-10-23 18:18:13 +02:00
Alessia Bardi ee759ac92d file format after mvn compile 2022-10-23 18:09:47 +02:00
Alessia Bardi 31a10f000b Map the field oaf:eoscifguidelines from mdstores. Currently we can find it in ROHub metadata 2022-10-23 18:05:37 +02:00
Claudio Atzori ec39b84898 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-10-19 15:21:02 +02:00
Claudio Atzori bca4a61710 suppressing hyper verbose spark logs during unit test execution 2022-10-19 15:20:58 +02:00
Sandro La Bruzzo 72f0d88d6c formatted code 2022-10-19 14:18:42 +02:00
Claudio Atzori 9b449110c6 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-10-14 15:48:04 +02:00
Claudio Atzori ae7cd0735a [graph2hive] more partitions 2022-10-14 15:47:58 +02:00
Sandro La Bruzzo 135cf81151 Merge remote-tracking branch 'origin/beta' into beta 2022-10-13 11:47:25 +02:00
Sandro La Bruzzo a1f94530a3 added documentation 2022-10-13 11:47:11 +02:00
Claudio Atzori b47aaf4dd1 [cleaning] subjects declared as belonging to specific vocabularies whose values are not found in the vocab are set to type keyword 2022-10-13 11:23:43 +02:00
Claudio Atzori 6163ecbf63 [cleaning] renamed parameters in wf action 2022-10-11 11:20:03 +02:00
Claudio Atzori b301e9fdff [cleaning] renamed action name/description 2022-10-11 11:08:52 +02:00
Claudio Atzori ece40adc09 [cleaning] fixing NPE in the country cleaning phase 2022-10-11 10:10:20 +02:00