Commit Graph

3605 Commits

Author SHA1 Message Date
Claudio Atzori 24ef301cc1 [graph cleaning] patch the result's collectedfrom and hostedby identifiers according to the datasource master-duplicate mapping 2022-11-28 09:54:18 +01:00
Alessia Bardi 90c8f9cb61 tests for EOSC Future 2022-11-23 12:18:44 +01:00
Miriam Baglioni 0e3edc5018 [Bulk Tag] fixed issue in verb name 2022-11-23 11:26:36 +01:00
Claudio Atzori a79c47522d updated ORCID datasource identifier 2022-11-23 10:17:49 +01:00
Alessia Bardi 2832117f23 added eoscifguidelines in test 2022-11-22 18:01:12 +01:00
Alessia Bardi 3c08269a4d Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-11-22 17:31:00 +01:00
Alessia Bardi 2687fc9f73 tests for EOSC Future review - ROhub 2022-11-22 17:30:56 +01:00
Claudio Atzori 1d5143b0b6 Merge branch 'beta' into deduptesting 2022-11-22 10:21:30 +01:00
Claudio Atzori 0aa725083f extended dedup testing 2022-11-17 16:13:43 +01:00
Claudio Atzori 3dbc637d3e code formatting 2022-11-17 09:55:41 +01:00
Claudio Atzori ddff0e8999 merging duplicates using IdentifierComparator 2022-11-11 16:10:25 +01:00
Claudio Atzori 5af5a8ae42 added IdentifierComparator 2022-11-09 14:20:59 +01:00
Claudio Atzori 7c3390ac10 Merge branch 'beta' into eoscifguidelines-from-mdstores 2022-11-07 12:18:40 +01:00
dimitrispie 992fc5b628 Added McMaster University Institution 2022-11-03 11:02:18 +02:00
dimitrispie 7fda05e380 Added Autonomous University of Barcelona 2022-11-01 13:59:40 +02:00
Claudio Atzori 22873c9172 Merge pull request 'Added fields: totalcost, fundedamount, currency, in project table' (#257) from antonis.lempesis/dnet-hadoop:beta into beta (Reviewed-on: D-Net/dnet-hadoop#257) 2022-10-31 13:49:27 +01:00
dimitrispie 7861c472e0 Hive memory parameters 2022-10-28 19:00:32 +03:00
dimitrispie 5df9c63963 Added fields: totalcost, fundedamount, currency, in project table 2022-10-27 16:44:26 +03:00
Sandro La Bruzzo 2b9a20a4a3 Changed the way Scholexplorer filters the relationships: filtering out all relations coming from OpenCitation is wrong, because we lose many relations that intersect OpenCitation but do not come only from there 2022-10-24 12:53:47 +02:00
Alessia Bardi 208ed32315 fixed xpath for semantic relation 2022-10-23 18:18:13 +02:00
Alessia Bardi ee759ac92d file format after mvn compile 2022-10-23 18:09:47 +02:00
Alessia Bardi 31a10f000b Map the field oaf:eoscifguidelines from mdstores. Currently we can find it in ROHub metadata 2022-10-23 18:05:37 +02:00
Claudio Atzori ec39b84898 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-10-19 15:21:02 +02:00
Claudio Atzori bca4a61710 suppressing hyper verbose spark logs during unit test execution 2022-10-19 15:20:58 +02:00
Sandro La Bruzzo 72f0d88d6c formatted code 2022-10-19 14:18:42 +02:00
Claudio Atzori 9b449110c6 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-10-14 15:48:04 +02:00
Claudio Atzori ae7cd0735a [graph2hive] more partitions 2022-10-14 15:47:58 +02:00
Sandro La Bruzzo 135cf81151 Merge remote-tracking branch 'origin/beta' into beta 2022-10-13 11:47:25 +02:00
Sandro La Bruzzo a1f94530a3 added documentation 2022-10-13 11:47:11 +02:00
Claudio Atzori b47aaf4dd1 [cleaning] subjects declared as belonging to specific vocabularies whose values are not found in the vocab are set to type keyword 2022-10-13 11:23:43 +02:00
Claudio Atzori 6163ecbf63 [cleaning] renamed parameters in wf action 2022-10-11 11:20:03 +02:00
Claudio Atzori b301e9fdff [cleaning] renamed action name/description 2022-10-11 11:08:52 +02:00
Claudio Atzori ece40adc09 [cleaning] fixing NPE in the country cleaning phase 2022-10-11 10:10:20 +02:00
Claudio Atzori d51275a965 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-10-07 09:52:49 +02:00
Claudio Atzori 8d97949316 [cleaning] fixed loop in wf nodes 2022-10-07 09:52:45 +02:00
Miriam Baglioni 4d8339614b Revert "[BipFinder] Fixed issue for wrong escaped char in doi" (This reverts commit 188f25eefa.) 2022-10-04 14:29:47 +02:00
Miriam Baglioni 7324853a17 Revert "[BipFinder] refactoring" (This reverts commit 28dc317350.) 2022-10-04 14:29:39 +02:00
Miriam Baglioni 28dc317350 [BipFinder] refactoring 2022-10-04 09:47:27 +02:00
Miriam Baglioni 188f25eefa [BipFinder] Fixed issue for wrong escaped char in doi 2022-10-03 12:42:52 +02:00
Claudio Atzori 89f7007080 Merge pull request '[stats wf] misc changes' (#254) from antonis.lempesis/dnet-hadoop:beta into beta (Reviewed-on: D-Net/dnet-hadoop#254) 2022-10-03 10:32:05 +02:00
dimitrispie 2c0c3f1806 Cast amount to float for table result_apcs 2022-09-28 19:33:24 +03:00
Alessia Bardi 49360770d7 map w3id as instance url 2022-09-28 14:16:39 +02:00
dimitrispie bdc46e3eaa Remove denormalization of results to fix downloads numbers in monitor 2022-09-28 14:59:08 +03:00
dimitrispie 2ebb1459a9 Fixed type in no_downloads 2022-09-28 14:36:57 +03:00
Miriam Baglioni b5b5a4c192 [CleanCountry] fixed issue 2022-09-28 12:42:51 +02:00
Miriam Baglioni f1d7d45cf7 [BulkTag] fixed issue 2022-09-28 12:01:43 +02:00
Miriam Baglioni 3ec044600d [BulkTag] fixed conflicts 2022-09-28 11:58:28 +02:00
Miriam Baglioni 1cb79719a7 [BulkTag] fixed issues 2022-09-28 11:44:55 +02:00
Claudio Atzori f3f7604e6c trying to fix a test that fails only on Jenkins 2022-09-27 15:21:37 +02:00
Claudio Atzori 3f90d159e3 code formatting 2022-09-27 15:08:00 +02:00
Claudio Atzori 0b3e44e521 Merge branch 'beta' into relation-from-odf 2022-09-27 14:57:01 +02:00
Claudio Atzori 57dbeb08d2 code formatting 2022-09-27 14:55:10 +02:00
Claudio Atzori b60985cf68 Merge branch 'beta' into horizontalConstraints 2022-09-27 14:39:31 +02:00
Claudio Atzori 3b60642ef9 Merge pull request 'Synchronize indicators in stats-db with monitor-db' (#249) from antonis.lempesis/dnet-hadoop:beta into beta (Reviewed-on: D-Net/dnet-hadoop#249) 2022-09-27 14:37:33 +02:00
Claudio Atzori 25e9d92aad Merge branch 'beta' into clean_country 2022-09-27 14:27:49 +02:00
Alessia Bardi fd63e9bfac Mapping all relationships supported in ModelConstants and ModelSupport 2022-09-26 11:24:13 +02:00
Miriam Baglioni ca216a92ad [BulkTagging] changed the query to the IS to insert values for FOS and SDG as subject in the configuration used for the tagging 2022-09-23 17:06:07 +02:00
Miriam Baglioni 3e6b0f58bb [BulkTagging] changed the query to the IS to get also the information for the advancedConstraint from the profile 2022-09-23 16:47:19 +02:00
Miriam Baglioni 4a3e119b73 merging with branch beta 2022-09-23 16:16:06 +02:00
Miriam Baglioni f0e303abf9 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-09-23 16:15:32 +02:00
Miriam Baglioni 55da4d8715 [BulkTagging] modifying code to represent constraints horizontally on all the results. Added subject to the set of fields used to express the constraint. Modified resources to test the new approach. Modified test class 2022-09-23 16:02:19 +02:00
Alessia Bardi c5eb722170 relationships from relatedIdentifier whose target id type is one of the pid type with an authority 2022-09-23 15:47:05 +02:00
Claudio Atzori c86cc53520 suppressing hyper verbose spark logs during unit test execution 2022-09-23 15:20:40 +02:00
Alessia Bardi ba33ff71fd refactoring for the generation of relationships from related identifier of type 'OPENAIRE' 2022-09-23 15:17:13 +02:00
Alessia Bardi 982bcc1e35 test wrid pid and record identifier 2022-09-23 12:06:06 +02:00
Miriam Baglioni 960cb861a0 refactoring 2022-09-23 11:14:04 +02:00
Claudio Atzori c42850328e fixed semantic (subreltype) for ServiceOrganization relations 2022-09-22 16:23:25 +02:00
Miriam Baglioni 33bb79459e Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2022-09-22 15:55:17 +02:00
dimitrispie dcd85f8cd7 Synchronize indicators in stats-db with monitor-db; added new openorg id for Nanyang Technological University; changed openorg id for University of Helsinki #8088 ticket 2022-09-22 13:33:07 +03:00
Claudio Atzori e45ec15221 Merge branch 'beta' into clean_country 2022-09-19 11:34:02 +02:00
Claudio Atzori 26e1badded added instance.url syntactical validation, avoid creating multiple duplicated URLs 2022-09-19 11:19:10 +02:00
Miriam Baglioni 5240ac3d7b [EOSC Tag] remove addition of eosc context for result with eosc if guidelines set 2022-09-19 11:02:18 +02:00
Claudio Atzori 192215a18e merged from branch discard-non-wellformed 2022-09-19 10:17:10 +02:00
Claudio Atzori e370e940d8 [aggregator graph] save invalid records aside for further inspection 2022-09-16 14:06:28 +02:00
Claudio Atzori 465e941214 Merge pull request '[stats wf] Changes to indicators tables' (#244) from antonis.lempesis/dnet-hadoop:beta into beta (Reviewed-on: D-Net/dnet-hadoop#244) 2022-09-16 10:13:58 +02:00
Claudio Atzori 1e42d984e1 [aggregator graph] save invalid records aside for further inspection 2022-09-15 10:49:42 +02:00
Alessia Bardi 9e7ec4198f fixed test 2022-09-14 18:08:56 +02:00
Claudio Atzori c48f6e9c57 [aggregator graph] save invalid records aside for further inspection 2022-09-14 17:11:26 +02:00
dimitrispie 3bf3127251 Changes to monitor and indicator scripts 2022-09-14 16:36:19 +03:00
Claudio Atzori a0919ed495 [aggregator graph] save invalid records aside for further inspection 2022-09-14 13:27:39 +02:00
Alessia Bardi b99a011345 return empty Oaf list if record cannot be parsed 2022-09-13 11:51:55 +02:00
Alessia Bardi 27af5122d2 logs for non well formed XML files 2022-09-12 14:25:23 +02:00
Claudio Atzori ff6f789b6d code formatting 2022-09-09 15:16:31 +02:00
Claudio Atzori b5d6966c01 Merge branch 'beta' into clean_country 2022-09-09 12:20:19 +02:00
Claudio Atzori b5f7bd30be Merge branch 'beta' into clean_subjects 2022-09-09 12:20:04 +02:00
Alessia Bardi f14107ad77 Merge branch 'handle_as_instance_urls' of https://code-repo.d4science.org/D-Net/dnet-hadoop into handle_as_instance_urls 2022-09-09 12:17:19 +02:00
Alessia Bardi a539c6ccaf https for handle URLs 2022-09-09 12:16:28 +02:00
dimitrispie 71b069ca90 Changes to indicator and monitor scripts 2022-09-09 13:15:58 +03:00
Claudio Atzori 1203378441 Merge branch 'beta' into clean_subjects 2022-09-09 10:38:47 +02:00
Claudio Atzori 14dc909a14 Merge branch 'beta' into clean_country 2022-09-09 10:38:17 +02:00
Claudio Atzori 853c996fa2 Merge branch 'beta' into handle_as_instance_urls 2022-09-09 09:47:16 +02:00
Claudio Atzori a431e01383 Merge pull request 'orcid_multipleworks_download' (#242) from enrico.ottonello/dnet-hadoop:orcid_multipleworks_download into beta (Reviewed-on: D-Net/dnet-hadoop#242) 2022-09-09 08:45:02 +02:00
Alessia Bardi 9ef063d502 #7861#note-8 instance url from handle 2022-09-07 17:29:54 +03:00
Alessia Bardi 5c45d52af3 testing for RiuNet 2022-09-07 15:40:57 +03:00
dimitrispie 2b5f8c9c9a comment out duplicate table creation 2022-09-06 12:27:53 +03:00
Alessia Bardi a11eb38065 testing for RO-Hub 2022-09-02 16:07:36 +02:00
Enrico Ottonello bfdf2dc390 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid_multipleworks_download 2022-08-25 12:07:54 +02:00
Enrico Ottonello da1cf561e6 alignment with beta 2022-08-25 11:57:20 +02:00
Enrico Ottonello 27445ccdaa cleaned log 2022-08-25 11:56:14 +02:00
Claudio Atzori b7c387c21f cleaning of subjects: avoid duplicated subjects, prioritise collected vs inferred or other sources 2022-08-12 15:09:16 +02:00