Commit Graph

4072 Commits

Author SHA1 Message Date
Claudio Atzori c381bacee0 [enrichment] passing the community API base URL 2023-12-07 14:07:11 +01:00
Miriam Baglioni 336fb31d87 [community_result_propagation] adjusting starting poit of workflow 2023-12-07 10:27:25 +01:00
Miriam Baglioni c0cde53bf6 [bulktagging] setting first step of bulktaggin as the copy of the entities and relations not involved in the tagging' 2023-12-07 10:08:35 +01:00
Miriam Baglioni 616622d2bb first version of the workflow single step 2023-12-07 09:59:52 +01:00
Claudio Atzori 259c69e446 [orcid enrichment] fixed workflow definition 2023-12-06 19:41:53 +01:00
Claudio Atzori 431c6bb08a [dedup] added isLookupUrl to the graph consistency workflow definition, required now by the entity grouping phase 2023-12-06 11:06:46 +01:00
Claudio Atzori 321922772b added serialization for the new fields imported for the Irish tender 2023-12-05 16:37:04 +01:00
Claudio Atzori c5b7253130 [community_organization propagation] fixed workflow parameters 2023-12-05 09:13:33 +01:00
Claudio Atzori 3c3bdb8318 [bulktagging] fixed workflow parameters 2023-12-05 09:08:48 +01:00
Claudio Atzori 2a233a89aa [graph grouping] added isLookupUrl to the workflow definition, passed to the grouping spark aciton 2023-12-03 13:32:52 +01:00
Claudio Atzori 178a14c491 code formatting 2023-12-03 13:31:58 +01:00
Sandro La Bruzzo 3caf6ff27e Extracted the correct original type to pass to instanceTypeMapping in Crossref Mapping 2023-12-01 16:33:56 +01:00
Claudio Atzori 511a98dd80 fixed doiboost process workflow, removed references to the ProcessORCID step 2023-12-01 16:21:53 +01:00
Claudio Atzori 09d061e90b Merge branch 'beta' into orcid_import 2023-12-01 15:05:35 +01:00
Claudio Atzori 93a700742a Merge pull request 'Changes for tables and creation of the new indicator indi_is_result_accessible' (#363) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #363
2023-12-01 15:05:23 +01:00
Claudio Atzori 0c3c9ea43d Merge pull request 'StatsDB workflow to export actionsets about OA routes, diamond, and publicly-funded' (#355) from dimitris.pierrakos/dnet-hadoop:beta into beta
Reviewed-on: #355
2023-12-01 15:03:56 +01:00
Claudio Atzori 33cb483c75 using objectSubType as originalType in Crossref2Oaf, code formatting 2023-12-01 15:03:05 +01:00
dimitrispie c9d995dde0 New institutions added 2023-12-01 15:44:35 +02:00
dimitrispie a397112cb8 Add new indicator
Add indi_pub_publicly_funded
2023-12-01 15:00:18 +02:00
dimitrispie 76594ded23 Changes to indicators
Fixes on open access colours indicators
- indi_pub_green_oa
- indi_pub_gold_oa
- indi_pub_hybrid
- indi_pub_bronze_oa
- indi_pub_diamond
2023-12-01 13:38:19 +02:00
Claudio Atzori 622fafbd2e Merge branch 'beta' into orcid_import 2023-12-01 12:28:14 +01:00
Sandro La Bruzzo bf0fd27c36 Removed unused function
Applied PR Comment of Giambattista in the PR
2023-12-01 12:16:42 +01:00
dimitrispie 48430a32a6 Update StatsAtomicActionsJob.java
Added indi_funded_result_with_fundref indicator
2023-12-01 11:35:01 +02:00
Sandro La Bruzzo cdfb7588dd code formatting 2023-11-30 15:31:42 +01:00
Sandro La Bruzzo 5e22b67b8a Merge remote-tracking branch 'origin/beta' into orcid_import 2023-11-30 15:27:46 +01:00
Sandro La Bruzzo f718caaac9 Added copy of the untouched entities of the graph 2023-11-30 14:51:00 +01:00
Sandro La Bruzzo 7b5e04f37e removed Orcid intersection on DOIBoost 2023-11-30 14:36:50 +01:00
Claudio Atzori 6f10791e77 Merge branch 'beta' into propagationapi 2023-11-30 14:20:18 +01:00
Claudio Atzori 4e1aac2e2f resolved conflict in pom.xml before applying the changes from [COAR based resource types & Irish tender] #350 2023-11-29 14:37:52 +01:00
Sandro La Bruzzo 86b5775e08 added vocabulary in instanceTypeMapping for
- DOIBoost
- Datacite
- PubMed
- Scholexplorer Datasource
2023-11-29 13:15:43 +01:00
Sandro La Bruzzo c96ff54b45 Merge remote-tracking branch 'origin/resource_types' into resource_types 2023-11-29 12:45:41 +01:00
Sandro La Bruzzo af1c2634b3 added instanceTypeMapping original field in the mapping of
- DOIBoost
- Datacite
- PubMed
- Scholexplorer Datasource
2023-11-29 12:45:30 +01:00
Sandro La Bruzzo 279100fa52 added test 2023-11-29 11:17:58 +01:00
Sandro La Bruzzo 59111713fa added comment 2023-11-28 09:00:48 +01:00
Sandro La Bruzzo 6f4d0c05ea Implemented Author MErger for ORCID that takes in account the case when name and surname are swapped 2023-11-28 08:43:56 +01:00
Miriam Baglioni 8eb70e6657 refactoring 2023-11-27 15:13:15 +01:00
Miriam Baglioni e3cce9a5a0 mergin with branch beta 2023-11-27 15:10:55 +01:00
Miriam Baglioni 48e0427a23 changed the parameter from production to baseURL. Fixed issue in tagging configuration 2023-11-27 15:10:27 +01:00
Sandro La Bruzzo 34a4b3cbdf Implemented ORCID Enrichment 2023-11-24 12:39:58 +01:00
dimitrispie 359e81b7a6 Update StatsAtomicActionsJob.java
Bug fix for duplicate bronze checks
2023-11-23 10:48:55 +02:00
Claudio Atzori 2c77638bf5 Merge branch 'beta' into cleaning_8898 2023-11-22 14:00:10 +01:00
Claudio Atzori 745039ad5b Merge branch 'beta' into 9117_pubmed_affiliations 2023-11-22 13:52:53 +01:00
Claudio Atzori 11a1207f9c [graph cleaning] applying coar based vocabularies in bulk 2023-11-22 12:22:14 +01:00
dimitrispie a94a54a2d0 Changes for tables and creation of the new indicator indi_is_result_accessible
- Drop table statements for all tables to avoid duplicates in case of wf rerun
- Add pdfsaggregated step to create the indi_is_result_accessible table. This step is executed on the new impala cluster only, since the pdfaggregation_i is updated on this cluster.
2023-11-15 14:32:18 +02:00
Miriam Baglioni eaf0a702de - 2023-11-14 14:53:34 +01:00
Sandro La Bruzzo 6ce36b3e41 Implemented ORCID Workflow on DHP-Aggregation for retrieving ORCID DUMP and generating tables 2023-11-14 12:04:29 +01:00
dimitrispie d524e30866 Changes to actionsets
Resolve comments from
#355
2023-11-14 09:46:52 +02:00
Miriam Baglioni 5bc97615d5 - 2023-11-03 15:35:10 +01:00
Miriam Baglioni 7b1e34f159 refactoring 2023-11-03 15:30:01 +01:00
Miriam Baglioni 638ad9e74f changing test for new implementation 2023-11-03 15:06:50 +01:00
Miriam Baglioni edcb17ca98 refactoring and test 2023-11-03 13:01:14 +01:00
Miriam Baglioni 937ff6a7c7 - 2023-10-31 15:56:08 +01:00
Miriam Baglioni a737dd47b6 removed not needed test class 2023-10-31 15:54:49 +01:00
Miriam Baglioni c80b768af0 test for project propagation 2023-10-31 15:49:42 +01:00
Miriam Baglioni e9a20fc8f6 mergin with branch beta 2023-10-31 14:36:03 +01:00
Claudio Atzori 262d7c581b [graph cleaning] implemented further suggestions from https://support.openaire.eu/issues/8898 2023-10-31 14:34:10 +01:00
Serafeim Chatzopoulos 2090003ea9 Adjust tests to new WF input params 2023-10-26 13:47:06 -07:00
Serafeim Chatzopoulos a82aaf57b2 Renaming input param for crossref input path 2023-10-25 12:05:02 -07:00
Claudio Atzori b3a61ea955 Merge branch 'beta' into url_validation 2023-10-25 14:22:56 +02:00
dimitrispie 89c4dfbaf4 StatsDB workflow to export actionsets about OA routes, diamond, and publicly-funded
A new oozie workflow capable to read from the stats db to produce a new actionSet for updating results with:
- green_oa ={true, false}
- openAccesColor = {gold, hybrid, bronze}
- in_diamond_journal={true, false}
- publicly_funded={true, false}

Inputs:

- outputPath
- statsDB
2023-10-24 09:48:23 +03:00
Claudio Atzori 7fc621cdec added defaults to the graph resolution workflow config-default.xml 2023-10-20 22:28:12 +02:00
Serafeim Chatzopoulos aad5982bf1 Change the description of the workflow 2023-10-20 12:48:21 +03:00
Miriam Baglioni a4214ced1e fixing issue on propagation organization. added --config to workflow definition. added oozie_app to communtiy project 2023-10-20 10:14:20 +02:00
Serafeim Chatzopoulos 6b19dcee80 Add actionset creation for pubmed affiliations 2023-10-19 19:58:25 +03:00
Claudio Atzori 2b9d0416ec [graph raw] URL Validator to accept double slashes 2023-10-19 16:26:37 +02:00
Claudio Atzori b0fed1725e avoid NPEs 2023-10-19 12:13:45 +02:00
Miriam Baglioni f1b898c6b4 mergin with branch beta 2023-10-19 09:04:35 +02:00
Claudio Atzori 6dfcd0c9a2 [raw graph] mapping original resource types 2023-10-16 12:57:18 +02:00
Claudio Atzori 39d24d5469 Merge branch 'beta' into resource_types 2023-10-16 11:56:38 +02:00
Sandro La Bruzzo a5a89a702f new spark parrameter updated 2023-10-16 11:46:12 +02:00
Miriam Baglioni 159388f9c2 testing and fix some issues 2023-10-16 11:26:07 +02:00
Claudio Atzori 03670bb9ce [dedup] use common saveParquet and save methods to ensure outputs are compressed 2023-10-16 10:55:47 +02:00
Claudio Atzori 54fbf09ac6 [raw graph] WIP: mapping original resource types 2023-10-16 08:57:47 +02:00
Claudio Atzori 6cf64d5d8b [SWH] renamed 'Software Heritage Identifier' to 'Software Hash Identifier' 2023-10-13 10:09:26 +02:00
Claudio Atzori 76447958bb cleanup & docs 2023-10-12 12:23:20 +02:00
Claudio Atzori dda602fff7 [AMF] docs 2023-10-12 10:05:46 +02:00
Miriam Baglioni 8e9493fad9 mergin with branch beta 2023-10-11 18:18:09 +02:00
Miriam Baglioni 89184d5b4f used the API instead of the IS for bulktagging and propagation for community through organization. Added a new propagation step for communities through projects. Still using the API and not the IS 2023-10-11 18:17:35 +02:00
Claudio Atzori 554551682d [raw graph] adopting the new COAR based vocabularies for the resource typing 2023-10-11 16:09:19 +02:00
Claudio Atzori a460ebe215 [UnresolvedEntities] updated action name 2023-10-10 15:50:11 +02:00
Claudio Atzori 66064e99fe Merge branch 'beta' into fos 2023-10-10 15:07:21 +02:00
Miriam Baglioni a431b04814 leftover for the properties and removal of bipfinder 2023-10-10 12:53:57 +02:00
Claudio Atzori ed9282ef2a removed module dhp-stats-monitor-update 2023-10-10 09:52:03 +02:00
Miriam Baglioni 110ce4b40f extend the fos model to include the level4 and the scores for level3 and level4. removed bip indicators from the instance 2023-10-10 09:46:40 +02:00
Claudio Atzori 204404b0e3 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-10-10 09:36:13 +02:00
Claudio Atzori 9a98f408b3 code formatting 2023-10-10 09:36:11 +02:00
Claudio Atzori 4e6fccf4f6 Merge pull request 'Beta stats wf updated' (#332) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #332
2023-10-10 09:35:32 +02:00
Miriam Baglioni a3d01ccb24 refactoring 2023-10-09 14:52:17 +02:00
Miriam Baglioni 8448b9ebfb mergin with branch beta 2023-10-09 14:27:23 +02:00
Miriam Baglioni 3d6be20989 changes to use the API instead of the IS the get the information for the communities to be used during bulktagging and context propagation 2023-10-09 14:26:33 +02:00
dimitrispie 17586f0ff8 Update step20-createMonitorDB.sql
Add result_orcid table to monitor dbs
2023-10-09 14:21:31 +03:00
dimitrispie 489a082f04 Update step16-createIndicatorsTables.sql
Change scripts for gold, hybrid, bronze indicators
2023-10-09 14:00:50 +03:00
Claudio Atzori ef833840c3 [Doiboost] removed linkage to SFI unidentified project 2023-10-06 15:48:18 +02:00
Claudio Atzori 84a58802ab [OC] using the common pid cleaning function 2023-10-06 14:48:05 +02:00
Claudio Atzori 46034630cf [OC] compress the output actionset 2023-10-06 14:42:02 +02:00
Claudio Atzori 3bc44fbf1d Merge branch 'beta' into irish_funder 2023-10-06 14:26:41 +02:00
Claudio Atzori 3c23d5f9bc Merge branch 'beta' into SWH_integration 2023-10-06 14:15:38 +02:00
Claudio Atzori 858931ccb6 [SWH] compress the output actionset 2023-10-06 14:03:33 +02:00
Claudio Atzori f759b18bca [SWH] aligned parameter name 2023-10-06 13:43:20 +02:00
Claudio Atzori eed9fe0902 code formatting 2023-10-06 12:31:17 +02:00