Commit Graph

5086 Commits

Author SHA1 Message Date
Miriam Baglioni eaf0a702de - 2023-11-14 14:53:34 +01:00
Sandro La Bruzzo 6ce36b3e41 Implemented ORCID Workflow on DHP-Aggregation for retrieving ORCID DUMP and generating tables 2023-11-14 12:04:29 +01:00
dimitrispie d524e30866 Changes to actionsets
Resolve comments from #355
2023-11-14 09:46:52 +02:00
Serafeim Chatzopoulos 671ba8a5a7 Clear working dir in bipranker workflow 2023-11-07 18:35:05 +02:00
Miriam Baglioni 5bc97615d5 - 2023-11-03 15:35:10 +01:00
Miriam Baglioni 7b1e34f159 refactoring 2023-11-03 15:30:01 +01:00
Miriam Baglioni 638ad9e74f changing test for new implementation 2023-11-03 15:06:50 +01:00
Miriam Baglioni edcb17ca98 refactoring and test 2023-11-03 13:01:14 +01:00
Claudio Atzori 5f1ed61c1f merging from bulkTag branch 2023-11-03 12:51:37 +01:00
Claudio Atzori 8c03c41d5d applying changes from beta 2023-11-03 12:08:39 +01:00
Claudio Atzori 97454e9594 Merge pull request '9117_pubmed_affiliations_prod' (#357) from 9117_pubmed_affiliations_prod into master
Reviewed-on: #357
2023-11-03 11:45:34 +01:00
Serafeim Chatzopoulos 7e34dde774 Renaming input param for crossref input path 2023-11-02 17:47:04 +02:00
Serafeim Chatzopoulos 24c3f92d87 Change the description of the workflow 2023-11-02 17:46:51 +02:00
Serafeim Chatzopoulos 6ce9b600c1 Add actionset creation for pubmed affiliations 2023-11-02 17:46:39 +02:00
Serafeim Chatzopoulos 94089878fd Adjust tests to new WF input params 2023-11-02 17:46:13 +02:00
Miriam Baglioni 937ff6a7c7 - 2023-10-31 15:56:08 +01:00
Miriam Baglioni a737dd47b6 removed unneeded test class 2023-10-31 15:54:49 +01:00
Miriam Baglioni c80b768af0 test for project propagation 2023-10-31 15:49:42 +01:00
Miriam Baglioni e9a20fc8f6 merging with branch beta 2023-10-31 14:36:03 +01:00
Claudio Atzori dde2fec035 [graph cleaning] cleanup 2023-10-31 14:35:33 +01:00
Claudio Atzori 262d7c581b [graph cleaning] implemented further suggestions from https://support.openaire.eu/issues/8898 2023-10-31 14:34:10 +01:00
Serafeim Chatzopoulos 2090003ea9 Adjust tests to new WF input params 2023-10-26 13:47:06 -07:00
Miriam Baglioni 0097f4e64b Removed Query community testing. Removed package from common related to the interaction with Zenodo since it was moved to the dump-project 2023-10-26 09:38:09 +02:00
Serafeim Chatzopoulos a82aaf57b2 Renaming input param for crossref input path 2023-10-25 12:05:02 -07:00
Claudio Atzori b3a61ea955 Merge branch 'beta' into url_validation 2023-10-25 14:22:56 +02:00
dimitrispie 89c4dfbaf4 StatsDB workflow to export actionsets about OA routes, diamond, and publicly-funded
A new Oozie workflow capable of reading from the stats DB to produce a new actionSet for updating results with:
- green_oa = {true, false}
- openAccesColor = {gold, hybrid, bronze}
- in_diamond_journal = {true, false}
- publicly_funded = {true, false}

Inputs:
- outputPath
- statsDB
2023-10-24 09:48:23 +03:00
Miriam Baglioni 5c5a195e97 refactoring and fixing issue on property name 2023-10-23 11:26:17 +02:00
Claudio Atzori a870aa2b09 depending on dhp-schemas:3.17.2 2023-10-20 22:28:39 +02:00
Claudio Atzori 7fc621cdec added defaults to the graph resolution workflow config-default.xml 2023-10-20 22:28:12 +02:00
Miriam Baglioni 70b78a40c7 removed file from different propagation 2023-10-20 15:50:49 +02:00
Miriam Baglioni f206ff42d6 modified code to use the API. Removed unneeded parameters. Rewrote the code to exploit the parallel stream on the entity types 2023-10-20 15:49:41 +02:00
Miriam Baglioni 34358afe75 modified resource file, workflow and default-config. Added 3g of memory overhead and specified the shuffle partition in the wf configuration. Removed the multiple instantiation in the wf because of the different implementation of the Spark job 2023-10-20 15:48:27 +02:00
Miriam Baglioni 18bfff8af3 adding test classes and modifying test for bulktag 2023-10-20 15:47:03 +02:00
Miriam Baglioni 69dac91659 adding the new code to use the API instead of the Information Service 2023-10-20 15:45:52 +02:00
Serafeim Chatzopoulos aad5982bf1 Change the description of the workflow 2023-10-20 12:48:21 +03:00
Miriam Baglioni a9ede1e989 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2023-10-20 10:14:43 +02:00
Miriam Baglioni a4214ced1e fixing issue on propagation organization. added --config to workflow definition. added oozie_app to community project 2023-10-20 10:14:20 +02:00
Serafeim Chatzopoulos 6b19dcee80 Add actionset creation for pubmed affiliations 2023-10-19 19:58:25 +03:00
Claudio Atzori 2b9d0416ec [graph raw] URL Validator to accept double slashes 2023-10-19 16:26:37 +02:00
Claudio Atzori b0fed1725e avoid NPEs 2023-10-19 12:13:45 +02:00
Miriam Baglioni f1b898c6b4 merging with branch beta 2023-10-19 09:04:35 +02:00
Claudio Atzori a24178cb93 Merge branch 'beta' into resource_types 2023-10-17 11:09:50 +02:00
Claudio Atzori d28b7085f6 more NPE checks 2023-10-17 11:09:31 +02:00
Claudio Atzori 3b1c8b9fbd Merge pull request 'FIX: GroupEntitiesSparkJob deletes whole graph outputPath instead of its temporary folder' (#351) from fix_consistency_missing_rels into beta
Reviewed-on: #351
2023-10-17 08:40:23 +02:00
Claudio Atzori 1d594eaffd Merge branch 'beta' into fix_consistency_missing_rels 2023-10-17 08:40:07 +02:00
Giambattista Bloisi 0e44b037a5 FIX: GroupEntitiesSparkJob deletes whole graph outputPath instead of its temporary folder 2023-10-17 07:54:01 +02:00
Claudio Atzori 6dfcd0c9a2 [raw graph] mapping original resource types 2023-10-16 12:57:18 +02:00
Claudio Atzori 39d24d5469 Merge branch 'beta' into resource_types 2023-10-16 11:56:38 +02:00
Claudio Atzori 389e3fcc59 Merge pull request '[dedup] use common `saveParquet` and `save` methods to ensure outputs are compressed' (#349) from fix_dedup_not_compressed into beta
Reviewed-on: #349
2023-10-16 11:56:18 +02:00
Sandro La Bruzzo a5a89a702f new spark parameter updated 2023-10-16 11:46:12 +02:00