Claudio Atzori
2c77638bf5
Merge branch 'beta' into cleaning_8898
2023-11-22 14:00:10 +01:00
Claudio Atzori
836d7ec724
Merge pull request 'Add Pubmed affiliations (inferred by BIP) as actionsets' (#353) from 9117_pubmed_affiliations into beta
Reviewed-on: D-Net/dnet-hadoop#353
2023-11-22 13:53:07 +01:00
Claudio Atzori
745039ad5b
Merge branch 'beta' into 9117_pubmed_affiliations
2023-11-22 13:52:53 +01:00
Claudio Atzori
008fdf9d8a
Merge pull request 'URL Validator to accept double slashes' (#352) from url_validation into beta
Reviewed-on: D-Net/dnet-hadoop#352
2023-11-22 13:52:08 +01:00
Claudio Atzori
11a1207f9c
[graph cleaning] applying coar based vocabularies in bulk
2023-11-22 12:22:14 +01:00
dimitrispie
a94a54a2d0
Changes for tables and creation of the new indicator indi_is_result_accessible
- Drop table statements for all tables to avoid duplicates in case of wf rerun
- Add pdfsaggregated step to create the indi_is_result_accessible table. This step is executed on the new impala cluster only, since the pdfaggregation_i is updated on this cluster.
2023-11-15 14:32:18 +02:00
Miriam Baglioni
eaf0a702de
-
2023-11-14 14:53:34 +01:00
Sandro La Bruzzo
6ce36b3e41
Implemented ORCID Workflow on DHP-Aggregation for retrieving ORCID DUMP and generating tables
2023-11-14 12:04:29 +01:00
dimitrispie
d524e30866
Changes to actionsets
Resolve comments from D-Net/dnet-hadoop#355
2023-11-14 09:46:52 +02:00
Miriam Baglioni
5bc97615d5
-
2023-11-03 15:35:10 +01:00
Miriam Baglioni
7b1e34f159
refactoring
2023-11-03 15:30:01 +01:00
Miriam Baglioni
638ad9e74f
changing test for new implementation
2023-11-03 15:06:50 +01:00
Miriam Baglioni
edcb17ca98
refactoring and test
2023-11-03 13:01:14 +01:00
Miriam Baglioni
937ff6a7c7
-
2023-10-31 15:56:08 +01:00
Miriam Baglioni
a737dd47b6
removed not needed test class
2023-10-31 15:54:49 +01:00
Miriam Baglioni
c80b768af0
test for project propagation
2023-10-31 15:49:42 +01:00
Miriam Baglioni
e9a20fc8f6
merging with branch beta
2023-10-31 14:36:03 +01:00
Claudio Atzori
dde2fec035
[graph cleaning] cleanup
2023-10-31 14:35:33 +01:00
Claudio Atzori
262d7c581b
[graph cleaning] implemented further suggestions from https://support.openaire.eu/issues/8898
2023-10-31 14:34:10 +01:00
Serafeim Chatzopoulos
2090003ea9
Adjust tests to new WF input params
2023-10-26 13:47:06 -07:00
Serafeim Chatzopoulos
a82aaf57b2
Renaming input param for crossref input path
2023-10-25 12:05:02 -07:00
Claudio Atzori
b3a61ea955
Merge branch 'beta' into url_validation
2023-10-25 14:22:56 +02:00
dimitrispie
89c4dfbaf4
StatsDB workflow to export actionsets about OA routes, diamond, and publicly-funded
A new Oozie workflow that reads from the stats DB and produces a new actionSet for updating results with:
- green_oa = {true, false}
- openAccesColor = {gold, hybrid, bronze}
- in_diamond_journal = {true, false}
- publicly_funded = {true, false}
Inputs:
- outputPath
- statsDB
2023-10-24 09:48:23 +03:00
Claudio Atzori
a870aa2b09
depending on dhp-schemas:3.17.2
2023-10-20 22:28:39 +02:00
Claudio Atzori
7fc621cdec
added defaults to the graph resolution workflow config-default.xml
2023-10-20 22:28:12 +02:00
Serafeim Chatzopoulos
aad5982bf1
Change the description of the workflow
2023-10-20 12:48:21 +03:00
Miriam Baglioni
a4214ced1e
fixing issue on propagation organization. added --config to workflow definition. added oozie_app to community project
2023-10-20 10:14:20 +02:00
Serafeim Chatzopoulos
6b19dcee80
Add actionset creation for pubmed affiliations
2023-10-19 19:58:25 +03:00
Claudio Atzori
2b9d0416ec
[graph raw] URL Validator to accept double slashes
2023-10-19 16:26:37 +02:00
Claudio Atzori
b0fed1725e
avoid NPEs
2023-10-19 12:13:45 +02:00
Miriam Baglioni
f1b898c6b4
merging with branch beta
2023-10-19 09:04:35 +02:00
Claudio Atzori
a24178cb93
Merge branch 'beta' into resource_types
2023-10-17 11:09:50 +02:00
Claudio Atzori
d28b7085f6
more NPE checks
2023-10-17 11:09:31 +02:00
Claudio Atzori
3b1c8b9fbd
Merge pull request 'FIX: GroupEntitiesSparkJob deletes whole graph outputPath instead of its temporary folder' (#351) from fix_consistency_missing_rels into beta
Reviewed-on: D-Net/dnet-hadoop#351
2023-10-17 08:40:23 +02:00
Claudio Atzori
1d594eaffd
Merge branch 'beta' into fix_consistency_missing_rels
2023-10-17 08:40:07 +02:00
Giambattista Bloisi
0e44b037a5
FIX: GroupEntitiesSparkJob deletes whole graph outputPath instead of its temporary folder
2023-10-17 07:54:01 +02:00
Claudio Atzori
6dfcd0c9a2
[raw graph] mapping original resource types
2023-10-16 12:57:18 +02:00
Claudio Atzori
39d24d5469
Merge branch 'beta' into resource_types
2023-10-16 11:56:38 +02:00
Claudio Atzori
389e3fcc59
Merge pull request '[dedup] use common `saveParquet` and `save` methods to ensure outputs are compressed' (#349) from fix_dedup_not_compressed into beta
Reviewed-on: D-Net/dnet-hadoop#349
2023-10-16 11:56:18 +02:00
Sandro La Bruzzo
a5a89a702f
new spark parameter updated
2023-10-16 11:46:12 +02:00
Miriam Baglioni
159388f9c2
testing and fix some issues
2023-10-16 11:26:07 +02:00
Claudio Atzori
03670bb9ce
[dedup] use common saveParquet and save methods to ensure outputs are compressed
2023-10-16 10:55:47 +02:00
Claudio Atzori
54fbf09ac6
[raw graph] WIP: mapping original resource types
2023-10-16 08:57:47 +02:00
Claudio Atzori
6cf64d5d8b
[SWH] renamed 'Software Heritage Identifier' to 'Software Hash Identifier'
2023-10-13 10:09:26 +02:00
Claudio Atzori
76447958bb
cleanup & docs
2023-10-12 12:23:20 +02:00
Claudio Atzori
1902728f7e
Merge pull request '[ActionManagerFramework] documentation' (#347) from actionset_docs into beta
Reviewed-on: D-Net/dnet-hadoop#347
2023-10-12 10:07:25 +02:00
Claudio Atzori
dda602fff7
[AMF] docs
2023-10-12 10:05:46 +02:00
Claudio Atzori
05ee7d8b09
[graph cleaning] avoid NPEs
2023-10-12 09:13:42 +02:00
Miriam Baglioni
8e9493fad9
merging with branch beta
2023-10-11 18:18:09 +02:00
Miriam Baglioni
89184d5b4f
Used the API instead of the IS for bulk tagging and for community propagation through organizations. Added a new propagation step for communities through projects, which also uses the API rather than the IS.
2023-10-11 18:17:35 +02:00