5160 Commits

Author SHA1 Message Date
Miriam Baglioni 5c5a195e97 refactoring and fixing issue on property name 2023-10-23 11:26:17 +02:00
Claudio Atzori a870aa2b09 depending on dhp-schemas:3.17.2 2023-10-20 22:28:39 +02:00
Claudio Atzori 7fc621cdec added defaults to the graph resolution workflow config-default.xml 2023-10-20 22:28:12 +02:00
Miriam Baglioni 70b78a40c7 removed file from different propagation 2023-10-20 15:50:49 +02:00
Miriam Baglioni f206ff42d6 modified code to use the API. Removed unneeded parameters. Rewrote the code to exploit the parallel stream on the entity types 2023-10-20 15:49:41 +02:00
Miriam Baglioni 34358afe75 modified resource file, workflow and default-config. Added 3g of memory overhead and specified the shuffle partition in the wf configuration. Removed the multiple instantiation in the wf because of the different implementation of the spark job 2023-10-20 15:48:27 +02:00
Miriam Baglioni 18bfff8af3 adding test classes and modifying test for bulktag 2023-10-20 15:47:03 +02:00
Miriam Baglioni 69dac91659 adding the new code to use the API instead of the Information Service 2023-10-20 15:45:52 +02:00
Serafeim Chatzopoulos aad5982bf1 Change the description of the workflow 2023-10-20 12:48:21 +03:00
Miriam Baglioni a9ede1e989 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2023-10-20 10:14:43 +02:00
Miriam Baglioni a4214ced1e fixing issue on propagation organization. added --config to workflow definition. added oozie_app to community project 2023-10-20 10:14:20 +02:00
Serafeim Chatzopoulos 6b19dcee80 Add actionset creation for pubmed affiliations 2023-10-19 19:58:25 +03:00
Claudio Atzori 2b9d0416ec [graph raw] URL Validator to accept double slashes 2023-10-19 16:26:37 +02:00
Claudio Atzori b0fed1725e avoid NPEs 2023-10-19 12:13:45 +02:00
Miriam Baglioni f1b898c6b4 merging with branch beta 2023-10-19 09:04:35 +02:00
Claudio Atzori a24178cb93 Merge branch 'beta' into resource_types 2023-10-17 11:09:50 +02:00
Claudio Atzori d28b7085f6 more NPE checks 2023-10-17 11:09:31 +02:00
Claudio Atzori 3b1c8b9fbd Merge pull request 'FIX: GroupEntitiesSparkJob deletes whole graph outputPath instead of its temporary folder' (#351) from fix_consistency_missing_rels into beta
Reviewed-on: #351
2023-10-17 08:40:23 +02:00
Claudio Atzori 1d594eaffd Merge branch 'beta' into fix_consistency_missing_rels 2023-10-17 08:40:07 +02:00
Giambattista Bloisi 0e44b037a5 FIX: GroupEntitiesSparkJob deletes whole graph outputPath instead of its temporary folder 2023-10-17 07:54:01 +02:00
Claudio Atzori 6dfcd0c9a2 [raw graph] mapping original resource types 2023-10-16 12:57:18 +02:00
Claudio Atzori 39d24d5469 Merge branch 'beta' into resource_types 2023-10-16 11:56:38 +02:00
Claudio Atzori 389e3fcc59 Merge pull request '[dedup] use common `saveParquet` and `save` methods to ensure outputs are compressed' (#349) from fix_dedup_not_compressed into beta
Reviewed-on: #349
2023-10-16 11:56:18 +02:00
Sandro La Bruzzo a5a89a702f new spark parameter updated 2023-10-16 11:46:12 +02:00
Miriam Baglioni 159388f9c2 testing and fixing some issues 2023-10-16 11:26:07 +02:00
Claudio Atzori 03670bb9ce [dedup] use common saveParquet and save methods to ensure outputs are compressed 2023-10-16 10:55:47 +02:00
Claudio Atzori 54fbf09ac6 [raw graph] WIP: mapping original resource types 2023-10-16 08:57:47 +02:00
Claudio Atzori 6cf64d5d8b [SWH] renamed 'Software Heritage Identifier' to 'Software Hash Identifier' 2023-10-13 10:09:26 +02:00
Claudio Atzori 242d647146 cleanup & docs 2023-10-12 12:23:44 +02:00
Claudio Atzori 76447958bb cleanup & docs 2023-10-12 12:23:20 +02:00
Claudio Atzori af3ffad6c4 [AMF] docs 2023-10-12 10:07:52 +02:00
Claudio Atzori 1902728f7e Merge pull request '[ActionManagerFramework] documentation' (#347) from actionset_docs into beta
Reviewed-on: #347
2023-10-12 10:07:25 +02:00
Claudio Atzori dda602fff7 [AMF] docs 2023-10-12 10:05:46 +02:00
Claudio Atzori 05ee7d8b09 [graph cleaning] avoid NPEs 2023-10-12 09:13:42 +02:00
Miriam Baglioni 8e9493fad9 merging with branch beta 2023-10-11 18:18:09 +02:00
Miriam Baglioni 89184d5b4f used the API instead of the IS for bulktagging and propagation for community through organization. Added a new propagation step for communities through projects. Still using the API and not the IS 2023-10-11 18:17:35 +02:00
Claudio Atzori 554551682d [raw graph] adopting the new COAR based vocabularies for the resource typing 2023-10-11 16:09:19 +02:00
Claudio Atzori a460ebe215 [UnresolvedEntities] updated action name 2023-10-10 15:50:11 +02:00
Claudio Atzori ecea58a41c Merge pull request '[UnresolvedEntities] changing in the creation of the unresolved entities' (#346) from fos into beta
Reviewed-on: #346
2023-10-10 15:10:21 +02:00
Claudio Atzori 66064e99fe Merge branch 'beta' into fos 2023-10-10 15:07:21 +02:00
Miriam Baglioni a431b04814 leftover for the properties and removal of bipfinder 2023-10-10 12:53:57 +02:00
Claudio Atzori ed9282ef2a removed module dhp-stats-monitor-update 2023-10-10 09:52:03 +02:00
Miriam Baglioni 110ce4b40f extend the fos model to include the level4 and the scores for level3 and level4. removed bip indicators from the instance 2023-10-10 09:46:40 +02:00
Claudio Atzori 204404b0e3 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2023-10-10 09:36:13 +02:00
Claudio Atzori 9a98f408b3 code formatting 2023-10-10 09:36:11 +02:00
Claudio Atzori 4e6fccf4f6 Merge pull request 'Beta stats wf updated' (#332) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #332
2023-10-10 09:35:32 +02:00
Miriam Baglioni a3d01ccb24 refactoring 2023-10-09 14:52:17 +02:00
Miriam Baglioni 8448b9ebfb merging with branch beta 2023-10-09 14:27:23 +02:00
Miriam Baglioni 3d6be20989 changes to use the API instead of the IS to get the information for the communities to be used during bulktagging and context propagation 2023-10-09 14:26:33 +02:00
dimitrispie 17586f0ff8 Update step20-createMonitorDB.sql
Add result_orcid table to monitor dbs
2023-10-09 14:21:31 +03:00