Commit Graph

32 Commits

Author SHA1 Message Date
Giambattista Bloisi 9e69ded5ef Copy relations in ORCID enrichment DAG 2024-11-06 16:07:34 +01:00
Giambattista Bloisi 13ac9767c6 Orcid propagation step 2024-11-06 15:46:24 +01:00
Giambattista Bloisi c0788fcd10 Add resolverelation step 2024-10-29 14:49:26 +01:00
Giambattista Bloisi 004be2e97f Add resolverelation step 2024-10-29 14:43:35 +01:00
Giambattista Bloisi 9058dbb957 Clean DAG 2024-10-28 21:23:29 +01:00
Giambattista Bloisi b01331d4d0 Clean DAG 2024-10-28 21:11:37 +01:00
Giambattista Bloisi 7af06fbda5 Clean DAG 2024-10-28 21:10:38 +01:00
Giambattista Bloisi 6c25db9ac2 Clean DAG 2024-10-28 21:05:35 +01:00
Giambattista Bloisi b3d7dda0c1 DAG to build the graph from a delta 2024-10-28 17:07:46 +01:00
Giambattista Bloisi 1bd836b88a Provide paths as dag configuration parameters 2024-10-22 21:15:16 +02:00
Giambattista Bloisi 15ba3cf202 Provide paths as dag configuration parameters 2024-10-22 10:18:03 +02:00
Giambattista Bloisi 73e78d6877 Add workflow with all graph construction steps 2024-10-21 21:36:20 +02:00
Giambattista Bloisi aae37058f7 Increase memory 2024-10-21 20:30:31 +02:00
Giambattista Bloisi 131f6e5592 enable dynamic allocation 2024-10-21 20:23:31 +02:00
Giambattista Bloisi df46c8c65f Added ORCID enrichment workflows 2024-10-21 18:55:41 +02:00
Giambattista Bloisi 034a01542a Implement consistency workflow 2024-10-21 15:33:55 +02:00
Giambattista Bloisi c6fbfd3f0a Remove numpartitions argument where not needed 2024-10-21 14:30:40 +02:00
Giambattista Bloisi ae89274ce4 implement whole scan pipeline 2024-10-21 14:10:28 +02:00
Giambattista Bloisi 0a2956d81f reduce executor cores 2024-10-19 17:59:35 +02:00
Giambattista Bloisi 48f688cda9 add deps jar 2024-10-19 11:13:21 +02:00
Giambattista Bloisi c5f4263061 update spark-version 2024-10-19 11:09:24 +02:00
Giambattista Bloisi ba3f351736 print existing files 2024-10-19 10:26:18 +02:00
Giambattista Bloisi 448bb924ab add test dedup task 2024-10-19 00:18:00 +02:00
Giambattista Bloisi bf7c9e2dce revert some changes 2024-10-18 17:16:37 +02:00
Giambattista Bloisi 0fcabed2ae change dag name 2024-10-18 16:58:42 +02:00
Giambattista Bloisi c3ba29e4c5 Add dagutils 2024-10-18 16:53:14 +02:00
Giambattista Bloisi 412e008df7 Add untar task 2024-10-18 16:42:54 +02:00
Sandro La Bruzzo d1afcd4395 fixed import 2024-10-16 14:08:00 +02:00
Sandro La Bruzzo dcd2efd3b4 added workflow test 2024-10-16 13:56:50 +02:00
Sandro La Bruzzo 6b555b8f6e added workflow test 2024-10-16 13:56:36 +02:00
Sandro La Bruzzo b8bf21f8e5 fixed import 2024-10-16 13:51:49 +02:00
Sandro La Bruzzo 07ce192207 added workflow test 2024-10-16 13:38:26 +02:00