Oalex #13

Merged
mkallipo merged 15 commits from openaire-workflow-ready_2 into openaire-workflow-ready 2024-12-09 18:51:24 +01:00

15 Commits

Author SHA1 Message Date
Miriam Baglioni ad691c28c2 [oalex] change to add a thread to monitor the number of operations done by affro up to a certain point 2024-12-06 10:19:53 +01:00
Miriam Baglioni 2806511e02 [oalex] change collect_list to collect_set so that each match appears only once 2024-12-05 21:26:08 +01:00
Miriam Baglioni 0043e4051f [oalex] renaming 2024-12-05 18:44:06 +01:00
Miriam Baglioni a59d0ce9fc [oalex] avoid redefinition of explode function 2024-12-05 18:41:16 +01:00
Miriam Baglioni e2f8007433 [oalex] added fix 2024-12-05 16:50:10 +01:00
Miriam Baglioni f8479083f2 [oalex] passing the schema to avoid a change in the confidence type 2024-12-05 16:44:17 +01:00
Miriam Baglioni 9440f863c9 [oalex] changed implementation to pass through the RDD to avoid calling the UDF 2024-12-05 16:36:38 +01:00
Miriam Baglioni f78456288c [oalex] fix issue 2024-12-05 12:54:10 +01:00
Miriam Baglioni 997f2e492f [oalex] change the call of the function in the dataframe 2024-12-05 12:35:59 +01:00
Miriam Baglioni 982a1b0b9f [oalex] change the call of the function in the dataframe 2024-12-05 12:21:21 +01:00
Miriam Baglioni 4fe3d31ed5 [oalex] register the UDF oalex_affro and the schema of the output to be used in the dataframe by pyspark 2024-12-05 12:18:45 +01:00
Miriam Baglioni efa4db4e52 [oalex] execute affRo on distinct affiliation_strings 2024-12-05 12:02:40 +01:00
Miriam Baglioni ea2e27a9f4 [oalex] fix python syntax errors 2024-12-05 11:22:10 +01:00
Miriam Baglioni e33bf4ef14 [oalex] proposal to increase the parallelization 2024-12-05 10:39:00 +01:00
Miriam Baglioni f4704aef4d [oalex] proposal to increase the parallelization 2024-12-05 10:27:32 +01:00
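
Note for readers of this commit list: the idea behind efa4db4e52 (run the matcher only on distinct affiliation strings) and 2806511e02 (collect_set, so a match is kept once) can be sketched in plain Python. This is only an illustration, not the code in this PR: `match_affiliation`, `match_records`, and the sample records are hypothetical stand-ins, and the real workflow runs on PySpark DataFrames.

```python
from collections import defaultdict

calls = 0  # counts how many times the stand-in matcher runs


def match_affiliation(text):
    """Hypothetical stand-in for the affiliation matcher.

    Returns a list of (organization_id, confidence) tuples.
    """
    global calls
    calls += 1
    return [("org:" + text.lower().replace(" ", "_"), 1.0)]


def match_records(records):
    """records: list of (paper_id, affiliation_string) pairs."""
    # 1. Run the (expensive) matcher once per distinct affiliation string.
    distinct = {aff for _, aff in records}
    cache = {aff: match_affiliation(aff) for aff in distinct}

    # 2. Join the results back to the papers and keep each match once per
    #    paper by using a set (analogous to collect_set, not collect_list).
    by_paper = defaultdict(set)
    for paper_id, aff in records:
        by_paper[paper_id].update(cache[aff])
    return {pid: sorted(matches) for pid, matches in by_paper.items()}


records = [
    ("p1", "Univ A"),
    ("p1", "Univ A"),  # repeated string within the same paper
    ("p2", "Univ A"),
    ("p2", "Univ B"),
]
result = match_records(records)
```

With four input rows but only two distinct strings, the stand-in matcher runs twice, and the repeated "Univ A" row for p1 yields a single match rather than two.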