Miriam Baglioni
|
ad691c28c2
|
[oalex] change to add a thread to monitor the number of operations done by affro up to a certain point
|
2024-12-06 10:19:53 +01:00 |
Miriam Baglioni
|
2806511e02
|
[oalex] change collect_list to collect_set so that the same match appears only once
|
2024-12-05 21:26:08 +01:00 |
Miriam Baglioni
|
0043e4051f
|
[oalex] renaming
|
2024-12-05 18:44:06 +01:00 |
Miriam Baglioni
|
a59d0ce9fc
|
[oalex] avoid redefinition of explode function
|
2024-12-05 18:41:16 +01:00 |
Miriam Baglioni
|
e2f8007433
|
[oalex] added fix
|
2024-12-05 16:50:10 +01:00 |
Miriam Baglioni
|
f8479083f2
|
[oalex] passing the schema to avoid changes in the confidence type
|
2024-12-05 16:44:17 +01:00 |
Miriam Baglioni
|
9440f863c9
|
[oalex] changed implementation to pass through RDD to avoid calling the UDF function
|
2024-12-05 16:36:38 +01:00 |
Miriam Baglioni
|
f78456288c
|
[oalex] fix issue
|
2024-12-05 12:54:10 +01:00 |
Miriam Baglioni
|
997f2e492f
|
[oalex] change the call of the function in the dataframe
|
2024-12-05 12:35:59 +01:00 |
Miriam Baglioni
|
982a1b0b9f
|
[oalex] change the call of the function in the dataframe
|
2024-12-05 12:21:21 +01:00 |
Miriam Baglioni
|
4fe3d31ed5
|
[oalex] register the UDF oalex_affro and the schema of the output to be used in the dataframe by pyspark
|
2024-12-05 12:18:45 +01:00 |
Miriam Baglioni
|
efa4db4e52
|
[oalex] execute affRo on distinct affiliation_strings
|
2024-12-05 12:02:40 +01:00 |
Miriam Baglioni
|
ea2e27a9f4
|
[oalex] fix python syntax errors
|
2024-12-05 11:22:10 +01:00 |
Miriam Baglioni
|
e33bf4ef14
|
[oalex] proposal to increase the parallelization
|
2024-12-05 10:39:00 +01:00 |
Miriam Baglioni
|
f4704aef4d
|
[oalex] proposal to increase the parallelization
|
2024-12-05 10:27:32 +01:00 |