Commit Graph

235 Commits

Author SHA1 Message Date
Miriam Baglioni 8dae10b442 - 2024-02-14 14:57:08 +01:00
Miriam Baglioni 83bb97be83 [Tagging Projects and Datasource] added test to check datasource tagging. Fixed issue 2024-02-14 11:23:47 +01:00
Miriam Baglioni 6e1f383e4a [Tagging Projects and Datasource] first extention of bulktagging to add the context to projects and datasource 2024-02-13 16:37:14 +01:00
Miriam Baglioni 3f7d262a4e mergin with branch beta 2024-02-13 14:05:58 +01:00
Miriam Baglioni a5995ab557 [orcid-enrichment] change the value of parameters. 2024-01-29 18:19:48 +01:00
Miriam Baglioni e0ec800d7e [BulkTagging] extend the definition of the pathMap to include also actions that should be performed of the value extracted from the result befor applying the constraint 2024-01-23 11:34:53 +01:00
Claudio Atzori c6b3401596 increased shuffle partitions for publications in the country propagation workflow 2024-01-19 10:15:39 +01:00
Miriam Baglioni bcc0a13981 [enrichment single step] adding <end> element in wf definition 2024-01-18 17:39:14 +01:00
Miriam Baglioni 6af536541d [enrichment single step] moving parameter file in correct location 2024-01-18 15:35:40 +01:00
Miriam Baglioni a12a3eb143 - 2024-01-18 15:18:10 +01:00
Miriam Baglioni 82e9e262ee [enrichment single step] remove parameter from execution 2024-01-17 17:38:03 +01:00
Miriam Baglioni 67ce2d54be [enrichment single step] refactoring to fix issues in disappeared result type 2024-01-17 16:50:00 +01:00
Miriam Baglioni 59eaccbd87 [enrichment single step] refactoring to fix issue in disappeared result type 2024-01-15 17:49:54 +01:00
Miriam Baglioni cb14470ba6 added properties file in the forlder for the workflow of result to organization from inst repo propagation. Changes the path in the classes implementing the propagation 2023-12-22 14:50:05 +01:00
Miriam Baglioni 9f966b59d4 added properties file in the forlder for the workflow of result to community from semrel propagation. Changes the path in the classes implementing the propagation 2023-12-22 14:11:47 +01:00
Miriam Baglioni 2f3b5a133d added properties file in the forlder for the workflow of result to community from organization propagation. Changes the path in the classes implementing the propagation 2023-12-22 13:56:40 +01:00
Miriam Baglioni 2f7b9ad815 added properties file in the forlder for the workflow of project to result propagation. Changes the path in the classes implementing the propagation 2023-12-22 11:46:15 +01:00
Miriam Baglioni f2352e8a78 changed in the classes the path for the property files for the propagation of community from project 2023-12-22 11:43:34 +01:00
Miriam Baglioni 009730b3d1 added properties file in the forlder for the workflow of orcid propagation. Changes the path in the classes implementing the propagationchanged the path to the parameter file in the class for entitytoorganization propagation 2023-12-22 11:42:09 +01:00
Miriam Baglioni 89f269c7f4 changed the path to the parameter file in the class for entitytoorganization propagation 2023-12-22 11:37:50 +01:00
Miriam Baglioni b06aea0adf adding the bulkTag parameter file in the folder for the oozie workflow for bulkTagging. Changes the path in the class 2023-12-22 11:35:37 +01:00
Miriam Baglioni 3afd4aa57b adjustments for country propagation 2023-12-22 11:27:30 +01:00
Miriam Baglioni 4740c808f7 - 2023-12-20 14:26:54 +01:00
Miriam Baglioni d410ea8a41 added needed parameter 2023-12-19 12:15:01 +01:00
Miriam Baglioni 3eca5d2e1c - 2023-12-18 09:55:27 +01:00
Miriam Baglioni 0d8e496a63 - 2023-12-15 12:16:43 +01:00
Miriam Baglioni 8752d275fa removed not needed parameter 2023-12-09 15:24:45 +01:00
Miriam Baglioni d4eedada71 adjusting workflow definition 2023-12-09 15:20:11 +01:00
Miriam Baglioni 616622d2bb first version of the workflow single step 2023-12-07 09:59:52 +01:00
Claudio Atzori c5b7253130 [community_organization propagation] fixed workflow parameters 2023-12-05 09:13:33 +01:00
Claudio Atzori 3c3bdb8318 [bulktagging] fixed workflow parameters 2023-12-05 09:08:48 +01:00
Miriam Baglioni 8eb70e6657 refactoring 2023-11-27 15:13:15 +01:00
Miriam Baglioni 48e0427a23 changed the parameter from production to baseURL. Fixed issue in tagging configuration 2023-11-27 15:10:27 +01:00
Miriam Baglioni eaf0a702de - 2023-11-14 14:53:34 +01:00
Miriam Baglioni 7b1e34f159 refactoring 2023-11-03 15:30:01 +01:00
Miriam Baglioni 638ad9e74f changing test for new implementation 2023-11-03 15:06:50 +01:00
Miriam Baglioni edcb17ca98 refactoring and test 2023-11-03 13:01:14 +01:00
Miriam Baglioni 937ff6a7c7 - 2023-10-31 15:56:08 +01:00
Miriam Baglioni a737dd47b6 removed not needed test class 2023-10-31 15:54:49 +01:00
Miriam Baglioni c80b768af0 test for project propagation 2023-10-31 15:49:42 +01:00
Miriam Baglioni e9a20fc8f6 mergin with branch beta 2023-10-31 14:36:03 +01:00
Miriam Baglioni a4214ced1e fixing issue on propagation organization. added --config to workflow definition. added oozie_app to communtiy project 2023-10-20 10:14:20 +02:00
Claudio Atzori b0fed1725e avoid NPEs 2023-10-19 12:13:45 +02:00
Sandro La Bruzzo a5a89a702f new spark parrameter updated 2023-10-16 11:46:12 +02:00
Miriam Baglioni 159388f9c2 testing and fix some issues 2023-10-16 11:26:07 +02:00
Miriam Baglioni 89184d5b4f used the API instead of the IS for bulktagging and propagation for community through organization. Added a new propagation step for communities through projects. Still using the API and not the IS 2023-10-11 18:17:35 +02:00
Miriam Baglioni a3d01ccb24 refactoring 2023-10-09 14:52:17 +02:00
Miriam Baglioni 3d6be20989 changes to use the API instead of the IS the get the information for the communities to be used during bulktagging and context propagation 2023-10-09 14:26:33 +02:00
Giambattista Bloisi e64c2854a3 Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface
JsonPath cache contention fixed by using a ConcurrentHashMap
Blacklist filtering performance improvement
Minor performance improvements when evaluating similarity
Sorting in clustered elements is deterministic (by ordering and identity field, instead of ordering field only)
2023-07-24 15:36:24 +02:00
Giambattista Bloisi 54c1eacef1 SparkJobTest was failing because testing workingdir was not cleaned up after eact test 2023-07-21 10:42:24 +02:00