310 Commits

Author SHA1 Message Date
Miriam Baglioni 849b75593e resolved conflicts 2024-12-20 09:21:22 +01:00
Miriam Baglioni 2d45f125a7 [bulktag subcommunities] refactoring and addition of new properties 2024-12-20 09:06:55 +01:00
Giambattista Bloisi 85dced4ffb Implement new jobs for collecting data from latest graph on hive and deltas from oaf mdstores (datacite and crossref); Optimized CopyHdfsOafSparkApplication 2024-12-19 14:37:48 +01:00
Claudio Atzori e4b814b3f1 code formatting 2024-12-06 13:58:39 +01:00
Claudio Atzori 9e6b1f2f24 Merge pull request 'Communities_patents' (#514) from Communities_patents into beta (Reviewed-on: D-Net/dnet-hadoop#514) 2024-12-06 13:50:43 +01:00
Miriam Baglioni 666155bafa [communityfromsemrelpropagation] changed resource to have deletedbyinference = false. 2024-12-06 12:26:41 +01:00
Miriam Baglioni ee84db7a6a [communityfromsemrelpropagation] added filtering to remove the deletedbyinference and invisible results 2024-12-06 12:20:13 +01:00
Giambattista Bloisi fed13e083e Fix: do not import joda; formatting 2024-12-05 15:21:32 +01:00
Miriam Baglioni ca2d480df3 [BulkTagging] added fix to consider when the set of constraints for the datasource is empty. Added check for remove constraints and advanced constraints to verify if the constraints list is empty and in that case do nothing 2024-11-26 15:56:52 +01:00
Miriam Baglioni 189a7c255a [patents] added test and resources 2024-11-25 16:52:13 +01:00
Miriam Baglioni 821700299a [patents] added test and resources 2024-11-22 17:21:58 +01:00
Miriam Baglioni 2570023590 [Subcommunities] modified bulktagging workflow to include the new parameters 2024-11-21 14:47:17 +01:00
Miriam Baglioni c0729ac279 [Subcommunities] added remapping to master datasource 2024-11-21 14:36:26 +01:00
Miriam Baglioni ab96983647 [Subcommunities] added remapping to representative organization 2024-11-21 12:35:05 +01:00
Miriam Baglioni 0656ed568d [Subcommunities] remove not needed methods used to create datasourceCommunityMap 2024-11-21 11:05:58 +01:00
Miriam Baglioni ba9f1982b3 [Subcommunities] used the two new access points to directly get the organizationCommunityMap and the datasourceCommunityMap 2024-11-21 11:04:58 +01:00
Miriam Baglioni 9ee061ee90 [Subcommunities] added to the list of the communities also the sub community identifiers 2024-11-21 11:02:52 +01:00
Miriam Baglioni e5b04e61ff [CommunityPatents] extends the community propagation to also consider the results of type patent linked with an isrelatedto semantics 2024-11-21 10:20:12 +01:00
Miriam Baglioni 896de42598 [CommunityAPI] use of new access point to directly get the organizationCommunityMap and the datasourceCommunityMap for all the communities and subcommunities. To be changed in the propagation code when implemented in the APIs 2024-11-20 17:44:33 +01:00
Miriam Baglioni 3081cad1d3 [CommunityAPI] refactoring 2024-11-20 14:03:59 +01:00
Miriam Baglioni 6beb94adee [SubCommunity] Extension of the Utils methods to add also the associations between the subcommunities and organization/project/datasources 2024-11-20 10:59:49 +01:00
Miriam Baglioni 9dbcf19efb [SubCommunity] Extension of communityApis to add also the associations between the subcommunities and organization/project/datasources 2024-11-20 09:16:33 +01:00
Miriam Baglioni cea2de2c37 [SubCommunity] Extension of CommunityAPIs for bulk tagging 2024-11-19 14:50:42 +01:00
Miriam Baglioni 69aee609ef [bulktag] align type to community api 2024-10-29 15:53:04 +01:00
Claudio Atzori e4abe55988 merged person_through_the_graph & code formatting 2024-10-28 11:01:49 +01:00
Miriam Baglioni 1fce7d5a0f [Person] remove the isolated nodes from the person set 2024-10-25 10:05:17 +02:00
Miriam Baglioni 32f444984e [person] - 2024-10-24 17:51:42 +02:00
Miriam Baglioni a7699558ed [person] - 2024-10-24 16:15:12 +02:00
Miriam Baglioni 01679c935a [person] added test class to be implemented 2024-10-24 15:27:06 +02:00
Miriam Baglioni c773421cc7 [person] added new substep in propagation workflow main 2024-10-24 14:44:13 +02:00
Miriam Baglioni cf07ed9058 [person] refactoring 2024-10-24 14:35:14 +02:00
Miriam Baglioni c921cf7ee0 [personEntity] removed the deletedbyinference results (not indexed, but still in the graph). Changed the writing mode: append instead of overwrite 2024-10-24 09:57:20 +02:00
Giambattista Bloisi 0e34b0ece1 Fix imports: point them from the main distribution packages 2024-10-23 14:01:52 +02:00
Claudio Atzori 9486e21a44 copy or process the person records throughout the graph pipeline 2024-07-30 14:25:31 +02:00
Miriam Baglioni 9d27910144 [BulkTag]added tagging for the organization relevant for the community. Added test. Changed the tagging variables. 2024-07-16 13:48:48 +02:00
Miriam Baglioni 1477406ecc [bulkTag] fixed issue that made project disappear in graph_10_enriched 2024-06-06 10:45:41 +02:00
Claudio Atzori 11bd89e132 [enrichment] use sparkExecutorMemory to define also the memoryOverhead 2024-05-01 08:32:59 +02:00
Giambattista Bloisi 1878199dae Miscellaneous fixes: in Merge By ID pick by preference those records coming from delegated Authorities; fix various tests; close spark session in SparkCreateSimRels 2024-04-24 08:12:45 +02:00
Sandro La Bruzzo b72c3139e2 updated Ignore annotation that is deprecated to Disabled 2024-04-19 14:52:40 +02:00
Claudio Atzori 75551ad4ec code formatting 2024-03-26 14:53:16 +01:00
Miriam Baglioni 94b931f7bd [BulkTagging - tag datasource and projects]merging with branch beta 2024-03-26 14:25:19 +01:00
Miriam Baglioni 3b209261f2 [BulkTagging - tag datasource and projects]merging with branch beta 2024-03-26 14:21:27 +01:00
Claudio Atzori ef52128c55 included new stats* workflows in parent pom list of modules, code formatting 2024-03-26 10:42:10 +01:00
Claudio Atzori 91b61687fa Merge branch 'beta' into bulkTaggingPathMapExtention 2024-03-25 15:50:18 +01:00
Giambattista Bloisi 664a381d31 Unify merge logic of entities in MergeUtils.class 2024-03-18 16:04:49 +01:00
Sandro La Bruzzo 7d806a434c formatted code 2024-02-28 09:31:58 +01:00
Miriam Baglioni 43da7e1191 [Tagging Projects and Datasource] changed the way the pathMap parameter is passed. It was too long and was truncated 2024-02-19 16:12:59 +01:00
Miriam Baglioni 8dae10b442 - 2024-02-14 14:57:08 +01:00
Miriam Baglioni 83bb97be83 [Tagging Projects and Datasource] added test to check datasource tagging. Fixed issue 2024-02-14 11:23:47 +01:00
Miriam Baglioni 6e1f383e4a [Tagging Projects and Datasource] first extension of bulktagging to add the context to projects and datasource 2024-02-13 16:37:14 +01:00