Commit Graph

322 Commits

Author SHA1 Message Date
Claudio Atzori e4abe55988 merged person_through_the_graph & code formatting 2024-10-28 11:01:49 +01:00
Claudio Atzori 6fd50266f1 translate 'otherresearchproduct' into 'other' when setting the related record type 2024-10-28 10:42:46 +01:00
Claudio Atzori 32fa579b80 [graph provision] select the longest abstract 2024-10-28 10:03:02 +01:00
Miriam Baglioni 0fb6af5586 Updated main pom dependency against dhp-schema, from 8.0.1 to 9.0.0. The new fields included in the updated schema module are populated by the Solr JSON payload mapping, which also limits the number of authors serialised to 200. 2024-10-25 16:28:50 +02:00
Claudio Atzori e5df68772d [graph provision] fixed serialisation of the usage counts as measures in the XML records 2024-10-02 09:35:21 +02:00
Claudio Atzori 4f0463d779 [graph provision] person serialisation, limit the number of authorships and coauthorships before expanding the payloads 2024-09-24 14:54:34 +02:00
Claudio Atzori d1cadc77c9 [graph provision] person serialisation, limit the number of authorships and coauthorships before expanding the payloads 2024-09-24 10:57:20 +02:00
Claudio Atzori e0ff84baf0 [graph provision] person serialisation, limit the number of authorships and coauthorships before expanding the payloads 2024-09-23 10:29:46 +02:00
Claudio Atzori 5f86c93be6 [graph provision] person serialisation 2024-09-20 12:20:00 +02:00
Michele Artini bb9cee4f40 implementation of gtr2Publications plugin 2024-09-16 14:16:56 +02:00
Miriam Baglioni 45605f93ae merging with branch beta 2024-08-12 18:03:10 +02:00
Claudio Atzori 975d44cac7 [graph provision] added person to the provision workflow 2024-08-02 16:14:10 +02:00
Claudio Atzori a81c555fe6 [graph provision] include only FoS L1..L2 in the record serialization 2024-07-25 15:26:47 +02:00
Claudio Atzori 359b8ebda8 [graph provision] include only FoS L1..L2 in the record serialization 2024-07-25 15:22:29 +02:00
Claudio Atzori d4bf449e8c minor 2024-07-25 14:53:06 +02:00
Claudio Atzori 01958a3e07 [graph provision] added filter to exclude records marked with datainfo.deletedbyinference = true 2024-07-24 10:00:10 +02:00
Claudio Atzori 83327239de fixed pom definitions, bumped dependency version for the dhp-schema module, removed unnecessary dependencies 2024-07-17 11:58:48 +02:00
Claudio Atzori beb93cdfe9 [graph provision] expand the context info for each entity type 2024-07-16 11:43:48 +02:00
Claudio Atzori 38f8ed27fd [graph provision] log the Solr admin application operations for alias deletion and creation 2024-07-15 16:30:43 +02:00
Claudio Atzori 14539f9c8b [graph provision] publicFormat workflow parameter defined as optional 2024-06-28 14:55:18 +02:00
Claudio Atzori 1bc8c5d173 [graph provision] fixed serialization of the instancetypes 2024-06-28 14:54:28 +02:00
Claudio Atzori 1ccf01cdb8 Using the updated Solr JSON payload model classes 2024-06-28 12:38:07 +02:00
Claudio Atzori 1c30eacac2 updated index feeding procedure to exploit the collection aliases 2024-06-25 15:27:38 +02:00
Claudio Atzori 6055212f77 merged from the json_payload branch 2024-06-25 12:39:02 +02:00
Serafeim Chatzopoulos 9f6e16a03c Add support to create/update solr collection aliases 2024-06-20 16:03:15 +03:00
Claudio Atzori f70dc76b61 minor 2024-06-06 10:43:10 +02:00
Claudio Atzori da5c1e73a4 Merge pull request 'Irish oaipmh exporter' (#443) from irish-oaipmh-exporter into beta (Reviewed-on: #443) 2024-06-05 10:55:09 +02:00
Claudio Atzori 81090ad593 [IE OAIPMH] added oozie workflow, minor changes, code formatting 2024-06-05 10:03:33 +02:00
Claudio Atzori 0d5bdb2db0 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2024-05-27 11:59:02 +02:00
Sandro La Bruzzo 66c1ffc866 merged again from beta (I hope for the last time) 2024-05-22 11:02:46 +02:00
Claudio Atzori 834461ba26 [graph provision] fixed wf definition, revised serialization of the usage counts measures 2024-05-21 13:48:06 +02:00
Claudio Atzori 92f018d196 [graph provision] fixed path pointing to an intermediate data store in the working directory 2024-05-15 15:39:18 +02:00
Claudio Atzori 0611c81a2f [graph provision] using Qualifier.classNames to populate the corresponding fields in the JSON payload 2024-05-15 15:33:10 +02:00
Michele Artini 2b3b5fe9a1 oai finalization and test 2024-05-15 14:13:16 +02:00
Claudio Atzori 1efe7f7e39 [graph provision] upgrade to dhp-schema:6.1.2, included project.oamandatepublications in the JSON payload mapping, fixed serialisation of the usageCounts measures 2024-05-14 12:39:31 +02:00
Claudio Atzori 55f39f7850 [graph provision] adds the possibility to validate the XML records before storing them via the validateXML parameter 2024-05-09 14:06:04 +02:00
Claudio Atzori 39a2afe8b5 [graph provision] fixed XML serialization of the usage counts measures, renamed workflow actions to better reflect their role 2024-05-09 13:54:42 +02:00
Claudio Atzori 18aa323ee9 cleanup unused classes, adjustments in the oozie wf definition 2024-05-08 11:36:46 +02:00
Michele Artini c9a327bc50 refactoring of gzip method 2024-05-08 11:34:08 +02:00
Michele Artini e234848af8 oaf record: xpath for root 2024-05-08 10:00:53 +02:00
Claudio Atzori b4e3389432 fixed property mapping creating the RelatedEntity transient objects. spark cores & memory adjustments. Code formatting 2024-05-07 16:25:17 +02:00
Giambattista Bloisi 711048ceed PrepareRelationsJob rewritten to use Spark Dataframe API and Windowing functions 2024-05-07 15:44:33 +02:00
Michele Artini 70bf6ac415 oai exporter tests 2024-05-07 09:36:26 +02:00
Michele Artini aa40e53c19 oai exporter parameters 2024-05-07 08:01:19 +02:00
Michele Artini ed052a3476 job for the population of the oai database 2024-05-06 16:08:33 +02:00
Sandro La Bruzzo 0d628cd62b merged again from beta 2024-04-23 17:34:55 +02:00
Claudio Atzori 3a027e97a7 [graph indexing] sets spark memoryOverhead in the join operations to the same value used for the memory executor 2024-04-19 16:59:58 +02:00
Sandro La Bruzzo b84ad0c06e merged beta 2024-04-19 14:39:59 +02:00
Claudio Atzori ef52128c55 included new stats* workflows in parent pom list of modules, code formatting 2024-03-26 10:42:10 +01:00
Claudio Atzori bfba71a95c further follow up changes from integrating the mergeutils branch 2024-03-26 09:01:18 +01:00