Claudio Atzori
0c05abe50b
[graph indexing] sets spark memoryOverhead in the join operations to the same value used for the memory executor
2024-04-19 16:57:55 +02:00
Claudio Atzori
9b13c22e5d
[graph provision] retrieve all the context information by adding all=true to the requests issued to thr API
2024-01-23 15:36:08 +01:00
Claudio Atzori
f87f3a6483
[graph provision] updated param specification for the XML converter job
2024-01-23 08:54:37 +01:00
Claudio Atzori
1c6db320f4
[graph provision] obtain context info from the context API instead from the ISLookUp service
2024-01-22 15:53:17 +01:00
Claudio Atzori
321922772b
added serialization for the new fields imported for the Irish tender
2023-12-05 16:37:04 +01:00
Claudio Atzori
a72b9e96ac
expand the instance level fulltext in the XML records
2023-07-27 14:57:38 +02:00
Giambattista Bloisi
e64c2854a3
Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface
...
JsonPath cache contention fixed by using a ConcurrentHashMap
Blacklist filtering performance improvement
Minor performance improvements when evaluating similarity
Sorting in clustered elements is deterministic (by ordering and identity field, instead of ordering field only)
2023-07-24 15:36:24 +02:00
Claudio Atzori
63b8bbc015
[graph to Solr] using dedicated sparkExecutorCores, sparkExecutorMemory, sparkDriverMemory in convert_to_xml
2023-03-24 13:43:20 +01:00
Claudio Atzori
308e10d102
serialising: 1. measures for all the entity types and 2. result level fulltext
2023-03-23 11:23:22 +01:00
Claudio Atzori
41e00bcd07
[graph provision] avoid to parse again the XML records, apparently the escaped XML characters get unescaped invalidating the record
2023-03-13 15:19:49 +01:00
Serafeim Chatzopoulos
0b5bf53b45
Remove unecessary indexed fields from Solr
2023-02-23 12:42:42 +02:00
Claudio Atzori
499826ead1
serialising field eoscifguidelines field in the Solr XML records
2022-08-04 12:40:48 +02:00
Claudio Atzori
072f192853
include the class information in the measure XML serialization
2022-07-01 09:54:56 +02:00
Sandro La Bruzzo
4c50f35c8b
update publication Date format
2022-05-16 10:29:36 +02:00
Claudio Atzori
9e12cb3c92
EOSC Services - removed field knowledgegraph; depending on the released schema module
2022-05-03 11:55:45 +02:00
Claudio Atzori
05c1ea92e9
EOSC Services - added Service-specific fields in the XML record serialization
2022-04-29 15:56:55 +02:00
Claudio Atzori
f5f532d134
EOSC Services - ongoing update
2022-04-29 12:25:24 +02:00
Claudio Atzori
48d32466e4
instances grouped by URL expose only one refereed
2022-03-23 14:52:03 +01:00
Claudio Atzori
a87c070447
conflicts resolved, merged from beta
2022-02-24 12:51:31 +01:00
Claudio Atzori
86cdb7a38f
[provision] serialize measures defined on the result level
2022-02-23 15:54:18 +01:00
Alessia Bardi
600ede1798
serialisation of APCs int he XML records
2022-02-11 11:00:20 +01:00
Claudio Atzori
cccb16900c
https://support.openaire.eu/issues/7330 normalising DOI urls
2021-12-23 12:33:53 +01:00
Claudio Atzori
98eb292c59
avoid NPEs merging XMLInstance(s)
2021-12-13 13:27:20 +01:00
Claudio Atzori
5e17247bb6
avoid NPEs merging XMLInstance(s)
2021-12-13 11:48:40 +01:00
Claudio Atzori
b70ecccea0
avoid NPEs merging XMLInstance(s)
2021-12-12 12:37:38 +01:00
Alessia Bardi
e53228401b
style
2021-12-09 15:46:22 +01:00
Alessia Bardi
6b5d7688a4
#7275 serialize license information in XML records
2021-12-09 13:46:48 +01:00
Claudio Atzori
9cac283bec
implemented Instance serialization features requested in https://support.openaire.eu/issues/7156
2021-12-02 17:20:33 +01:00
Claudio Atzori
e471f12d5e
hotfix: recovered implementation removing the hardcoded working_dirs
2021-10-19 12:35:38 +02:00
Sandro La Bruzzo
d4dadf6d77
reduced max number of PID in Relatedentity
2021-09-02 14:21:24 +02:00
Sandro La Bruzzo
9f8a80deb7
fixed wrong import of unresolved relation in openaire
2021-09-01 14:16:27 +02:00
Alessia Bardi
3762b17f7b
added VERSIOn and PART relationship and re-ordered according to my personal and obviously possibly biased
...
ordering
2021-08-31 20:20:05 +02:00
Alessia Bardi
931f430129
Merge branch 'beta' into datasource_model_eosc_beta
2021-08-23 11:57:21 +02:00
Claudio Atzori
9f4db73f30
updated/fixed unit tests
2021-08-11 15:02:51 +02:00
Claudio Atzori
2ee21da43b
suggestions from SonarLint
2021-08-11 12:13:22 +02:00
Sandro La Bruzzo
6358f92c3a
added sleep to solve problem of lost request of creating index
2021-07-30 08:54:37 +02:00
Claudio Atzori
c53d106e80
[provision] lowercase relation filter
2021-07-29 13:57:00 +02:00
Sandro La Bruzzo
3721df7aa6
refactoring create actionset of scholexplorer, moved on package dhp-aggregation
2021-07-29 10:45:35 +02:00
Sandro La Bruzzo
3d8f0f629b
implemented workflow of creation action set for scholexplorer
2021-07-28 16:15:34 +02:00
Michele Artini
3e2a2d6e71
added new fields in xml
2021-07-28 11:56:55 +02:00
Claudio Atzori
2fff24df55
code formatting
2021-07-28 11:34:19 +02:00
Sandro La Bruzzo
16c91203bd
implemented workflow of creation action set for scholexplorer
2021-07-28 10:30:49 +02:00
Michele Artini
52e2315ba2
removed trick for datasourcetypeui
2021-07-28 10:23:00 +02:00
Claudio Atzori
10d7b4f0b4
filtering 'old' OpenAIRE ids from the entity.originalId[] array in the OAF -> XML searialization procedure
2021-07-20 11:52:05 +02:00
Sandro La Bruzzo
bbe8193930
merged stable ids
2021-07-12 17:00:43 +02:00
Sandro La Bruzzo
57c74c73c6
fixed mistakes in oozie workflow
2021-07-09 12:28:09 +02:00
Sandro La Bruzzo
61ccb54fde
removed wrong loop on oozie wf
2021-07-09 12:17:57 +02:00
Sandro La Bruzzo
9f5a0f3ab6
moved wf indexing of Scholexplorer in dhp-graph-provision
2021-07-09 12:06:43 +02:00
Claudio Atzori
96238152cb
added serialization for alternateIdentifiers and pids within each record instance
2021-05-28 16:57:30 +02:00
Claudio Atzori
23b8883ab1
applied intellij code cleanup
2021-05-14 10:58:12 +02:00