Claudio Atzori
a72b9e96ac
expand the instance level fulltext in the XML records
2023-07-27 14:57:38 +02:00
Giambattista Bloisi
e64c2854a3
Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface
...
JsonPath cache contention fixed by using a ConcurrentHashMap
Blacklist filtering performance improvement
Minor performance improvements when evaluating similarity
Sorting in clustered elements is deterministic (by ordering and identity field, instead of ordering field only)
2023-07-24 15:36:24 +02:00
Giambattista Bloisi
bb5b845e3c
Use scala.binary.version property to resolve scala maven dependencies
...
Ensure consistent usage of maven properties
Profile for compiling with scala 2.12 and Spark 3.4
2023-07-24 11:13:48 +02:00
Giambattista Bloisi
bd3fcf869a
rename dnet-pace-core into dhp-pace-core module and use it as dependency in other modules
2023-07-06 10:02:23 +02:00
Miriam Baglioni
b25b401065
added test to verify the advconstraints to dth community. inserted some additional logs.
2023-04-05 12:18:39 +02:00
Claudio Atzori
63b8bbc015
[graph to Solr] using dedicated sparkExecutorCores, sparkExecutorMemory, sparkDriverMemory in convert_to_xml
2023-03-24 13:43:20 +01:00
Claudio Atzori
308e10d102
serialising: 1. measures for all the entity types and 2. result level fulltext
2023-03-23 11:23:22 +01:00
Claudio Atzori
41e00bcd07
[graph provision] avoid to parse again the XML records, apparently the escaped XML characters get unescaped invalidating the record
2023-03-13 15:19:49 +01:00
Claudio Atzori
7aebedb43c
code formatting
2023-02-27 11:51:27 +01:00
Serafeim Chatzopoulos
0b5bf53b45
Remove unecessary indexed fields from Solr
2023-02-23 12:42:42 +02:00
Claudio Atzori
1b8488976b
code formatting
2022-12-07 10:45:38 +01:00
Claudio Atzori
8248da40d9
Merge branch 'beta' into graph_cleaning
2022-12-02 14:49:00 +01:00
Miriam Baglioni
ce020f2c83
[EOSC FUTURE] added resources and test for review
2022-11-30 09:57:30 +01:00
Claudio Atzori
6082d235d3
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into graph_cleaning
2022-11-28 09:54:48 +01:00
Claudio Atzori
24ef301cc1
[graph cleaning] patch the result's collectedfrom and hostedby identifiers according to the datasource master-duplicate mapping
2022-11-28 09:54:18 +01:00
Alessia Bardi
90c8f9cb61
tests for EOSC Future
2022-11-23 12:18:44 +01:00
Alessia Bardi
2832117f23
added eoscifguidelines in test
2022-11-22 18:01:12 +01:00
Alessia Bardi
2687fc9f73
tests for EOSC Future review - ROhub
2022-11-22 17:30:56 +01:00
Claudio Atzori
ff6f789b6d
code formatting
2022-09-09 15:16:31 +02:00
Alessia Bardi
5c45d52af3
testing for RiuNet
2022-09-07 15:40:57 +03:00
Claudio Atzori
499826ead1
serialising field eoscifguidelines field in the Solr XML records
2022-08-04 12:40:48 +02:00
Claudio Atzori
072f192853
include the class information in the measure XML serialization
2022-07-01 09:54:56 +02:00
Sandro La Bruzzo
4c50f35c8b
update publication Date format
2022-05-16 10:29:36 +02:00
Claudio Atzori
9e12cb3c92
EOSC Services - removed field knowledgegraph; depending on the released schema module
2022-05-03 11:55:45 +02:00
Claudio Atzori
b6a7ff3a99
EOSC Services - removed fields from mapping, testing preparation
2022-05-02 15:52:33 +02:00
Claudio Atzori
05c1ea92e9
EOSC Services - added Service-specific fields in the XML record serialization
2022-04-29 15:56:55 +02:00
Claudio Atzori
f5f532d134
EOSC Services - ongoing update
2022-04-29 12:25:24 +02:00
Claudio Atzori
c26222623f
[maven-release-plugin] prepare for next development iteration
2022-04-07 13:32:22 +02:00
Claudio Atzori
86585a6b27
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 13:32:19 +02:00
Claudio Atzori
ad85d88eaf
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 13:28:35 +02:00
Claudio Atzori
598e11dfd7
[maven-release-plugin] prepare for next development iteration
2022-04-07 13:27:02 +02:00
Claudio Atzori
db3d9877a5
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 13:26:58 +02:00
Claudio Atzori
3bba6d6e38
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 12:23:17 +02:00
Claudio Atzori
2ac2d928bd
[maven-release-plugin] prepare for next development iteration
2022-04-07 12:18:47 +02:00
Claudio Atzori
85bc722ff4
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 12:18:43 +02:00
Claudio Atzori
bc05b6168a
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 11:49:06 +02:00
Claudio Atzori
505420fd61
[maven-release-plugin] prepare for next development iteration
2022-04-07 11:34:06 +02:00
Claudio Atzori
66e718981e
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 11:34:02 +02:00
Claudio Atzori
48d32466e4
instances grouped by URL expose only one refereed
2022-03-23 14:52:03 +01:00
Miriam Baglioni
2b643059fa
[Country Propagation] changed the logic to get the collectedfrom at the result level. To fix issue when no instance is created for a result that should have the country associated. Change the code to use spark instead of hive to prepare the data needed for the propagation step. Added new tests for the intermediate steps and new verification for the propagation itself
2022-03-11 13:56:48 +01:00
Claudio Atzori
a87c070447
conflicts resolved, merged from beta
2022-02-24 12:51:31 +01:00
Claudio Atzori
86cdb7a38f
[provision] serialize measures defined on the result level
2022-02-23 15:54:18 +01:00
Alessia Bardi
9d6203f79b
test mapping datasource
2022-02-23 15:00:53 +01:00
Alessia Bardi
600ede1798
serialisation of APCs int he XML records
2022-02-11 11:00:20 +01:00
Claudio Atzori
cccb16900c
https://support.openaire.eu/issues/7330 normalising DOI urls
2021-12-23 12:33:53 +01:00
Claudio Atzori
98eb292c59
avoid NPEs merging XMLInstance(s)
2021-12-13 13:27:20 +01:00
Claudio Atzori
5e17247bb6
avoid NPEs merging XMLInstance(s)
2021-12-13 11:48:40 +01:00
Claudio Atzori
b70ecccea0
avoid NPEs merging XMLInstance(s)
2021-12-12 12:37:38 +01:00
Alessia Bardi
e53228401b
style
2021-12-09 15:46:22 +01:00
Alessia Bardi
6b5d7688a4
#7275 serialize license information in XML records
2021-12-09 13:46:48 +01:00