Claudio Atzori
078169b922
cleanup
2024-03-15 09:56:04 +01:00
Claudio Atzori
af154d4456
implemented changes from #9497 : sort abstracts by string length, included author fullnames in the related results, expanded instance details within each children/result XML element
2024-03-14 16:21:23 +01:00
Claudio Atzori
7863c92466
expanded paper abstract in the result/children XML element (ticket #9497 )
2024-03-13 16:25:31 +01:00
Claudio Atzori
eb5887cb9a
including related organization url in the XML record serialization (ticket #9498 )
2024-03-13 14:46:00 +01:00
Claudio Atzori
db66555ebb
WIP: updated provision workflow to create a JSON based representation of the payload
2024-03-12 09:56:09 +01:00
Claudio Atzori
d4871b31e8
WIP: extended provision workflow to create the JSON based payload
2024-03-08 11:43:20 +01:00
Claudio Atzori
6fcf872daa
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into index_records
2024-02-28 10:27:28 +01:00
Claudio Atzori
3f07390a58
WIP
2024-02-28 10:10:10 +01:00
Sandro La Bruzzo
7d806a434c
formatted code
2024-02-28 09:31:58 +01:00
Alessia Bardi
f2a08d8cc2
test for Italian records from IRS repositories
2024-01-30 19:20:14 +01:00
Claudio Atzori
9b13c22e5d
[graph provision] retrieve all the context information by adding all=true to the requests issued to the API
2024-01-23 15:36:08 +01:00
Claudio Atzori
f87f3a6483
[graph provision] updated param specification for the XML converter job
2024-01-23 08:54:37 +01:00
Claudio Atzori
1c6db320f4
[graph provision] obtain context info from the context API instead of from the ISLookUp service
2024-01-22 15:53:17 +01:00
Miriam Baglioni
5011c4d11a
refactoring after compilation
2023-12-20 15:57:26 +01:00
Claudio Atzori
ff924215b8
[graph provision] added tests for new peerreviewed field
2023-12-12 11:21:30 +01:00
Claudio Atzori
7e8eff40c1
[graph provision] added tests for the new model fields
2023-12-12 08:54:15 +01:00
Giambattista Bloisi
613ec5ffce
Add profiles for different spark versions: spark-24, spark-34, spark-35
2023-12-05 19:11:06 +01:00
Giambattista Bloisi
2fa78f6071
Changes required to build and run tests with Java 17
2023-12-05 19:11:06 +01:00
Giambattista Bloisi
326c9dc08c
Changes in maven poms to build and test the project using Spark 3.4.x and scala 2.12
2023-12-05 19:11:06 +01:00
Claudio Atzori
321922772b
added serialization for the new fields imported for the Irish tender
2023-12-05 16:37:04 +01:00
Alessia Bardi
cc7204a089
tests for d4science catalog
2023-09-20 15:38:32 +02:00
Claudio Atzori
a72b9e96ac
expand the instance level fulltext in the XML records
2023-07-27 14:57:38 +02:00
Giambattista Bloisi
e64c2854a3
Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface
JsonPath cache contention fixed by using a ConcurrentHashMap
Blacklist filtering performance improvement
Minor performance improvements when evaluating similarity
Sorting in clustered elements is deterministic (by ordering and identity field, instead of ordering field only)
2023-07-24 15:36:24 +02:00
Giambattista Bloisi
bb5b845e3c
Use scala.binary.version property to resolve scala maven dependencies
Ensure consistent usage of maven properties
Profile for compiling with scala 2.12 and Spark 3.4
2023-07-24 11:13:48 +02:00
Giambattista Bloisi
bd3fcf869a
rename dnet-pace-core into dhp-pace-core module and use it as dependency in other modules
2023-07-06 10:02:23 +02:00
Miriam Baglioni
b25b401065
added test to verify the advconstraints to dth community. inserted some additional logs.
2023-04-05 12:18:39 +02:00
Claudio Atzori
63b8bbc015
[graph to Solr] using dedicated sparkExecutorCores, sparkExecutorMemory, sparkDriverMemory in convert_to_xml
2023-03-24 13:43:20 +01:00
Claudio Atzori
308e10d102
serialising: 1. measures for all the entity types and 2. result level fulltext
2023-03-23 11:23:22 +01:00
Claudio Atzori
41e00bcd07
[graph provision] avoid re-parsing the XML records; apparently the escaped XML characters get unescaped, invalidating the record
2023-03-13 15:19:49 +01:00
Claudio Atzori
7aebedb43c
code formatting
2023-02-27 11:51:27 +01:00
Serafeim Chatzopoulos
0b5bf53b45
Remove unnecessary indexed fields from Solr
2023-02-23 12:42:42 +02:00
Claudio Atzori
1b8488976b
code formatting
2022-12-07 10:45:38 +01:00
Claudio Atzori
8248da40d9
Merge branch 'beta' into graph_cleaning
2022-12-02 14:49:00 +01:00
Miriam Baglioni
ce020f2c83
[EOSC FUTURE] added resources and test for review
2022-11-30 09:57:30 +01:00
Claudio Atzori
6082d235d3
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into graph_cleaning
2022-11-28 09:54:48 +01:00
Claudio Atzori
24ef301cc1
[graph cleaning] patch the result's collectedfrom and hostedby identifiers according to the datasource master-duplicate mapping
2022-11-28 09:54:18 +01:00
Alessia Bardi
90c8f9cb61
tests for EOSC Future
2022-11-23 12:18:44 +01:00
Alessia Bardi
2832117f23
added eoscifguidelines in test
2022-11-22 18:01:12 +01:00
Alessia Bardi
2687fc9f73
tests for EOSC Future review - ROhub
2022-11-22 17:30:56 +01:00
Claudio Atzori
ff6f789b6d
code formatting
2022-09-09 15:16:31 +02:00
Alessia Bardi
5c45d52af3
testing for RiuNet
2022-09-07 15:40:57 +03:00
Claudio Atzori
499826ead1
serialising the eoscifguidelines field in the Solr XML records
2022-08-04 12:40:48 +02:00
Claudio Atzori
072f192853
include the class information in the measure XML serialization
2022-07-01 09:54:56 +02:00
Sandro La Bruzzo
4c50f35c8b
update publication Date format
2022-05-16 10:29:36 +02:00
Claudio Atzori
9e12cb3c92
EOSC Services - removed field knowledgegraph; depending on the released schema module
2022-05-03 11:55:45 +02:00
Claudio Atzori
b6a7ff3a99
EOSC Services - removed fields from mapping, testing preparation
2022-05-02 15:52:33 +02:00
Claudio Atzori
05c1ea92e9
EOSC Services - added Service-specific fields in the XML record serialization
2022-04-29 15:56:55 +02:00
Claudio Atzori
f5f532d134
EOSC Services - ongoing update
2022-04-29 12:25:24 +02:00
Claudio Atzori
c26222623f
[maven-release-plugin] prepare for next development iteration
2022-04-07 13:32:22 +02:00
Claudio Atzori
86585a6b27
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 13:32:19 +02:00
Claudio Atzori
ad85d88eaf
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 13:28:35 +02:00
Claudio Atzori
598e11dfd7
[maven-release-plugin] prepare for next development iteration
2022-04-07 13:27:02 +02:00
Claudio Atzori
db3d9877a5
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 13:26:58 +02:00
Claudio Atzori
3bba6d6e38
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 12:23:17 +02:00
Claudio Atzori
2ac2d928bd
[maven-release-plugin] prepare for next development iteration
2022-04-07 12:18:47 +02:00
Claudio Atzori
85bc722ff4
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 12:18:43 +02:00
Claudio Atzori
bc05b6168a
[maven-release-plugin] rollback the release of dhp-1.2.4
2022-04-07 11:49:06 +02:00
Claudio Atzori
505420fd61
[maven-release-plugin] prepare for next development iteration
2022-04-07 11:34:06 +02:00
Claudio Atzori
66e718981e
[maven-release-plugin] prepare release dhp-1.2.4
2022-04-07 11:34:02 +02:00
Claudio Atzori
48d32466e4
instances grouped by URL expose only one refereed
2022-03-23 14:52:03 +01:00
Miriam Baglioni
2b643059fa
[Country Propagation] changed the logic to get the collectedfrom at the result level, to fix the issue where no instance is created for a result that should have the country associated. Changed the code to use Spark instead of Hive to prepare the data needed for the propagation step. Added new tests for the intermediate steps and a new verification for the propagation itself
2022-03-11 13:56:48 +01:00
Claudio Atzori
a87c070447
conflicts resolved, merged from beta
2022-02-24 12:51:31 +01:00
Claudio Atzori
86cdb7a38f
[provision] serialize measures defined on the result level
2022-02-23 15:54:18 +01:00
Alessia Bardi
9d6203f79b
test mapping datasource
2022-02-23 15:00:53 +01:00
Alessia Bardi
600ede1798
serialisation of APCs in the XML records
2022-02-11 11:00:20 +01:00
Claudio Atzori
cccb16900c
https://support.openaire.eu/issues/7330 normalising DOI urls
2021-12-23 12:33:53 +01:00
Claudio Atzori
98eb292c59
avoid NPEs merging XMLInstance(s)
2021-12-13 13:27:20 +01:00
Claudio Atzori
5e17247bb6
avoid NPEs merging XMLInstance(s)
2021-12-13 11:48:40 +01:00
Claudio Atzori
b70ecccea0
avoid NPEs merging XMLInstance(s)
2021-12-12 12:37:38 +01:00
Alessia Bardi
e53228401b
style
2021-12-09 15:46:22 +01:00
Alessia Bardi
6b5d7688a4
#7275 serialize license information in XML records
2021-12-09 13:46:48 +01:00
Claudio Atzori
9cac283bec
implemented Instance serialization features requested in https://support.openaire.eu/issues/7156
2021-12-02 17:20:33 +01:00
Claudio Atzori
1de881b796
resolved conflicts for #165
2021-11-26 16:15:11 +01:00
Sandro La Bruzzo
c9870c5122
code formatted
2021-10-19 15:24:59 +02:00
Claudio Atzori
e471f12d5e
hotfix: recovered implementation removing the hardcoded working_dirs
2021-10-19 12:35:38 +02:00
Claudio Atzori
14fbf92ad6
Merge branch 'beta' into beta_solr_config
2021-10-14 11:08:44 +02:00
Sandro La Bruzzo
5606014b17
code refactor see ticket #7065
2021-10-12 08:11:53 +02:00
Claudio Atzori
2f61054cd1
code formatting
2021-10-11 18:29:42 +02:00
Serafeim Chatzopoulos
201ce71cc1
Add resultsubject, relprojectname and resultacceptanceyear to __all field
2021-10-11 13:16:39 +03:00
Serafeim Chatzopoulos
e468a7b96b
Add tests to query Solr with different configurations
2021-10-08 16:58:51 +03:00
Serafeim Chatzopoulos
de81007302
Add exploreTestConfig, a new Solr configuration folder
2021-10-08 16:54:56 +03:00
Alessia Bardi
8d3b60f446
test for patching records for EOSC Future
2021-10-07 17:30:45 +02:00
Alessia Bardi
b924276e18
tests to generate records for the EOSC-Future demo with the EOSC Jupyter Notebook subject
2021-09-24 17:11:56 +02:00
Sandro La Bruzzo
d4dadf6d77
reduced max number of PID in Relatedentity
2021-09-02 14:21:24 +02:00
Sandro La Bruzzo
9f8a80deb7
fixed wrong import of unresolved relation in openaire
2021-09-01 14:16:27 +02:00
Alessia Bardi
3762b17f7b
added VERSION and PART relationships and re-ordered according to my personal and obviously possibly biased ordering
2021-08-31 20:20:05 +02:00
Alessia Bardi
931f430129
Merge branch 'beta' into datasource_model_eosc_beta
2021-08-23 11:57:21 +02:00
Claudio Atzori
9f4db73f30
updated/fixed unit tests
2021-08-11 15:02:51 +02:00
Claudio Atzori
2ee21da43b
suggestions from SonarLint
2021-08-11 12:13:22 +02:00
Sandro La Bruzzo
6358f92c3a
added sleep to solve the problem of lost index-creation requests
2021-07-30 08:54:37 +02:00
Claudio Atzori
c53d106e80
[provision] lowercase relation filter
2021-07-29 13:57:00 +02:00
Sandro La Bruzzo
3721df7aa6
refactored the creation of the scholexplorer action set, moved to package dhp-aggregation
2021-07-29 10:45:35 +02:00
Sandro La Bruzzo
3d8f0f629b
implemented workflow of creation action set for scholexplorer
2021-07-28 16:15:34 +02:00
Alessia Bardi
df8715a1ec
format code after mvn compile
2021-07-28 11:58:26 +02:00
Michele Artini
3e2a2d6e71
added new fields in xml
2021-07-28 11:56:55 +02:00
Alessia Bardi
c806387d4b
tests for enermaps
2021-07-28 11:54:36 +02:00
Claudio Atzori
2fff24df55
code formatting
2021-07-28 11:34:19 +02:00
Sandro La Bruzzo
16c91203bd
implemented workflow of creation action set for scholexplorer
2021-07-28 10:30:49 +02:00
Michele Artini
52e2315ba2
removed trick for datasourcetypeui
2021-07-28 10:23:00 +02:00
Claudio Atzori
10d7b4f0b4
filtering 'old' OpenAIRE ids from the entity.originalId[] array in the OAF -> XML serialization procedure
2021-07-20 11:52:05 +02:00