Claudio Atzori
73bd1938a5
[graph2hive] use sparkExecutorMemory to also define the memoryOverhead
2024-06-05 12:17:35 +02:00
Sandro La Bruzzo
103e2652b3
merged beta
2024-05-17 14:43:07 +02:00
Sandro La Bruzzo
a87f9ea643
fixed scholexplorer bug
2024-05-17 14:16:43 +02:00
Sandro La Bruzzo
6efab4d88e
fixed scholexplorer bug
2024-05-16 16:19:18 +02:00
Claudio Atzori
0486227185
[cleaning] deactivating the cleaning of FOS subjects found in the metadata provided by repositories
2024-05-03 14:31:12 +02:00
Sandro La Bruzzo
26bf8e763a
merged from beta
2024-05-02 15:20:23 +02:00
Sandro La Bruzzo
0646d0d064
Updated main sparkApplication to avoid requiring the master variable
2024-05-02 15:15:03 +02:00
Claudio Atzori
4355f64810
reverted to version 1.2.5-SNAPSHOT
2024-05-02 11:23:53 +02:00
Claudio Atzori
66680b8b9a
refactoring of common utilities
2024-05-02 11:16:58 +02:00
Claudio Atzori
dcf23b3d06
Merge branch 'beta' into beta-release-1.2.5
2024-05-02 10:01:49 +02:00
Sandro La Bruzzo
133ead1e3e
updated new version of scholexplorer Generation
2024-04-29 09:00:30 +02:00
Sandro La Bruzzo
9cd3bc0f10
Added a new generation of the dump for scholexplorer tested with last version of spark, and strongly refactored
2024-04-26 16:02:07 +02:00
Giambattista Bloisi
1878199dae
Miscellaneous fixes:
- in Merge By ID pick by preference those records coming from delegated Authorities
- fix various tests
- close spark session in SparkCreateSimRels
2024-04-24 08:12:45 +02:00
Claudio Atzori
c3053ef34d
using version 1.2.5-beta for the release
2024-04-23 14:52:32 +02:00
Claudio Atzori
b5bcab13ec
using version 1.2.5-beta for the release
2024-04-23 14:36:39 +02:00
Claudio Atzori
425c9afc36
using version 1.2.5-beta for the release
2024-04-23 14:30:04 +02:00
Claudio Atzori
0656ab2838
code formatting
2024-04-20 08:10:58 +02:00
Claudio Atzori
ab7f0855af
fixed query reading projects from the aggregator DB
2024-04-20 08:10:32 +02:00
Giambattista Bloisi
8ac167e420
Refinements to PR #404 : refactoring the Oaf records merge utilities into dhp-common
2024-04-16 17:18:28 +02:00
Giambattista Bloisi
43b454399f
- Bug fix in matchOrderedTokenAndAbbreviations algorithms where tokens with same initial character were always considered equal
- AuthorsMatch exploits the new matching strategy used for ORCID enhancements in #PR398: split author names into tokens, order the tokens, then check for matches of ordered full tokens or abbreviations
2024-04-15 18:19:29 +02:00
Claudio Atzori
ef52128c55
included new stats* workflows in parent pom list of modules, code formatting
2024-03-26 10:42:10 +01:00
Claudio Atzori
bfba71a95c
further follow up changes from integrating the mergeutils branch
2024-03-26 09:01:18 +01:00
Claudio Atzori
538b180fe0
Merge branch 'beta' into oaf_country_beta
2024-03-25 16:13:20 +01:00
Giambattista Bloisi
3f22c101d9
Merge pull request 'Enrich authors with ORCID info using new matching algorithm' ( #398 ) from new_orcid_enhancement into beta
Reviewed-on: D-Net/dnet-hadoop#398
2024-03-22 17:29:20 +01:00
Giambattista Bloisi
0ff7faad72
Fix conditions that prevented ORCID Enrichment
2024-03-22 16:24:49 +01:00
Giambattista Bloisi
664a381d31
Unify merge logic of entities in MergeUtils.class
2024-03-18 16:04:49 +01:00
Michele Artini
30167aa882
mapped oaf:country from results
2024-03-15 11:24:16 +01:00
Giambattista Bloisi
9092075760
Enrich authors with ORCID info using new matching algorithm
2024-03-11 13:23:59 +01:00
Sandro La Bruzzo
7d806a434c
formatted code
2024-02-28 09:31:58 +01:00
Michele Artini
3268570b2c
mapping of project PIDs
2024-02-22 14:47:21 +01:00
Claudio Atzori
a63b091bae
Merge branch 'beta' into import_orps_fix
2024-02-15 15:01:56 +01:00
Claudio Atzori
d85d2df6ad
[graph raw] fixed mapping of the original resource type from the Datacite format
2024-02-09 10:20:20 +01:00
Claudio Atzori
38c9001147
fixed import of ORPs stored on HDFS in the internal graph format (e.g. Datacite)
2024-02-07 17:02:05 +01:00
Claudio Atzori
42f5506306
[orcid enrichment] fixed directory cleanup before distcp
2024-02-05 09:45:36 +02:00
Alessia Bardi
f2a08d8cc2
test for Italian records from IRS repositories
2024-01-30 19:20:14 +01:00
Claudio Atzori
2655eea5bc
[orcid enrichment] drop paths before copying the non-modified contents
2024-01-19 16:28:05 +01:00
Claudio Atzori
cb9e739484
Merge branch 'beta' into resource_types
2024-01-11 16:29:41 +01:00
Claudio Atzori
2753044d13
refined mapping for the extraction of the original resource type
2024-01-11 16:28:26 +01:00
Miriam Baglioni
e711a05229
fixed conflicts
2024-01-10 11:03:42 +01:00
Claudio Atzori
62104790ae
added metaresourcetype to the result hive DB view
2023-12-21 12:27:10 +01:00
Miriam Baglioni
4740c808f7
-
2023-12-20 14:26:54 +01:00
Claudio Atzori
cb71a7936b
[graph cleaning] avoid stack overflow error when navigating Oaf objects declaring an Enum
2023-12-07 23:09:54 +01:00
Claudio Atzori
259c69e446
[orcid enrichment] fixed workflow definition
2023-12-06 19:41:53 +01:00
Claudio Atzori
2a233a89aa
[graph grouping] added isLookupUrl to the workflow definition, passed to the grouping spark action
2023-12-03 13:32:52 +01:00
Claudio Atzori
622fafbd2e
Merge branch 'beta' into orcid_import
2023-12-01 12:28:14 +01:00
Sandro La Bruzzo
bf0fd27c36
Removed unused function
Applied PR Comment of Giambattista in the PR
2023-12-01 12:16:42 +01:00
Sandro La Bruzzo
cdfb7588dd
code formatting
2023-11-30 15:31:42 +01:00
Sandro La Bruzzo
5e22b67b8a
Merge remote-tracking branch 'origin/beta' into orcid_import
2023-11-30 15:27:46 +01:00
Sandro La Bruzzo
f718caaac9
Added copy of the untouched entities of the graph
2023-11-30 14:51:00 +01:00
Sandro La Bruzzo
7b5e04f37e
removed Orcid intersection on DOIBoost
2023-11-30 14:36:50 +01:00