Miriam Baglioni
0fb6af5586
Updated main pom dependency against dhp-schema, from 8.0.1 to 9.0.0. The new fields included in the updated schema module are populated by the Solr JSON payload mapping, which also limits the number of authors serialised to 200.
2024-10-25 16:28:50 +02:00
Giambattista Bloisi
6bc741715c
Fix OafMapperUtilsTest.testMergePubs
2024-10-23 14:02:45 +02:00
Claudio Atzori
d5867a1992
merged #490
2024-10-08 15:39:59 +02:00
Giambattista Bloisi
c45cae447a
Fix: invert the "natural" order when ordering by id lexicographically
2024-09-26 17:08:02 +02:00
Claudio Atzori
3fcafc7ed6
Merge pull request 'Latest institutions in monitor dbs' ( #472 ) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#472
2024-09-26 09:49:01 +02:00
Claudio Atzori
535a7b99f1
the metadata collection plugins using the HttpConnector2 class shall now retry instead of failing in case of UnknownHostException
2024-09-25 11:35:34 +02:00
Claudio Atzori
bfd05cdab2
run mergeResultsOfDifferentTypes only when checkDelegatedAuthority is true
2024-09-17 10:49:32 +02:00
Claudio Atzori
5aa7847ea6
consider the transformative agreement text when merging results
2024-07-16 10:38:50 +02:00
Claudio Atzori
1180d78b71
make entity level pids unique by pidType:pidValue
2024-07-04 09:41:12 +02:00
Claudio Atzori
7d3292551b
ignore dates containing 'null's
2024-07-02 15:44:31 +02:00
Lampros Smyrnaios
fe2275a9b0
Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into convert_hive_to_spark_actions
# Conflicts:
# dhp-workflows/dhp-stats-update/src/main/resources/eu/dnetlib/dhp/oa/graph/stats/oozie_app/scripts/step14.sql
2024-06-25 20:17:47 +03:00
Claudio Atzori
a8d68c9d29
avoid NPEs
2024-06-11 14:19:24 +02:00
Claudio Atzori
ce2364743a
applying changes from PR#442: Fix for missing collectedfrom after dedup
2024-06-06 10:43:43 +02:00
Claudio Atzori
f70dc76b61
minor
2024-06-06 10:43:10 +02:00
Lampros Smyrnaios
a644a6f4fe
Catch Spark-sql errors and show a log with the statement that failed.
2024-05-29 12:10:11 +03:00
Giambattista Bloisi
73316d8c83
Add jaxb and jaxws dependencies when compiling with spark-34 profile as they are required to run with jdk > 8
2024-05-28 14:14:51 +02:00
Sandro La Bruzzo
f1fe363b19
merged again from beta (I hope for the last time)
2024-05-22 11:08:52 +02:00
Sandro La Bruzzo
66c1ffc866
merged again from beta (I hope for the last time)
2024-05-22 11:02:46 +02:00
Sandro La Bruzzo
103e2652b3
merged beta
2024-05-17 14:43:07 +02:00
Sandro La Bruzzo
6efab4d88e
fixed scholexplorer bug
2024-05-16 16:19:18 +02:00
Claudio Atzori
a5d13d5d27
code formatting
2024-05-03 14:14:34 +02:00
Giambattista Bloisi
69c5efbd8b
Fix: when applying enrichments with no instance information, the resulting merged entity was generated with no instance instead of keeping the original information
2024-05-03 13:57:56 +02:00
Sandro La Bruzzo
db358ad0d2
code formatted
2024-05-02 15:25:57 +02:00
Sandro La Bruzzo
26bf8e763a
merged from beta
2024-05-02 15:20:23 +02:00
Sandro La Bruzzo
0646d0d064
Updated main sparkApplication so that it no longer requires the master variable
2024-05-02 15:15:03 +02:00
Claudio Atzori
4355f64810
reverted to version 1.2.5-SNAPSHOT
2024-05-02 11:23:53 +02:00
Claudio Atzori
66680b8b9a
refactoring of common utilities
2024-05-02 11:16:58 +02:00
Claudio Atzori
dcf23b3d06
Merge branch 'beta' into beta-release-1.2.5
2024-05-02 10:01:49 +02:00
Sandro La Bruzzo
9cd3bc0f10
Added a new generation of the scholexplorer dump, tested with the latest version of Spark and heavily refactored
2024-04-26 16:02:07 +02:00
Claudio Atzori
e2937db385
Merge branch 'beta' into misc_fixes_merge_entities
2024-04-24 08:55:28 +02:00
Giambattista Bloisi
1878199dae
Miscellaneous fixes:
- in Merge By ID pick by preference those records coming from delegated Authorities
- fix various tests
- close spark session in SparkCreateSimRels
2024-04-24 08:12:45 +02:00
Sandro La Bruzzo
0d628cd62b
merged again from beta
2024-04-23 17:34:55 +02:00
Claudio Atzori
c3053ef34d
using version 1.2.5-beta for the release
2024-04-23 14:52:32 +02:00
Claudio Atzori
b5bcab13ec
using version 1.2.5-beta for the release
2024-04-23 14:36:39 +02:00
Claudio Atzori
425c9afc36
using version 1.2.5-beta for the release
2024-04-23 14:30:04 +02:00
Claudio Atzori
24a83fc24f
avoid NPEs in common Oaf merge utilities
2024-04-22 11:39:44 +02:00
Claudio Atzori
5857fd38c1
avoid NPEs in common Oaf merge utilities
2024-04-21 08:29:09 +02:00
Sandro La Bruzzo
b84ad0c06e
merged beta
2024-04-19 14:39:59 +02:00
Sandro La Bruzzo
342cb6189b
fixed a problem caused by the changed signature of RowEncoder
removed property dhp.schema.artifact
2024-04-19 12:13:26 +02:00
Claudio Atzori
ac8747582c
Merge branch 'beta' into doidoost_dismiss
2024-04-17 12:01:01 +02:00
Giambattista Bloisi
8ac167e420
Refinements to PR #404 : refactoring the Oaf records merge utilities into dhp-common
2024-04-16 17:18:28 +02:00
Miriam Baglioni
9eeb9f5d32
merging with branch beta
2024-04-16 15:24:40 +02:00
Giambattista Bloisi
da333e9f4d
Merge pull request 'Enhance Dedup authors matching with algorithms used for ORCID enhancements (task 9690)' ( #419 ) from dedup_authorsmatch_bytoken into beta
Reviewed-on: D-Net/dnet-hadoop#419
2024-04-16 10:24:11 +02:00
Claudio Atzori
d070db4a32
added a couple more invalid author names
2024-04-16 09:41:59 +02:00
Giambattista Bloisi
43b454399f
- Bug fix in matchOrderedTokenAndAbbreviations algorithms where tokens with same initial character were always considered equal
- AuthorsMatch exploits the new matching strategy used for ORCID enhancements in #PR398: split author names in tokens, order the tokens, then check for matches of ordered full tokens or abbreviations
2024-04-15 18:19:29 +02:00
Sandro La Bruzzo
843dc95340
resolved conflict
2024-04-11 17:38:16 +02:00
Sandro La Bruzzo
2581672c11
updated the MAG and Crossref workflows to use transaction
2024-04-11 17:27:49 +02:00
Claudio Atzori
ecff0b4825
merge from beta
2024-03-25 16:15:52 +01:00
Claudio Atzori
82fc609c4f
Merge branch 'beta' into index_records
2024-03-25 16:12:49 +01:00
Claudio Atzori
9fc70a9451
implemented default merge procedure applied to result.instance
2024-03-25 15:39:14 +01:00