Claudio Atzori
c4e8aaca1f
PidCleaner used pervasively
2024-10-08 14:58:28 +02:00
Claudio Atzori
6e0b6a886f
code formatting
2024-09-30 15:13:23 +02:00
Claudio Atzori
4e9f64e01a
merged from the osfPreprints_plugin branch
2024-09-30 11:24:17 +02:00
Giambattista Bloisi
d175a9745f
Fix: invert the "natural" order when ordering by id lexicographically
2024-09-26 17:07:31 +02:00
Claudio Atzori
81bfe3fe32
WIP merged beta into main
2024-09-26 09:23:44 +02:00
Claudio Atzori
c1a309df75
Merge pull request 'retry on UnknownHostException' (#469) from retry_on_UnknownHostException into main
Reviewed-on: #469
2024-09-25 11:35:14 +02:00
Claudio Atzori
bfd05cdab2
run mergeResultsOfDifferentTypes only when checkDelegatedAuthority is true
2024-09-17 10:49:32 +02:00
Claudio Atzori
c648531ccb
run mergeResultsOfDifferentTypes only when checkDelegatedAuthority is true
2024-09-16 16:16:23 +02:00
Claudio Atzori
6b4fa7b8b9
the metadata collection plugins using the HttpConnector2 class shall now retry instead of failing in case of UnknownHostException
2024-08-05 16:55:07 +02:00
Claudio Atzori
5aa7847ea6
consider the transformative agreement text when merging results
2024-07-16 10:38:50 +02:00
Claudio Atzori
153b56eeff
make entity level pids unique by pidType:pidValue
2024-07-04 09:41:39 +02:00
Claudio Atzori
1180d78b71
make entity level pids unique by pidType:pidValue
2024-07-04 09:41:12 +02:00
Claudio Atzori
7b398a6d0b
updated import of organization types from OpenOrgs
2024-07-03 11:11:35 +02:00
Claudio Atzori
7d3292551b
ignore dates containing 'null's
2024-07-02 15:44:31 +02:00
Claudio Atzori
c06dfdfd86
ignore dates containing 'null's
2024-07-02 15:43:11 +02:00
Claudio Atzori
a8d68c9d29
avoid NPEs
2024-06-11 14:19:24 +02:00
Claudio Atzori
71927ca818
avoid NPEs
2024-06-11 12:40:50 +02:00
Giambattista Bloisi
46018dc804
Fix UnsupportedOperationException while merging two Result's contexts due to modification of an immutable collection
2024-06-11 10:39:48 +02:00
Claudio Atzori
ce2364743a
applying changes from PR#442: Fix for missing collectedfrom after dedup
2024-06-06 10:43:43 +02:00
Claudio Atzori
f70dc76b61
minor
2024-06-06 10:43:10 +02:00
Giambattista Bloisi
3feab5d92d
Fix MergeUtils.mergeGroup: it could get rid of some records and did not consider all PID authorities while sorting records.
ResultTypeComparator is now renamed to MergeEntitiesComparator and can be used as a general comparator for merging groups of records
2024-06-03 15:13:40 +02:00
Claudio Atzori
a428e7be7e
graph cleaning to implement ugly hardcoded rules, avoid NPEs
2024-05-29 09:26:12 +02:00
Claudio Atzori
8e45c5baa8
graph cleaning to implement ugly hardcoded rules
2024-05-28 15:28:42 +02:00
Claudio Atzori
c3fe59bc78
fixed conflicts merging from beta, code formatting
2024-05-21 14:50:40 +02:00
Sandro La Bruzzo
103e2652b3
merged beta
2024-05-17 14:43:07 +02:00
Sandro La Bruzzo
6efab4d88e
fixed scholexplorer bug
2024-05-16 16:19:18 +02:00
Claudio Atzori
a5d13d5d27
code formatting
2024-05-03 14:14:34 +02:00
Giambattista Bloisi
69c5efbd8b
Fix: when applying enrichments with no instance information, the resulting merged entity was generated with no instance instead of keeping the original information
2024-05-03 13:57:56 +02:00
Sandro La Bruzzo
db358ad0d2
code formatted
2024-05-02 15:25:57 +02:00
Sandro La Bruzzo
26bf8e763a
merged from beta
2024-05-02 15:20:23 +02:00
Sandro La Bruzzo
0646d0d064
Updated main sparkApplication to avoid requiring the master variable
2024-05-02 15:15:03 +02:00
Claudio Atzori
66680b8b9a
refactoring of common utilities
2024-05-02 11:16:58 +02:00
Sandro La Bruzzo
9cd3bc0f10
Added a new generation of the dump for scholexplorer, tested with the latest version of Spark and strongly refactored
2024-04-26 16:02:07 +02:00
Claudio Atzori
e2937db385
Merge branch 'beta' into misc_fixes_merge_entities
2024-04-24 08:55:28 +02:00
Giambattista Bloisi
1878199dae
Miscellaneous fixes:
- in Merge By ID pick by preference those records coming from delegated Authorities
- fix various tests
- close spark session in SparkCreateSimRels
2024-04-24 08:12:45 +02:00
Sandro La Bruzzo
0d628cd62b
merged again from beta
2024-04-23 17:34:55 +02:00
Claudio Atzori
24a83fc24f
avoid NPEs in common Oaf merge utilities
2024-04-22 11:39:44 +02:00
Claudio Atzori
5857fd38c1
avoid NPEs in common Oaf merge utilities
2024-04-21 08:29:09 +02:00
Sandro La Bruzzo
b84ad0c06e
merged beta
2024-04-19 14:39:59 +02:00
Claudio Atzori
ac8747582c
Merge branch 'beta' into doidoost_dismiss
2024-04-17 12:01:01 +02:00
Giambattista Bloisi
8ac167e420
Refinements to PR #404: refactoring the Oaf records merge utilities into dhp-common
2024-04-16 17:18:28 +02:00
Miriam Baglioni
9eeb9f5d32
merging with branch beta
2024-04-16 15:24:40 +02:00
Giambattista Bloisi
da333e9f4d
Merge pull request 'Enhance Dedup authors matching with algorithms used for ORCID enhancements (task 9690)' (#419) from dedup_authorsmatch_bytoken into beta
Reviewed-on: #419
2024-04-16 10:24:11 +02:00
Claudio Atzori
d070db4a32
added a couple more invalid author names
2024-04-16 09:41:59 +02:00
Giambattista Bloisi
43b454399f
- Bug fix in matchOrderedTokenAndAbbreviations algorithms where tokens with same initial character were always considered equal
- AuthorsMatch exploits the new matching strategy used for ORCID enhancements in #PR398: split author names into tokens, order the tokens, then check for matches of ordered full tokens or abbreviations
2024-04-15 18:19:29 +02:00
Sandro La Bruzzo
843dc95340
resolved conflict
2024-04-11 17:38:16 +02:00
Sandro La Bruzzo
2581672c11
updated wf of MAG and crossref to use transaction
2024-04-11 17:27:49 +02:00
Claudio Atzori
ecff0b4825
merge from beta
2024-03-25 16:15:52 +01:00
Claudio Atzori
82fc609c4f
Merge branch 'beta' into index_records
2024-03-25 16:12:49 +01:00
Claudio Atzori
9fc70a9451
implemented default merge procedure applied to result.instance
2024-03-25 15:39:14 +01:00