Claudio Atzori
9486e21a44
copy or process the person records throughout the graph pipeline
2024-07-30 14:25:31 +02:00
Claudio Atzori
5aa7847ea6
consider the transformative agreement text when merging results
2024-07-16 10:38:50 +02:00
Claudio Atzori
1180d78b71
make entity level pids unique by pidType:pidValue
2024-07-04 09:41:12 +02:00
Claudio Atzori
7d3292551b
ignore dates containing 'null's
2024-07-02 15:44:31 +02:00
Claudio Atzori
a8d68c9d29
avoid NPEs
2024-06-11 14:19:24 +02:00
Claudio Atzori
ce2364743a
applying changes from PR#442: Fix for missing collectedfrom after dedup
2024-06-06 10:43:43 +02:00
Claudio Atzori
f70dc76b61
minor
2024-06-06 10:43:10 +02:00
Giambattista Bloisi
73316d8c83
Add jaxb and jaxws dependencies when compiling with spark-34 profile as they are required to run with jdk > 8
2024-05-28 14:14:51 +02:00
Sandro La Bruzzo
f1fe363b19
merged again from beta (I hope for the last time)
2024-05-22 11:08:52 +02:00
Sandro La Bruzzo
66c1ffc866
merged again from beta (I hope for the last time)
2024-05-22 11:02:46 +02:00
Sandro La Bruzzo
103e2652b3
merged beta
2024-05-17 14:43:07 +02:00
Sandro La Bruzzo
6efab4d88e
fixed scholexplorer bug
2024-05-16 16:19:18 +02:00
Claudio Atzori
a5d13d5d27
code formatting
2024-05-03 14:14:34 +02:00
Giambattista Bloisi
69c5efbd8b
Fix: when applying enrichments with no instance information the resulting merge entity was generated with no instance instead of keeping the original information
2024-05-03 13:57:56 +02:00
Sandro La Bruzzo
db358ad0d2
code formatted
2024-05-02 15:25:57 +02:00
Sandro La Bruzzo
26bf8e763a
merged from beta
2024-05-02 15:20:23 +02:00
Sandro La Bruzzo
0646d0d064
Updated main sparkApplication to avoid to require master variable
2024-05-02 15:15:03 +02:00
Claudio Atzori
4355f64810
reverted to version 1.2.5-SNAPSHOT
2024-05-02 11:23:53 +02:00
Claudio Atzori
66680b8b9a
refactoring of common utilities
2024-05-02 11:16:58 +02:00
Claudio Atzori
dcf23b3d06
Merge branch 'beta' into beta-release-1.2.5
2024-05-02 10:01:49 +02:00
Sandro La Bruzzo
9cd3bc0f10
Added a new generation of the dump for scholexplorer tested with last version of spark, and strongly refactored
2024-04-26 16:02:07 +02:00
Claudio Atzori
e2937db385
Merge branch 'beta' into misc_fixes_merge_entities
2024-04-24 08:55:28 +02:00
Giambattista Bloisi
1878199dae
Miscellaneous fixes:
...
- in Merge By ID pick by preference those records coming from delegated Authorities
- fix various tests
- close spark session in SparkCreateSimRels
2024-04-24 08:12:45 +02:00
Sandro La Bruzzo
0d628cd62b
merged again from beta
2024-04-23 17:34:55 +02:00
Claudio Atzori
c3053ef34d
using version 1.2.5-beta for the release
2024-04-23 14:52:32 +02:00
Claudio Atzori
b5bcab13ec
using version 1.2.5-beta for the release
2024-04-23 14:36:39 +02:00
Claudio Atzori
425c9afc36
using version 1.2.5-beta for the release
2024-04-23 14:30:04 +02:00
Claudio Atzori
24a83fc24f
avoid NPEs in common Oaf merge utilities
2024-04-22 11:39:44 +02:00
Claudio Atzori
5857fd38c1
avoid NPEs in common Oaf merge utilities
2024-04-21 08:29:09 +02:00
Sandro La Bruzzo
b84ad0c06e
merged beta
2024-04-19 14:39:59 +02:00
Sandro La Bruzzo
342cb6189b
fixed problem on changed signature on RowEncoder
...
removed property dhp.schema.artifact
2024-04-19 12:13:26 +02:00
Claudio Atzori
ac8747582c
Merge branch 'beta' into doidoost_dismiss
2024-04-17 12:01:01 +02:00
Giambattista Bloisi
8ac167e420
Refinements to PR #404 : refactoring the Oaf records merge utilities into dhp-common
2024-04-16 17:18:28 +02:00
Miriam Baglioni
9eeb9f5d32
mergin with branch beta
2024-04-16 15:24:40 +02:00
Giambattista Bloisi
da333e9f4d
Merge pull request 'Enhance Dedup authors matching with algorithms used for ORCID enhancements (task 9690)' ( #419 ) from dedup_authorsmatch_bytoken into beta
...
Reviewed-on: #419
2024-04-16 10:24:11 +02:00
Claudio Atzori
d070db4a32
added a couple more invalid author names
2024-04-16 09:41:59 +02:00
Giambattista Bloisi
43b454399f
- Bug fix in matchOrderedTokenAndAbbreviations algorithms where tokens with same initial character were always considered equal
...
- AuthorsMatch exploits the new matching strategy used for ORCID enhancements in #PR398: split author names in tokens, order the tokens, then check for matches of ordered full tokens or abbreviations
2024-04-15 18:19:29 +02:00
Sandro La Bruzzo
843dc95340
resolved conflict
2024-04-11 17:38:16 +02:00
Sandro La Bruzzo
2581672c11
updated wf of MAG and crossref to use transaction
2024-04-11 17:27:49 +02:00
Claudio Atzori
ecff0b4825
merge from beta
2024-03-25 16:15:52 +01:00
Claudio Atzori
82fc609c4f
Merge branch 'beta' into index_records
2024-03-25 16:12:49 +01:00
Claudio Atzori
9fc70a9451
implemented default merge procedure applied to result.instance
2024-03-25 15:39:14 +01:00
Giambattista Bloisi
3f22c101d9
Merge pull request 'Enrich authors with ORCID info using new matching algorithm' ( #398 ) from new_orcid_enhancement into beta
...
Reviewed-on: #398
2024-03-22 17:29:20 +01:00
Claudio Atzori
aaa73f89d1
refactoring the Oaf records merge utilities into dhp-common
2024-03-22 16:34:03 +01:00
Sandro La Bruzzo
58dbe71d39
update crossref mapping to be runnable separately as a single datasource outside doiboost
2024-03-20 17:04:52 +01:00
Giambattista Bloisi
664a381d31
Unify merge logic of entities in MergeUtils.class
2024-03-18 16:04:49 +01:00
Claudio Atzori
af154d4456
implemented changes from #9497 : sort abstracts by string length, included author fullnames in the related results, expanded instance details within each children/result XML element
2024-03-14 16:21:23 +01:00
Sandro La Bruzzo
c532831718
Moved Crossref Mapping on dhp-aggregations,
...
refactored code, avoid to use utility for create part of the oaf defined in DOIBoostMappingUtils, used instead utility in OafMappingUtils
2024-03-13 06:56:10 +01:00
Giambattista Bloisi
9092075760
Enrich authors with ORCID info using new matching algorithm
2024-03-11 13:23:59 +01:00
Claudio Atzori
bb82052c40
[graph cleaning] rule out datasources without an officialname
2024-02-05 14:59:27 +02:00