Claudio Atzori
|
34abd0fc43
|
Merge branch 'beta' into clean_license_publisher
|
2023-12-08 16:58:27 +01:00 |
Claudio Atzori
|
7c3041b276
|
avoid NPEs
|
2023-12-03 16:49:49 +01:00 |
Claudio Atzori
|
74b185d07b
|
avoid NPEs
|
2023-12-03 16:18:20 +01:00 |
Claudio Atzori
|
4e1aac2e2f
|
resolved conflict in pom.xml before applying the changes from [COAR based resource types & Irish tender] #350
|
2023-11-29 14:37:52 +01:00 |
Claudio Atzori
|
1ba582de3c
|
[graph cleaning] added cleaning for result.publisher and result.instance.license
|
2023-11-23 16:27:19 +01:00 |
Claudio Atzori
|
11a1207f9c
|
[graph cleaning] applying coar based vocabularies in bulk
|
2023-11-22 12:22:14 +01:00 |
Claudio Atzori
|
dde2fec035
|
[graph cleaning] cleanup
|
2023-10-31 14:35:33 +01:00 |
Claudio Atzori
|
262d7c581b
|
[graph cleaning] implemented further suggestions from https://support.openaire.eu/issues/8898
|
2023-10-31 14:34:10 +01:00 |
Claudio Atzori
|
b0fed1725e
|
avoid NPEs
|
2023-10-19 12:13:45 +02:00 |
Claudio Atzori
|
39d24d5469
|
Merge branch 'beta' into resource_types
|
2023-10-16 11:56:38 +02:00 |
Claudio Atzori
|
05ee7d8b09
|
[graph cleaning] avoid NPEs
|
2023-10-12 09:13:42 +02:00 |
Claudio Atzori
|
554551682d
|
[raw graph] adopting the new COAR based vocabularies for the resource typing
|
2023-10-11 16:09:19 +02:00 |
Claudio Atzori
|
8108491722
|
Merge branch 'beta' into peer_reviewed
|
2023-10-06 14:21:52 +02:00 |
Giambattista Bloisi
|
2f3cf6d0e7
|
Fix cleaning of PMID where parsing of numbers stopped at the first non-leading '0' character
|
2023-10-06 14:20:15 +02:00 |
Claudio Atzori
|
c9a5ad6a02
|
extending the coverage of the peer non-unknown refereed instances
|
2023-10-02 16:28:42 +02:00 |
Claudio Atzori
|
bf35280ea6
|
code formatting
|
2023-08-29 11:11:00 +02:00 |
Miriam Baglioni
|
c25ac21e5e
|
Merge pull request 'graph cleaning, suggestions from ticket 8898' (#325) from cleaning_8898 into beta
Reviewed-on: #325
|
2023-08-08 11:14:19 +02:00 |
Claudio Atzori
|
b9dddbfe54
|
rule out records with NULL dataInfo, except for Relations
|
2023-07-31 17:53:54 +02:00 |
Claudio Atzori
|
11ffb9bd68
|
rule out records with NULL dataInfo
|
2023-07-31 12:35:33 +02:00 |
Claudio Atzori
|
d8435a6512
|
inverted condition
|
2023-07-25 17:39:57 +02:00 |
Claudio Atzori
|
270df939c4
|
partial implementation of the suggestions from https://support.openaire.eu/issues/8898
|
2023-07-25 17:29:50 +02:00 |
Claudio Atzori
|
0f5a819f44
|
[graph cleaning] fixed regex behaviour for cleaning ROR and GRID identifiers, added tests
|
2023-06-23 16:10:49 +02:00 |
Claudio Atzori
|
1d33074fd1
|
WIP: pid cleaning
|
2023-06-09 16:47:25 +02:00 |
Claudio Atzori
|
2a6ba29b64
|
[graph cleaning] unit tests & cleanup
|
2023-04-04 12:34:51 +02:00 |
Claudio Atzori
|
6d3d18d8b5
|
[graph cleaning] WIP: refactoring of the cleaning stages
|
2023-03-16 17:23:36 +01:00 |
Claudio Atzori
|
9a03f71db1
|
code formatting
|
2023-02-13 16:25:47 +01:00 |
Claudio Atzori
|
9cf0a98699
|
[cleaning] set the common subject classid/name
|
2022-12-20 10:17:33 +01:00 |
Claudio Atzori
|
b8bafab8a0
|
[cleaning] improved vocabulary based mapping, specialization for the strict vocab cleaning
|
2022-12-12 14:43:03 +01:00 |
Claudio Atzori
|
b47aaf4dd1
|
[cleaning] subjects declared as belonging to specific vocabularies whose values are not found in the vocab are set to type keyword
|
2022-10-13 11:23:43 +02:00 |
Claudio Atzori
|
b7c387c21f
|
cleaning of subjects: avoid duplicated subjects, prioritise collected vs inferred or other sources
|
2022-08-12 15:09:16 +02:00 |
Claudio Atzori
|
3418ce50ac
|
cleaning of subjects: perform the cleaning when the given value is equivalent to one of the terms in the vocabulary
|
2022-08-08 12:48:47 +02:00 |
Claudio Atzori
|
27a91841e7
|
WIP: cleaning of subjects
|
2022-08-04 11:39:39 +02:00 |
Claudio Atzori
|
5130eac247
|
mapping by participant project contribution
|
2022-06-24 17:16:42 +02:00 |
Claudio Atzori
|
da611cfbbd
|
[eosc_services] resolved merge conflicts
|
2022-05-03 13:37:15 +02:00 |
Claudio Atzori
|
f5f532d134
|
EOSC Services - ongoing update
|
2022-04-29 12:25:24 +02:00 |
Miriam Baglioni
|
b61efd613b
|
[Measures] addressed comments in the PR
|
2022-04-21 12:09:37 +02:00 |
Miriam Baglioni
|
c304657d91
|
[Measures] put the logic in common, no need to change the schema
|
2022-04-21 11:27:26 +02:00 |
Claudio Atzori
|
db299dd8ab
|
fixed typo
|
2022-01-27 16:24:06 +01:00 |
Claudio Atzori
|
c42623f006
|
added NPE checks
|
2022-01-21 14:30:09 +01:00 |
Claudio Atzori
|
62f135262e
|
code formatting
|
2022-01-19 12:30:52 +01:00 |
Claudio Atzori
|
44a937f4ed
|
factored out entity grouping implementation, extended to consider results from delegated authorities rather than identical records from other sources
|
2022-01-19 12:24:52 +01:00 |
Miriam Baglioni
|
42e8f76778
|
[GraphCleaning] change the return value in the filtering function to avoid losing the APC entities
|
2022-01-13 16:06:43 +01:00 |
Claudio Atzori
|
aff3ddc8d2
|
added cleaning for the format field, removing carriage return and tab characters
|
2021-12-14 11:41:46 +01:00 |
Claudio Atzori
|
41c70c607d
|
cleaning workflow assigns the proper default instance type when a value could not be cleaned using the vocabularies
|
2021-12-09 16:44:28 +01:00 |
Claudio Atzori
|
863a2f9db3
|
avoid filtering OAF records defined as invisible = true
|
2021-12-03 09:08:12 +01:00 |
Claudio Atzori
|
82a4e4efae
|
[cleaning wf] fixed methodology to rule out invalid result titles, based on https://support.openaire.eu/issues/7206
|
2021-11-17 14:17:22 +01:00 |
Claudio Atzori
|
49f897ef29
|
[cleaning wf] fixed regex used to spot garbage in result titles; adjusted threshold for filtering titles
|
2021-11-16 15:24:23 +01:00 |
Claudio Atzori
|
2ee21da43b
|
suggestions from SonarLint
|
2021-08-11 12:13:22 +02:00 |
Claudio Atzori
|
6dddad86ee
|
[cleaning] title cleaning based on the me.xuender:unidecode library
|
2021-07-28 16:21:29 +02:00 |
Claudio Atzori
|
bc835d2024
|
[cleaning] fixed filtering function for missing titles
|
2021-07-23 11:56:13 +02:00 |