Claudio Atzori
8d2bb24512
merged from master
2021-03-08 15:44:34 +01:00
Alessia Bardi
505477f36f
format code
2021-01-25 18:02:49 +01:00
Alessia Bardi
ded6ed8d7d
no ',' author if there are no authors in ODF records
2021-01-25 17:57:51 +01:00
Claudio Atzori
28460c2cd1
using com.fasterxml.jackson.databind.ObjectMapper instead of org.codehaus.jackson.map.ObjectMapper
2020-12-23 16:59:52 +01:00
Michele Artini
991e675dc6
validation in claim rels
2020-12-14 15:41:25 +01:00
Michele Artini
1bc9adc10d
default trust for validated rels
2020-12-09 16:18:37 +01:00
Michele Artini
5f21a356fd
reindent
2020-12-09 11:24:30 +01:00
Michele Artini
370a5e650b
validation attributes in resultProject relations
2020-12-09 11:18:26 +01:00
Claudio Atzori
2c407e775e
GenerateEntitiesApplication can be configured to hash the id value or not
2020-11-30 12:00:38 +01:00
Claudio Atzori
c1b9a4045a
grouping of records will be performed by the dedup workflow
2020-11-26 10:59:10 +01:00
Claudio Atzori
e1a1bb3ee4
moved class CleaningFunctions to the correct package. Removed newlines from titles, descriptions, subjects
2020-11-24 18:34:03 +01:00
Claudio Atzori
3f34757c63
merged from master
2020-11-19 14:34:54 +01:00
Alessia Bardi
10e673660f
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
2020-11-18 10:01:23 +01:00
Alessia Bardi
be7b310cef
rel semantics ignore case
2020-11-18 10:01:20 +01:00
Michele Artini
33da2e3d6c
xpaths for dateOfCollection and dateOfTransformation
2020-11-18 09:26:20 +01:00
Alessia Bardi
8f87020a50
#56 : map relevantDates from aggregated ODF records
2020-11-17 18:42:09 +01:00
Claudio Atzori
768bc5304c
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
2020-11-13 15:40:34 +01:00
Claudio Atzori
93f7b7974f
Merge pull request 'trust truncated to 3 decimals' ( #24 ) from trunc_trust into master
LGTM
2020-11-13 15:40:02 +01:00
Claudio Atzori
2bed29eb09
WIP: added oozie workflow for grouping graph entities by id
2020-11-13 10:05:12 +01:00
Claudio Atzori
13e36a4da0
WIP: added oozie workflow for grouping graph entities by id
2020-11-13 10:05:02 +01:00
Claudio Atzori
9b0fb9e958
merged from master
2020-11-12 09:27:12 +01:00
Michele Artini
40160d171f
organizations pids
2020-11-09 12:58:36 +01:00
Claudio Atzori
2d76497488
cleanup
2020-11-05 17:10:24 +01:00
Claudio Atzori
86d6fbe95b
refactoring: CleaningFunctions and OafMapperUtils moved to dhp-common
2020-11-03 12:19:46 +01:00
Claudio Atzori
3fcd669e99
result merge operation leverages a custom ResultTypeComparator in the aggregator graph construction
2020-11-03 10:53:23 +01:00
Claudio Atzori
58f28296ea
ProvisionConstants moved as ModelHardLimits in dhp-common and applied to truncate long abstracts (len > 150000). Further filtering for empty PID values
2020-10-30 10:56:42 +01:00
Claudio Atzori
266bf1a221
common IdentifierFactory in use on the mapping from the aggregator data; merge the entities sharing the same id; code formatting
2020-10-16 17:02:10 +02:00
Claudio Atzori
34f1d0904b
common IdentifierFactory in use on the mapping from the aggregator data
2020-10-16 16:00:19 +02:00
Claudio Atzori
49ae3450a9
code formatting
2020-10-02 09:43:24 +02:00
Claudio Atzori
c2a6e2a9bf
fixed mapping for datasource journal info (ISSNs)
2020-10-02 09:37:08 +02:00
Claudio Atzori
9e3e93c6b6
setting the correct issn type in the datasource.journal element
2020-09-24 10:39:16 +02:00
Alessia Bardi
a29565ff57
code formatting
2020-08-04 12:55:27 +02:00
Alessia Bardi
01db29e208
fixes redmine issue #5846 : datacite and its different namespace declarations
2020-08-04 12:53:48 +02:00
Alessia Bardi
b4e4e5f858
do not duplicate result PIDs
2020-08-04 12:52:14 +02:00
Michele Artini
bdece15ca0
blacklist of nsprefix
2020-07-30 16:13:38 +02:00
Claudio Atzori
ebf60020ac
map results as OPRs when //CobjCategory/@type is missing and the vocabulary dnet:result_typologies doesn't resolve the super type
2020-07-20 19:01:10 +02:00
Claudio Atzori
32f5e466e3
imports cleanup
2020-07-20 17:42:58 +02:00
Claudio Atzori
54ac583923
code formatting
2020-07-20 17:37:08 +02:00
Claudio Atzori
124e7ce19c
in case of missing attribute //dr:CobjCategory/@type the resulttype is derived by looking up the vocabulary dnet:result_typologies with the 1st instance type available
2020-07-20 17:33:37 +02:00
Claudio Atzori
050dda223d
Merge pull request 'removed duplicated fields' ( #25 ) from unique_field_in_lists into master
Looks good as a temporary workaround. I agree the model could seamlessly make the distinct operation by using HashSets instead of Linked (or Array) Lists.
The task to update the model in such a way is added on #9#issuecomment-1583
Thanks!
2020-07-20 12:12:50 +02:00
Michele Artini
331a3cbdd0
fixed originalId
2020-07-20 09:50:29 +02:00
Michele Artini
442f30930c
removed duplicated fields
2020-07-17 12:25:36 +02:00
Michele Artini
3adedd0a68
trust truncated to 3 decimals
2020-07-17 11:58:11 +02:00
Claudio Atzori
67e1d222b6
bulk cleaning: when bestaccessrights is found null or empty, set it by evaluating the result instances
2020-07-08 17:53:35 +02:00
Michele Artini
abcbebcbb4
fixed generation of ids
2020-06-25 09:50:46 +02:00
Michele Artini
77d2a1b1c4
params to choose sql queries for beta or production
2020-06-25 09:28:13 +02:00
Claudio Atzori
d0ac7514b2
cleaning workflow to include cleaning of default values
2020-06-18 19:37:25 +02:00
Claudio Atzori
5441f01586
Merge pull request 'missing landingPage urls in instances' ( #22 ) from instances-with-landing-page into master
Looks good, thanks!
2020-06-16 15:32:44 +02:00
Claudio Atzori
2a4f65795f
WIP: graph cleaner implementation
2020-06-15 18:32:24 +02:00
Claudio Atzori
c15c8c0ad0
map datasource identities (including piwik ids) as original IDs
2020-06-15 16:07:30 +02:00