Commit Graph

671 Commits

Author SHA1 Message Date
Miriam Baglioni a9cc70d3b0 fixed issues on wf definition and property name 2020-11-30 12:08:33 +01:00
Miriam Baglioni 02589717b0 change in wf definition to have the cleaning part follow the report part directly. Modification of the name of the properties. Adjustments in test classes 2020-11-30 12:00:26 +01:00
Miriam Baglioni 44a66ad8a6 first try in adding orcid emend after cleaning step 2020-11-27 17:28:29 +01:00
Miriam Baglioni 3d66a7c0d6 - 2020-11-25 17:41:29 +01:00
Miriam Baglioni 2f9ac993e0 merge branch with master 2020-11-25 14:42:06 +01:00
Claudio Atzori eeebd5a920 Cleaning workflow: remove newlines from titles, descriptions, subjects 2020-11-24 18:40:25 +01:00
Miriam Baglioni 33c27b6f75 - 2020-11-20 10:07:50 +01:00
Miriam Baglioni d08dca0745 merge branch with master 2020-11-19 19:17:06 +01:00
Miriam Baglioni cb3cb8df04 - 2020-11-18 18:11:27 +01:00
Claudio Atzori ede7fae6c8 Merge pull request 'XML record indexing test' (#58) from provision_indexing into master 2020-11-18 17:04:34 +01:00
Miriam Baglioni c702f8e6a3 added dependency for constraint computing, and added test 2020-11-18 14:16:48 +01:00
Miriam Baglioni f4fee8f43c changed upper bound for whitelist 2020-11-18 14:04:17 +01:00
Miriam Baglioni 96d50080b4 merge branch with master 2020-11-18 12:26:14 +01:00
Miriam Baglioni 0e407b5f23 new tests 2020-11-18 12:18:11 +01:00
Miriam Baglioni 07837e51a9 - 2020-11-18 12:16:00 +01:00
Miriam Baglioni 1f52649d13 changed workflow and added parameters (whitelist) 2020-11-18 12:14:58 +01:00
Miriam Baglioni 683275a5db added logic to filter out other possible right matches 2020-11-18 12:14:13 +01:00
Miriam Baglioni 2aa48a132a added alternative names on the report 2020-11-18 12:13:43 +01:00
Miriam Baglioni 8555082bb1 changed to allow and logic in constraints verification 2020-11-18 12:13:13 +01:00
Miriam Baglioni a4781ddf65 changed to consider and logic within constraints 2020-11-18 12:12:48 +01:00
Miriam Baglioni a708652093 added check for empty(null) values 2020-11-18 11:58:00 +01:00
Miriam Baglioni 457c07a522 added the ignore case for both the constraints 2020-11-18 11:56:40 +01:00
Claudio Atzori 8177ce7939 test for XmlIndexingJob based on a local miniSolrCluster 2020-11-18 10:58:05 +01:00
Alessia Bardi 10e673660f Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-11-18 10:01:23 +01:00
Alessia Bardi be7b310cef rel semantics ignore case 2020-11-18 10:01:20 +01:00
Michele Artini 33da2e3d6c xpaths for dateOfCollection and dateOfTransformation 2020-11-18 09:26:20 +01:00
Alessia Bardi 8f87020a50 #56: map relevantDates from aggregated ODF records 2020-11-17 18:42:09 +01:00
Alessia Bardi 7e0a76a8ac test for TextGrid 2020-11-17 18:39:25 +01:00
Claudio Atzori cfc01f136e PID filtering based on a blacklist 2020-11-17 12:27:06 +01:00
Miriam Baglioni ec5b5c3a23 added set of classes for the verification of constraints on the result. 2020-11-16 18:52:53 +01:00
Miriam Baglioni 19167c9b9d merge branch with master 2020-11-16 14:15:39 +01:00
Miriam Baglioni 0ad1e237e6 added check to avoid break when name/surname is only composed of the word dr 2020-11-16 14:15:03 +01:00
Miriam Baglioni c29d142087 - 2020-11-16 10:53:12 +01:00
Claudio Atzori 6ab1ce53c9 fixed condition in result pid cleaning; cleanup 2020-11-16 10:09:17 +01:00
Claudio Atzori 4de8c8b237 fixed workflow variable name 2020-11-16 10:03:11 +01:00
Claudio Atzori 331d621800 added test resource 2020-11-14 12:16:15 +01:00
Claudio Atzori 5d4e34e26a fixed typo in variable name 2020-11-14 10:32:26 +01:00
Claudio Atzori 768bc5304c Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-11-13 15:40:34 +01:00
Claudio Atzori 93f7b7974f Merge pull request 'trust truncated to 3 decimals' (#24) from trunc_trust into master (LGTM) 2020-11-13 15:40:02 +01:00
Claudio Atzori 528231a287 grouping graph entities by id turned out to be an easy extension for the already existing cleaning workflow 2020-11-13 15:37:48 +01:00
Claudio Atzori 2bed29eb09 WIP: added oozie workflow for grouping graph entities by id 2020-11-13 10:05:12 +01:00
Claudio Atzori 13e36a4da0 WIP: added oozie workflow for grouping graph entities by id 2020-11-13 10:05:02 +01:00
Miriam Baglioni 0f1a4f6637 added collectedfrom information on record 2020-11-09 16:07:17 +01:00
Miriam Baglioni 0ef5e7dc34 fixed issue for authors with no name 2020-11-09 16:06:52 +01:00
Michele Artini 40160d171f organizations pids 2020-11-09 12:58:36 +01:00
Miriam Baglioni 902b0db85a try to make workflow and sub-workflow for making report and actual orcid cleaning 2020-11-06 17:19:28 +01:00
Sandro La Bruzzo 027ef2326c Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-11-06 17:12:42 +01:00
Sandro La Bruzzo cd27df91a1 fixed bug on missing relation in ANDS 2020-11-06 17:12:31 +01:00
Miriam Baglioni c56a43c90b - 2020-11-06 15:46:31 +01:00
Miriam Baglioni 863ce76820 merge branch with master 2020-11-06 15:30:19 +01:00