Commit Graph

2994 Commits

Author SHA1 Message Date
Claudio Atzori 59a250337c [graph resolution] drop output path at the beginning 2022-01-24 18:02:39 +01:00
Claudio Atzori 8de9788308 applied fix for avoiding ruling out the invisible (APC) records during the graph cleaning 2022-01-24 11:29:22 +01:00
Claudio Atzori f2fde5566b using helper method from ModelSupport to find the inverse relation descriptor 2022-01-20 09:19:07 +01:00
Claudio Atzori 9acc32faa6 [stats wf] final touches for the integration of PRs #166, #179 in the master branch 2022-01-12 12:04:31 +01:00
dimitrispie b053b0178e Sprint 5 and other changes 2022-01-12 11:23:37 +01:00
Antonis Lempesis b6b4bc0df9 added first indicator of sprint 5 2022-01-12 11:20:28 +01:00
Antonis Lempesis e91f06f39b fixed typos in indicators. Added extra views in monitor 2022-01-12 11:18:28 +01:00
Antonis Lempesis 3ce1976627 fixed column names 2022-01-12 11:14:41 +01:00
Antonis Lempesis 4878d7485c added usage stats 2022-01-12 11:13:25 +01:00
Antonis Lempesis a4316bafed fixed a typo 2022-01-12 11:12:53 +01:00
Antonis Lempesis bb17e070d8 added result_result relations 2022-01-12 11:09:38 +01:00
Claudio Atzori a30a98a716 Applying PR#166 in the master branch (Added sprint 3&4 of indicators). Merge commit '0df9574a6f5d9d75bc840decb023561ae941f9d6' 2022-01-12 10:57:19 +01:00
Claudio Atzori 8ae46ca789 OAF-store-graph mdstores: further fix for PR#180 2022-01-05 15:52:15 +01:00
Claudio Atzori 3bd3653be9 OAF-store-graph mdstores: save them in text format 2022-01-04 16:39:39 +01:00
Claudio Atzori 3dc48c7ab5 OAF-store-graph mdstores: save them in text format 2022-01-04 16:39:27 +01:00
Claudio Atzori f82db765db OAF-store-graph mdstores: save them in text format 2022-01-04 16:39:15 +01:00
Claudio Atzori 8d13effa31 test for the tolerant deserialisation utility method 2022-01-04 16:38:26 +01:00
Claudio Atzori 9458ee7938 serialise records in the OAF-store-graph mdstores in json format. Read them again in the graph construction phase using a tolerant parser to support backward compatible changes in the evolution of the schema 2022-01-04 16:38:09 +01:00
Antonis Lempesis f0b523cfa7 removed the too restrictive clause. will discuss again 2021-12-15 12:32:15 +01:00
Claudio Atzori c1b6ae47cd cleaning workflow assigns the proper default instance type when a value could not be cleaned using the vocabularies 2021-12-09 16:47:41 +01:00
Claudio Atzori cd9c51fd7a vocabulary based cleaning also considers the term label when looking up a synonym 2021-12-09 14:49:24 +01:00
Claudio Atzori 0df9574a6f Merge pull request '[stats wf] Added sprint 3&4 of indicators' (#166) from antonis.lempesis/dnet-hadoop:beta into beta. Reviewed-on: #166 2021-11-29 10:40:26 +01:00
Claudio Atzori 1de881b796 resolved conflicts for #165 2021-11-26 16:15:11 +01:00
Claudio Atzori 014e872ae1 [resolution wf] added optional parameter to skip the entity resolution 2021-11-26 15:38:56 +01:00
Claudio Atzori 5c6d328537 code formatting 2021-11-26 15:38:16 +01:00
dimitrispie 09fc2afdca Added indi_funder_country_collab. Kept only indi_pub_has_cc_licence 2021-11-26 16:13:10 +02:00
Antonis Lempesis 0b4163ee0b added sprint3,4, removed 2, chaos 2021-11-26 15:58:01 +02:00
dimitrispie 29f69f2f89 Sprint 4 2021-11-26 15:22:04 +02:00
Miriam Baglioni ac07ed8251 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2021-11-25 12:32:58 +01:00
Miriam Baglioni 5fd0e610bf [DOIBOOST Process] fix filtering to filter results with non null id 2021-11-25 12:10:45 +01:00
Sandro La Bruzzo feea154e89 remove working dir after test 2021-11-25 11:02:38 +01:00
Sandro La Bruzzo 028a8acad8 add test resources 2021-11-25 10:54:47 +01:00
Sandro La Bruzzo 2164a2a889 Datacite: code refactoring, introduced a general SparkApplication in Scala from which all the Spark Scala applications have to inherit. Added some comments to the Datacite transformation code 2021-11-25 10:54:13 +01:00
Miriam Baglioni 3f9b2ba8ce [Hosted By Map] fix issue in test 2021-11-22 16:59:43 +01:00
Sandro La Bruzzo a7cf277d98 Datacite: Removed HostedBy Patch as described on ticket #7219, Now all the records will have hosted by Unknown Repository 2021-11-22 16:03:17 +01:00
Sandro La Bruzzo 483d3039d1 entity resolution: added distcpt of missing entities in graph materialization 2021-11-22 15:55:24 +01:00
Sandro La Bruzzo 93fe8ce8b2 entity resolution: fix test 2021-11-22 15:50:43 +01:00
Sandro La Bruzzo 35e20b0647 updated resolution wf: generate a new version of the graph; changed merge from union to join 2021-11-22 11:48:55 +01:00
Miriam Baglioni fdb75b180e [Cleaning] added couple of tests for DOIBOOST publications 2021-11-21 16:35:22 +01:00
Sandro La Bruzzo 3426451d3f Merge remote-tracking branch 'origin/beta' into beta 2021-11-19 14:49:04 +01:00
Sandro La Bruzzo 4542a2338b updated site configuration to deploy on website 2021-11-19 13:44:08 +01:00
Claudio Atzori e5a2c596b2 Merge branch 'beta' into preserve_openorg_parent_child_relations 2021-11-19 11:35:46 +01:00
Claudio Atzori f4538f3c4c cleanup 2021-11-19 11:33:10 +01:00
Claudio Atzori 2b46b87f56 fixed filtering criteria applied in SparkCopyRelationsNoOpenorgs to keep the parent/child relations from OpenOrgs 2021-11-19 11:30:29 +01:00
Sandro La Bruzzo fc03c99805 fixed javadocs url after deploying site 2021-11-19 10:46:33 +01:00
Sandro La Bruzzo 0c0d561bc4 added public class into tests to create correct javadoc 2021-11-19 09:54:22 +01:00
Claudio Atzori 62fa61f3cf merge from beta 2021-11-19 09:23:42 +01:00
Claudio Atzori bd9a43cefd Revert to 4094f2bb9a 2021-11-19 09:20:43 +01:00
Claudio Atzori a24b9f8268 [dedup] trivial refactoring 2021-11-18 17:12:02 +01:00
Claudio Atzori c0750fb17c avoid unnecessary count operations over large spark datasets 2021-11-18 17:11:31 +01:00