Commit Graph

289 Commits

Author SHA1 Message Date
Miriam Baglioni e47ea9349c extended some types by adding provenance as the couple (provenance, trust) and moved some classes to be used by the complete graph dump also 2020-07-20 17:46:27 +02:00
Claudio Atzori 32f5e466e3 imports cleanup 2020-07-20 17:42:58 +02:00
Sandro La Bruzzo a7d3977481 added generation of EBI Dataset 2020-07-10 14:44:50 +02:00
Miriam Baglioni 250fd1c854 merge branch with fork master 2020-06-22 16:25:48 +02:00
Claudio Atzori 9cd27183b6 [maven-release-plugin] prepare for next development iteration 2020-06-22 11:27:44 +02:00
Claudio Atzori 1e3dab0631 [maven-release-plugin] prepare release dhp-1.2.3 2020-06-22 11:27:39 +02:00
Miriam Baglioni df80ae5c1b merge branch with fork master 2020-06-22 10:51:23 +02:00
Claudio Atzori 7d416f08d8 graph cleaning workflow: set hostedby to unknown repository when defined as NULL 2020-06-22 09:50:43 +02:00
Miriam Baglioni 65bf312360 merge branch with fork master 2020-06-18 11:35:27 +02:00
Miriam Baglioni 8211cbb9fe extension of Result to contain all the properties owned by any result type 2020-06-18 11:23:52 +02:00
Miriam Baglioni bc8611a95a added new resources for testing 2020-06-18 11:19:20 +02:00
Sandro La Bruzzo 9bf67f5de1 resolved conflicts 2020-06-17 09:15:43 +02:00
Sandro La Bruzzo 1d4275acc4 implemented first version of exportation of Scholexplorer into ActionSet 2020-06-17 09:10:38 +02:00
miconis 5233b15265 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-06-16 18:31:19 +02:00
miconis 11b77b9f4e json dumps for entity merge test modified to fit the new model. title merge adjusted to fix the error 2020-06-16 18:31:11 +02:00
Claudio Atzori 306669209f code formatting 2020-06-16 16:54:44 +02:00
Claudio Atzori 603b1bd0bb Merge branch 'master' into dhp_oaf_model 2020-06-16 15:43:59 +02:00
Miriam Baglioni 9dd3ef22c5 merge branch with fork master 2020-06-15 11:23:26 +02:00
Miriam Baglioni 56e70573c2 - 2020-06-15 11:06:56 +02:00
Miriam Baglioni 20b9e67728 added new class funder 2020-06-15 11:06:18 +02:00
Claudio Atzori c4d9f1837f [maven-release-plugin] prepare for next development iteration 2020-06-12 12:21:08 +02:00
Claudio Atzori f0746a7605 [maven-release-plugin] prepare release dhp-1.2.2 2020-06-12 12:21:03 +02:00
Claudio Atzori 463489f59f code formatting 2020-06-12 12:03:25 +02:00
Claudio Atzori 4bcad1c9c3 Merge branch 'graph_cleaning' 2020-06-12 11:40:25 +02:00
Alessia Bardi ed8879ed8b deprecate PUBLICATION_DATASET 2020-06-12 10:55:56 +02:00
Alessia Bardi 3ade2631b3 Constants for new rels: citations and reviews 2020-06-12 10:52:12 +02:00
Claudio Atzori ba8a024af9 avoid NPEs merging titles 2020-06-12 10:45:11 +02:00
Claudio Atzori a2fdf85ba1 WIP: graph cleaner implementation 2020-06-09 19:52:53 +02:00
Miriam Baglioni 206abba48c merge branch with fork master 2020-06-09 15:41:14 +02:00
Miriam Baglioni 5121cbaf6a new classes for external dump. Only classes functional to dump products 2020-06-09 15:37:46 +02:00
Miriam Baglioni f232db84e9 new classes for external dump. Only classes functional to dump products 2020-06-08 15:11:37 +02:00
Claudio Atzori 25a093b1a4 integrated changes from master 2020-06-08 15:04:00 +02:00
Claudio Atzori 45973b5743 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-06-08 15:01:34 +02:00
Claudio Atzori 94533b71bc added comments for model fields removal 2020-06-08 15:01:21 +02:00
Claudio Atzori b2f9564f13 WIP: fixed PrepareRelationsJob; parallel implementation of CreateRelatedEntitiesJob_phase2, now works by OafType; introduced custom aggregator in AdjacencyListBuilderJob 2020-05-29 10:58:15 +02:00
Sandro La Bruzzo b87b3ddb6b changed mapping ORCIDToOAF 2020-05-29 09:32:04 +02:00
Sandro La Bruzzo 7d29b61c62 code refactor 2020-05-28 09:57:46 +02:00
Miriam Baglioni dd1e0b93b8 added merge for Programme 2020-05-27 17:40:32 +02:00
Miriam Baglioni f3dcca0dd0 added equals for programme 2020-05-27 17:23:34 +02:00
Miriam Baglioni 92e3a52e91 merge branch with fork master 2020-05-26 15:57:51 +02:00
Claudio Atzori 7b288a94cb code formatting 2020-05-26 09:54:13 +02:00
Claudio Atzori 7582532e73 [maven-release-plugin] prepare for next development iteration 2020-05-25 19:48:18 +02:00
Claudio Atzori 01c2e93395 [maven-release-plugin] prepare release dhp-1.2.1 2020-05-25 19:48:14 +02:00
Claudio Atzori ae04234472 DataInfo.deletedbyinference is false by default 2020-05-25 19:32:48 +02:00
miconis da1e5cf557 implementation of the result title merge. main title with higher trust, distinct between the others 2020-05-25 18:02:57 +02:00
Claudio Atzori 4b34872b44 using Objects.equals to check Field<T> equivalence 2020-05-25 10:14:15 +02:00
Claudio Atzori 0ab0206b4d removed null objects from flattened Field<T> in mergeLists 2020-05-25 10:11:41 +02:00
Claudio Atzori de108f54d6 code formatting 2020-05-23 10:21:19 +02:00
Claudio Atzori 6b56cae57d added mapping for bestaccessrights 2020-05-23 09:57:39 +02:00
Miriam Baglioni 24daa1deaa added to the Project class a new field that is the list of programmes 2020-05-20 10:28:16 +02:00
Miriam Baglioni d323100af0 added the new Programme POJO. It contains the code and the description of the programme 2020-05-20 10:27:27 +02:00
Miriam Baglioni 22cb9e0da7 simple code to get file from URL 2020-05-15 18:18:01 +02:00
Miriam Baglioni 3aaad753fd Merge branch 'master' into dhp_oaf_model 2020-05-15 15:55:23 +02:00
Claudio Atzori b7e198475a added common methods to create HiveDB table identifiers 2020-05-15 10:20:07 +02:00
Miriam Baglioni 42085e8d99 added some constants 2020-05-14 18:22:28 +02:00
Claudio Atzori c6b028f2af code formatting 2020-05-11 17:38:08 +02:00
Claudio Atzori 637653cba3 integrated changes from master 2020-05-11 14:05:25 +02:00
Miriam Baglioni 2abb84877d Merge branch 'master' into blacklist 2020-05-11 10:37:49 +02:00
Miriam Baglioni bb59bdd60f merge upstream 2020-05-11 10:33:17 +02:00
Miriam Baglioni 871e079b45 merged with master 2020-05-11 10:20:00 +02:00
Claudio Atzori 60c40618d3 [maven-release-plugin] prepare for next development iteration 2020-05-11 10:17:14 +02:00
Claudio Atzori c267d958d5 [maven-release-plugin] prepare release dhp-1.2.0 2020-05-11 10:17:10 +02:00
Miriam Baglioni 391b2399cc merge upstream 2020-05-11 10:08:51 +02:00
Claudio Atzori 42f1a2bf94 bumped project version to 1.2.0-SNAPSHOT 2020-05-11 10:05:57 +02:00
Miriam Baglioni 32301451ec merge upstream 2020-05-11 09:42:23 +02:00
Claudio Atzori 0ccc864ad9 [maven-release-plugin] prepare for next development iteration 2020-05-08 17:01:31 +02:00
Claudio Atzori 6e47c724c6 [maven-release-plugin] prepare release dhp-1.1.7 2020-05-08 17:01:27 +02:00
Miriam Baglioni 28556507e7 - 2020-05-08 12:54:52 +02:00
Miriam Baglioni 4c94231cad merge with master fork 2020-05-08 12:25:57 +02:00
Claudio Atzori 62ea19f1d3 introduced mapping for ExternalReferences, made urls defined within an instance unique 2020-05-08 09:43:26 +02:00
Miriam Baglioni 182225becb Merge branch 'master' of https://code-repo.d4science.org/miriam.baglioni/dnet-hadoop 2020-05-07 11:38:17 +02:00
Miriam Baglioni 5efae3acb9 new workflow for job3 2020-05-07 11:38:10 +02:00
Claudio Atzori 128c3bf1c8 restored Author bean with simple getter/setter, author pid addition moved into dedicated implementation SparkOrcidToResultFromSemRelJob3 2020-05-07 11:14:56 +02:00
Claudio Atzori 17860d3ab6 general changes in the RAW graph mapping: missing collectedfrom/hostedby causes records to be skipped; factored out most of the constants in ModelConstants class (dhp-schemas) 2020-05-06 13:20:02 +02:00
Claudio Atzori 405f495d54 code formatting 2020-05-04 19:18:12 +02:00
Claudio Atzori c54d7ca18c example measures in serialization test 2020-05-04 17:02:40 +02:00
Claudio Atzori 11938dac5e this commit adds: validated/validationDate to relationships; measure type and simple unit test to indicate the relative serialization 2020-05-04 16:47:07 +02:00
Claudio Atzori 24d8d097b6 sync with master branch 2020-05-04 16:44:13 +02:00
Claudio Atzori de5fbe325c bits of javadoc 2020-05-04 16:00:48 +02:00
Miriam Baglioni 4b0bd91012 - 2020-04-30 12:45:28 +02:00
Miriam Baglioni 3abb76ff7a merge with upstream 2020-04-30 11:15:54 +02:00
Miriam Baglioni 638a3c465b - 2020-04-30 11:05:17 +02:00
Miriam Baglioni 564e5d6279 added new information in support of blacklist reader 2020-04-30 10:22:58 +02:00
Claudio Atzori 439c6255a2 cleanup 2020-04-29 19:09:07 +02:00
Claudio Atzori 77ac995770 cleaned up poms, added descriptions 2020-04-29 18:44:17 +02:00
Miriam Baglioni 869f576273 added hash map for relationship entityType id prefix, and relation inverse 2020-04-29 18:14:52 +02:00
Miriam Baglioni b85ad7012a reads the blacklist from the blacklist db and writes it as a set of relations on hdfs 2020-04-29 17:29:49 +02:00
Miriam Baglioni f7695e833c resolved conflicts 2020-04-29 11:41:31 +02:00
Claudio Atzori 6f5b899038 reformatted code according to the updated style descriptor 2020-04-28 11:23:29 +02:00
Claudio Atzori a0bdbacdae switched automatic code formatting plugin to net.revelc.code.formatter:formatter-maven-plugin 2020-04-27 14:52:31 +02:00
Claudio Atzori 7a3f8085f7 switched automatic code formatting plugin to net.revelc.code.formatter:formatter-maven-plugin 2020-04-27 14:45:40 +02:00
Miriam Baglioni 5dccbe13db merge with upstream 2020-04-27 10:43:59 +02:00
Claudio Atzori 268462623a refined definition of equals and hash methods for Oaf model classes, now based on entity identifier, while relations consider sourceid, targetid and relationship semantic; Factored out function to group Oaf objects in grouping operations; Raw graph creation procedure merges entities and relationships providing the same identity 2020-04-24 14:42:01 +02:00
Claudio Atzori 5100527400 added default value for resulttype field 2020-04-23 19:14:37 +02:00
Miriam Baglioni 04fc223346 add method addPid 2020-04-23 11:07:44 +02:00
Miriam Baglioni 259525cb93 Merge remote-tracking branch 'upstream/master' 2020-04-21 18:33:46 +02:00
Claudio Atzori d772d967aa restored changes from master branch 2020-04-20 18:53:06 +02:00
miconis 4da13e4570 Revert "Merge branch 'master' into deduptesting". This reverts commit 772f75d167, reversing changes made to 5f45f2c77f. 2020-04-20 16:04:49 +02:00
Claudio Atzori d714bfb4d4 collectedfrom field moved in common parent class Oaf.java 2020-04-20 12:25:19 +02:00
Miriam Baglioni 454b8a6a29 Merge remote-tracking branch 'upstream/master' 2020-04-18 14:09:44 +02:00
Claudio Atzori ad7a131b18 introduced common project code formatting plugin, works on the commit hook, based on https://github.com/Cosium/git-code-format-maven-plugin, applied to each java class in the project 2020-04-18 12:42:58 +02:00
Miriam Baglioni 7d9fd75020 add method addPid 2020-04-17 17:13:48 +02:00
Sandro La Bruzzo 5e2fa996aa fixed problem with conversion of long into string 2020-04-17 12:11:51 +02:00
Sandro La Bruzzo c36239e693 fixed incremental indexing 2020-04-14 17:47:36 +02:00
Claudio Atzori cc67dbff81 typo in text 2020-04-14 17:11:55 +02:00
Claudio Atzori 8b2043c7b1 introducing List<KeyValue> generic container for Relation specific properties. Ref ticket https://issue.openaire.research-infrastructures.eu/issues/5512 2020-04-14 16:43:40 +02:00
Claudio Atzori d74e128aa6 Utility classes moved in dhp-common and dhp-schemas 2020-04-07 11:56:22 +02:00
Claudio Atzori c57cf679ca Merge branch 'provision_dataset' 2020-04-07 08:56:58 +02:00
Claudio Atzori 3d1b637cab dataset based provision WIP 2020-04-04 14:03:43 +02:00
Przemysław Jacewicz 51ff3b4e81 [dhp-schemas] added safeguard against casting exception in mergeFrom methods and null-safe handling of collectedfrom collection for relation 2020-04-01 18:28:23 +02:00
przemek 9d1d18d4b9 Merge branch 'master' into przemyslawjacewicz_actionmanager_impl_prototype 2020-03-31 12:04:58 +02:00
Claudio Atzori 377e1ba840 [maven-release-plugin] prepare for next development iteration 2020-03-30 20:06:00 +02:00
Claudio Atzori 76d9315129 [maven-release-plugin] prepare release dhp-1.1.6 2020-03-30 20:05:56 +02:00
Sandro La Bruzzo 0cd022ad6a merge with master 2020-03-26 14:08:29 +01:00
Claudio Atzori 4753662edd removed unnecessary dependency 2020-03-26 09:03:43 +01:00
Claudio Atzori 3e8f6981c4 dhp-schemas tests upgraded to junit5 2020-03-25 17:38:58 +01:00
Claudio Atzori 23668d4a6a WIP adopting junit5 2020-03-25 16:49:45 +01:00
Claudio Atzori a226198a13 WIP adopting junit5 2020-03-25 16:47:39 +01:00
Michele Artini ebe45003d9 fixed some junit packages 2020-03-25 16:45:03 +01:00
Michele Artini d9bfdcd607 updated poms 2020-03-25 16:31:12 +01:00
przemek 638b78f96a Merge remote-tracking branch 'origin/master' into przemyslawjacewicz_actionmanager_impl_prototype 2020-03-19 15:12:56 +01:00
Claudio Atzori 1850a02ae4 added simpler, AtomicAction replacement, based on the dhp.Oaf model 2020-03-19 10:44:16 +01:00
Claudio Atzori 23a929177d updates to the graph require this to be an actual class 2020-03-13 14:56:35 +01:00
Sandro La Bruzzo addaaa091f migrate relation from RDD to Dataset 2020-03-13 09:13:20 +01:00
Przemysław Jacewicz f7454a9ed8 Added equals and hashCode for OAF types 2020-03-11 16:57:28 +01:00
Michele Artini 4c94e74a84 Added a missing dependency 2020-02-20 11:43:32 +01:00
Claudio Atzori d42dde52ba implemented method to merge relations 2020-02-19 17:29:05 +01:00
Claudio Atzori 5bae30f399 adding readme for dhp-schema 2020-02-17 13:38:33 +01:00
Claudio Atzori 1ee1baa8c0 Merge branch 'master' into provision_indexing 2020-02-13 18:17:07 +01:00
Claudio Atzori a3d0b57b25 [maven-release-plugin] prepare for next development iteration 2020-02-13 18:11:33 +01:00
Claudio Atzori 6ed9a15bc8 [maven-release-plugin] prepare release dhp-1.1.5 2020-02-13 18:11:31 +01:00
Claudio Atzori 49e648f7c3 bumped version 2020-02-13 18:09:31 +01:00
Claudio Atzori 11cfd6bd9a integrated changes from master branch 2020-02-13 17:27:07 +01:00
Claudio Atzori bbf1b611b9 refereed, processingchargeamount and processingchargecurrency moved inside the Instance element. Introduced specific type to model Result's countries 2020-02-13 17:21:11 +01:00
Claudio Atzori d3b96f102b builder pattern screws up the Parquet schema inference method, avoid using it in the bean definitions 2020-02-04 14:10:58 +01:00
Claudio Atzori ed290ca8d7 builder pattern 2020-02-03 10:35:51 +01:00
Claudio Atzori 1ecca69f49 added annotation to ignore method during the serialization 2020-01-30 17:45:28 +01:00
Sandro La Bruzzo 19a80e4638 implemented workflow for aggregation and generation of infospace graph 2020-01-24 09:58:55 +01:00
Claudio Atzori 799929c1e3 joining entities using T x R x S method with groupByKey 2020-01-21 16:35:44 +01:00
Sandro La Bruzzo fa7504bf29 removed DLI stuff should be in a branch 2020-01-20 10:28:00 +01:00
Claudio Atzori 1cd6899480 merged from master 2020-01-17 14:25:57 +01:00
Claudio Atzori 749b0660ab instance URLs must be repeatable 2020-01-17 14:22:15 +01:00
Claudio Atzori 63c0db4ff8 instance URLs must be repeatable 2020-01-16 15:54:53 +02:00
Claudio Atzori 97c239ee0d WIP: trying to find a way to build the records for the index 2020-01-16 12:02:28 +02:00
Sandro La Bruzzo b4392f9f43 implemented DedupRecord factory for missing entities 2019-12-13 09:40:02 +01:00
miconis 545e940007 implementation of the mergeFrom for the Datasources 2019-12-12 15:36:41 +01:00
Sandro La Bruzzo 39367676d7 implemented DedupRecord factory with the merge of project 2019-12-12 15:18:48 +01:00
Sandro La Bruzzo 6b45e37e22 implemented DedupRecord factory with the merge of organizations 2019-12-11 16:57:37 +01:00
Sandro La Bruzzo abd9034da0 implemented DedupRecord factory with the merge of publications 2019-12-11 15:43:24 +01:00
miconis 4b66b471a4 implementation of the sorting by trust mechanism and the merge of oaf entities 2019-12-10 14:57:16 +01:00