Commit Graph

43 Commits

Author SHA1 Message Date
Claudio Atzori cd631bb5bc defaults fixed in the cleaning workflow; forces result.publisher to NULL when result.publisher.value is empty 2020-07-30 17:03:53 +02:00
Claudio Atzori 4ff8007518 added function to set the missing vocabulary names, used in the cleaning workflow as a pre-cleaning step 2020-07-30 16:24:39 +02:00
Michele Artini 35e6e9c064 tests 2020-07-28 12:02:15 +02:00
Claudio Atzori 124e7ce19c in case of missing attribute //dr:CobjCategory/@type the resulttype is derived by looking up the vocabulary dnet:result_typologies with the 1st instance type available 2020-07-20 17:33:37 +02:00
Claudio Atzori 050dda223d Merge pull request 'removed duplicated fields' (#25) from unique_field_in_lists into master
Looks good as a temporary workaround. I agree the model could seamlessly make the distinct operation by using HashSets instead of Linked (or Array) Lists.

The task to update the model in such a way is added on #9#issuecomment-1583

Thanks!
2020-07-20 12:12:50 +02:00
Michele Artini 442f30930c removed duplicated fields 2020-07-17 12:25:36 +02:00
Sandro La Bruzzo c01efed79b Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-07-10 14:44:57 +02:00
Sandro La Bruzzo a7d3977481 added generation of EBI Dataset 2020-07-10 14:44:50 +02:00
Claudio Atzori 67e1d222b6 bulk cleaning when found null or empty, sets bestaccessrights evaluating the result instances 2020-07-08 17:53:35 +02:00
Michele Artini 38bb45d0b6 test osf:refereed 2020-06-23 10:14:39 +02:00
Claudio Atzori d0ac7514b2 cleaning workflow to include cleaning of default values 2020-06-18 19:37:25 +02:00
Claudio Atzori 1bc1d15eaf stubbing for mock datasource.identities must be typed as array 2020-06-16 16:54:28 +02:00
Claudio Atzori 2a4f65795f WIP: graph cleaner implementation 2020-06-15 18:32:24 +02:00
Claudio Atzori 97b1c4057c WIP: graph cleaner implementation 2020-06-12 10:45:18 +02:00
Claudio Atzori a2fdf85ba1 WIP: graph cleaner implementation 2020-06-09 19:52:53 +02:00
Claudio Atzori d9f33582c5 WIP: graph cleaner implementation 2020-06-09 17:20:40 +02:00
Michele Artini adb798faa5 import from db using is vocabularies 2020-05-29 12:03:51 +02:00
Claudio Atzori cfc8948717 fixed mapping OdfToGraph: pick the correct element to map author pids and author affiliations; extended mapping Oaf2Graph: added support for author pids 2020-05-15 12:26:16 +02:00
Miriam Baglioni c093d764a3 - 2020-04-27 11:12:38 +02:00
Claudio Atzori d772d967aa restored changes from master branch 2020-04-20 18:53:06 +02:00
miconis 4da13e4570 Revert "Merge branch 'master' into deduptesting"
This reverts commit 772f75d167, reversing
changes made to 5f45f2c77f.
2020-04-20 16:04:49 +02:00
Claudio Atzori d714bfb4d4 collectedfrom field moved in common parent class Oaf.java 2020-04-20 12:25:19 +02:00
Michele Artini 478a958f09 tests 2020-04-20 09:15:27 +02:00
Claudio Atzori 6b5f9ca9cb raw graph creation workflow moved under dhp-graph-mapper, claims integration is included 2020-04-10 17:53:07 +02:00
Sandro La Bruzzo 8c9a56a0c8 refactored package name 2020-03-27 13:19:33 +01:00
Sandro La Bruzzo a9935f80d4 refactor class name and workflow name for graph mapper, added javadoc 2020-03-27 13:16:24 +01:00
Claudio Atzori 673e744649 moved openaire specific implementations under dedicated package eu.dnetlib.dhp.oa 2020-03-27 10:42:17 +01:00
Claudio Atzori 77c4294924 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-03-26 18:26:52 +01:00
Claudio Atzori 43cbcda7ef unit test for SparkGraphImporterJob 2020-03-26 18:26:40 +01:00
Sandro La Bruzzo 0cd022ad6a merge with master 2020-03-26 14:08:29 +01:00
Claudio Atzori abcd3f5bf5 added sample data for unit tests 2020-03-26 11:12:52 +01:00
Claudio Atzori 9dff4adbc3 dhp-graph-mapper workflow tests upgraded to junit5 2020-03-25 18:25:12 +01:00
Sandro La Bruzzo addaaa091f migrate relation from RDD to Dataset 2020-03-13 09:13:20 +01:00
Sandro La Bruzzo 2b8675462f refactoring code 2020-02-19 10:07:08 +01:00
Claudio Atzori 32ed4ae8d6 conversion utilities from protobuffer model to DHP model moved in dnet-mapreduce-jobs. Also removed the related protobuf dependencies 2019-11-04 12:28:56 +01:00
miconis 9fa5aebe9c minor changes 2019-10-25 12:52:28 +02:00
miconis 551eda1600 dataset, orp and software mapping implemented. addition of test resources for results. implementation of tests to check the result of the mapping 2019-10-25 12:48:25 +02:00
Sandro La Bruzzo eef14fade3 fixed conflict 2019-10-25 11:58:20 +02:00
Sandro La Bruzzo 0ea7e861ab added organizations test 2019-10-25 11:56:28 +02:00
miconis 351d850ad3 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2019-10-24 17:29:07 +02:00
miconis b66a7e3030 publication test added 2019-10-24 17:29:01 +02:00
Sandro La Bruzzo 6c32d418ac added conversion of ExtraInfo 2019-10-24 17:26:55 +02:00
Sandro La Bruzzo d2965636e0 created test for converting json into new OAF data model 2019-10-24 17:02:35 +02:00