Commit Graph

536 Commits

Author SHA1 Message Date
Claudio Atzori fb22f4d70b included values for projects fundedamount and totalcost fields in the mapping tests. Swapped expected and actual values in junit test assertions 2020-09-24 12:10:59 +02:00
Claudio Atzori 9e3e93c6b6 setting the correct issn type in the datasource.journal element 2020-09-24 10:39:16 +02:00
Miriam Baglioni c2b5c780ff - 2020-09-14 14:34:03 +02:00
Miriam Baglioni 1f893e63dc - 2020-09-14 14:33:10 +02:00
Claudio Atzori 8a523474b7 code formatting 2020-09-07 11:40:16 +02:00
Miriam Baglioni b72a7dad46 resource for pid graph dump 2020-08-24 17:09:01 +02:00
Miriam Baglioni da103c399a resources for the pid graph dump test 2020-08-24 16:52:07 +02:00
Miriam Baglioni 630a6a1fe7 first tests for the pid graph dump 2020-08-24 16:51:26 +02:00
Miriam Baglioni 2c783793ba removed the affiliation from the author to mirror the changes in the model 2020-08-19 11:48:12 +02:00
Miriam Baglioni f6bf888016 removed affiliation from author to mirror the changes in the model 2020-08-19 11:41:41 +02:00
Miriam Baglioni 66d0e0d3f2 - 2020-08-19 11:31:50 +02:00
Miriam Baglioni d407852ac2 changed to reflect the changes in the model 2020-08-19 11:15:05 +02:00
Miriam Baglioni 47c21a8961 refactoring due to compilation 2020-08-19 11:11:57 +02:00
Miriam Baglioni 96600ed04a modified test resource for mirroring the deletion of affiliation from author parameters 2020-08-14 20:41:49 +02:00
Miriam Baglioni d2a8a4961a refactoring 2020-08-13 18:50:33 +02:00
Miriam Baglioni fd48ae3b85 changed because of D-Net/dnet-hadoop#40 (comment) 2020-08-13 12:19:15 +02:00
Miriam Baglioni 04a3e1ab38 disabled tests 2020-08-13 12:18:13 +02:00
Miriam Baglioni 2ede397933 Apply change because of D-Net/dnet-hadoop#40 (comment) 2020-08-13 12:16:39 +02:00
Miriam Baglioni adf9f96a67 test for extraction of relation between organizations and context 2020-08-12 10:04:47 +02:00
Miriam Baglioni 25f4fbceea draft of test and resources 2020-08-11 17:37:22 +02:00
Miriam Baglioni 30a2b19b65 changed metadata for deposition of covid-19 dump in Zenodo 2020-08-11 17:36:56 +02:00
Miriam Baglioni 49788b532a changed to mirror changes in the schema 2020-08-11 16:05:03 +02:00
Miriam Baglioni b08511287b - 2020-08-11 16:01:36 +02:00
Miriam Baglioni 7e81a17068 changed the XQUERY to mirror the change in the code 2020-08-11 16:00:33 +02:00
Miriam Baglioni 37ad2f28e9 removed added | in prefix for datasource 2020-08-11 15:55:06 +02:00
Miriam Baglioni f31c2e9461 enabled test 2020-08-11 15:49:25 +02:00
Miriam Baglioni 2d67476417 merge branch with master 2020-08-11 15:46:04 +02:00
Miriam Baglioni 6d3804e24c - 2020-08-11 15:45:12 +02:00
Miriam Baglioni 0603ec4757 changed test to upload the dump for covid-19 community 2020-08-11 15:43:25 +02:00
Miriam Baglioni 7dfd56df9d - 2020-08-11 15:42:35 +02:00
Miriam Baglioni a169d7e7c1 added test file for the MakeTar class 2020-08-11 15:40:41 +02:00
Miriam Baglioni c378c38546 disabled test. The testing functionalities for the upload in Zenodo are moved to common 2020-08-10 12:41:11 +02:00
Miriam Baglioni 63ad0ed209 changed to use communityMapPath instead of IsLookUp 2020-08-10 12:40:19 +02:00
Miriam Baglioni cec795f2ea changed resources to mirror changes in the model 2020-08-10 12:39:35 +02:00
Miriam Baglioni f50e3e7333 changed the class for which to generate the schema 2020-08-10 12:03:49 +02:00
Miriam Baglioni b8c26f656c test using communityMapPath instead of isLookUp 2020-08-10 12:02:55 +02:00
Sandro La Bruzzo 0ade33ad15 updated mergeFrom function for DLI Unknown 2020-08-10 10:18:35 +02:00
Miriam Baglioni 545ea9f77e moved in common. Zenodo response model and APIClient to deposit in Zenodo 2020-08-07 16:44:51 +02:00
Miriam Baglioni adf0ca5aa7 test to send is from hdfs 2020-08-05 14:24:43 +02:00
Alessia Bardi a29565ff57 code formatting 2020-08-04 12:55:27 +02:00
Alessia Bardi 09a323d18d testing a dataset from Nakala 2020-08-04 12:50:52 +02:00
Alessia Bardi c35bf486cc added handle among the possible PIDs 2020-08-04 12:50:12 +02:00
Miriam Baglioni 5b651abf82 merge branch with master 2020-08-04 10:14:07 +02:00
Miriam Baglioni fa38cdb10b added resource 2020-08-03 18:11:12 +02:00
Miriam Baglioni e9fcc0b2f1 commented test unit - to decide change for mirroring the changed logics 2020-08-03 18:10:53 +02:00
Miriam Baglioni c892c7dfa7 changed to query for community map just once and save the result for remaining executions 2020-08-03 17:56:31 +02:00
Alessia Bardi 8cc067fe76 specific test for claims 2020-08-03 11:17:50 +02:00
Michele Artini 652b13abb6 Merge branch 'master' into nsprefix_blacklist 2020-07-31 07:58:37 +02:00
Claudio Atzori cd631bb5bc defaults fixed in the cleaning workflow forces result.publisher to NULL when result.publisher.value is empty 2020-07-30 17:03:53 +02:00
Miriam Baglioni 872d7783fc - 2020-07-30 16:50:36 +02:00
Claudio Atzori 4ff8007518 added function to set the missing vocabulary names, used in the cleaning workflow as a pre-cleaning step 2020-07-30 16:24:39 +02:00
Michele Artini bdece15ca0 blacklist of nsprefix 2020-07-30 16:13:38 +02:00
Miriam Baglioni ee8420c6b3 added resource for datasource test 2020-07-29 18:28:43 +02:00
Miriam Baglioni ef1d8aef17 added one test to verify the dump for the datasources 2020-07-29 18:27:46 +02:00
Miriam Baglioni 1433db825d refactoring 2020-07-29 17:43:24 +02:00
Miriam Baglioni 8ad8dac7d4 merge branch with fork master 2020-07-29 17:38:28 +02:00
Miriam Baglioni 536e7f6352 added and changed resources for testing of the whole graph dump and of community related products dumps 2020-07-29 17:33:34 +02:00
Miriam Baglioni 4d7f590493 testings for the whole graph dump 2020-07-29 17:32:37 +02:00
Miriam Baglioni a2f73ec2c7 changed due to changes in the model 2020-07-29 17:32:02 +02:00
Miriam Baglioni 481585e9d3 - 2020-07-29 17:31:41 +02:00
Miriam Baglioni de2ebb467e changed due to changes in the model 2020-07-29 17:08:02 +02:00
Miriam Baglioni d0ff2a56fb - 2020-07-29 17:06:53 +02:00
Miriam Baglioni b96dedb56b changed due to changes in the model 2020-07-29 17:05:31 +02:00
Michele Artini 35e6e9c064 tests 2020-07-28 12:02:15 +02:00
Miriam Baglioni 332258d199 split the classes related to the communities dump and to the whole graph dump 2020-07-24 17:21:48 +02:00
Sandro La Bruzzo 9ab594ccf6 fixed test 2020-07-21 10:36:21 +02:00
Miriam Baglioni 40bbe94f7c merge with master fork 2020-07-20 18:10:03 +02:00
Miriam Baglioni 3aab7680f6 changed the test results 2020-07-20 18:00:43 +02:00
Miriam Baglioni 5076e4f320 changed test to comply with the modifications 2020-07-20 17:55:18 +02:00
Claudio Atzori 54ac583923 code formatting 2020-07-20 17:37:08 +02:00
Claudio Atzori 124e7ce19c in case of missing attribute //dr:CobjCategory/@type the resulttype is derived by looking up the vocabulary dnet:result_typologies with the 1st instance type available 2020-07-20 17:33:37 +02:00
Claudio Atzori 050dda223d Merge pull request 'removed duplicated fields' (#25) from unique_field_in_lists into master
Looks good as a temporary workaround. I agree the model could seamlessly make the distinct operation by using HashSets instead of Linked (or Array) Lists.

The task to update the model in such a way is added on #9#issuecomment-1583

Thanks!
2020-07-20 12:12:50 +02:00
Michele Artini 331a3cbdd0 fixed originalId 2020-07-20 09:50:29 +02:00
Michele Artini 442f30930c removed duplicated fields 2020-07-17 12:25:36 +02:00
Michele Artini 3adedd0a68 trust truncated to 3 decimals 2020-07-17 11:58:11 +02:00
Sandro La Bruzzo c01efed79b Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-07-10 14:44:57 +02:00
Sandro La Bruzzo a7d3977481 added generation of EBI Dataset 2020-07-10 14:44:50 +02:00
Claudio Atzori 67e1d222b6 bulk cleaning when found null or empty, sets bestaccessrights evaluating the result instances 2020-07-08 17:53:35 +02:00
Alessia Bardi 9a898c0e4c Json schema generator 2020-07-08 12:52:00 +02:00
Miriam Baglioni 7fe00cb4fb - 2020-07-08 10:29:37 +02:00
Miriam Baglioni 375ef07d7b changed the description for the upload 2020-07-07 18:41:27 +02:00
Miriam Baglioni 817cddfc52 - 2020-07-07 18:25:12 +02:00
Miriam Baglioni a66aa9bd83 removed unuseful tests 2020-07-07 18:25:00 +02:00
Miriam Baglioni 9b20a21b24 removed unuseful tests 2020-07-07 18:23:37 +02:00
Miriam Baglioni 0208bc18f3 added new resource for testing 2020-07-07 17:47:24 +02:00
Miriam Baglioni f5bb65c9ef the json schema for the dump of the results 2020-07-07 17:34:40 +02:00
Miriam Baglioni f8bf4acd76 - 2020-07-02 16:03:11 +02:00
Miriam Baglioni e6c79d44e6 - 2020-07-02 16:02:02 +02:00
Miriam Baglioni 94500a581b merge branch with fork master 2020-07-02 14:25:39 +02:00
Sandro La Bruzzo 1d420eedb4 added generation of EBI Dataset 2020-07-02 12:37:43 +02:00
Miriam Baglioni 3e5570de7a - 2020-06-23 15:44:54 +02:00
Michele Artini 38bb45d0b6 test osf:refereed 2020-06-23 10:14:39 +02:00
Miriam Baglioni e4b21be004 - 2020-06-22 17:31:50 +02:00
Miriam Baglioni df80ae5c1b merge branch with fork master 2020-06-22 10:51:23 +02:00
Miriam Baglioni e8f914f8b3 - 2020-06-22 10:50:41 +02:00
Claudio Atzori d0ac7514b2 cleaning workflow to include cleaning of default values 2020-06-18 19:37:25 +02:00
Miriam Baglioni fb80353018 - 2020-06-18 14:21:36 +02:00
Miriam Baglioni 65bf312360 merge branch with fork master 2020-06-18 11:35:27 +02:00
Miriam Baglioni a118b66858 - 2020-06-18 11:34:30 +02:00
Miriam Baglioni 8b145e6aba - 2020-06-18 11:25:28 +02:00
Miriam Baglioni 5c8533d1a1 changed in the testing classes 2020-06-18 11:20:08 +02:00
Miriam Baglioni bc8611a95a added new resources for testing 2020-06-18 11:19:20 +02:00
Claudio Atzori 1bc1d15eaf stubbing for mock datasource.identities must be typed as array 2020-06-16 16:54:28 +02:00
Claudio Atzori 89859111ee Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-06-16 15:28:29 +02:00
Michele Artini 8a4f84f8c0 refactoring 2020-06-16 12:34:13 +02:00
Claudio Atzori 2a4f65795f WIP: graph cleaner implementation 2020-06-15 18:32:24 +02:00
Miriam Baglioni 9dd3ef22c5 merge branch with fork master 2020-06-15 11:23:26 +02:00
Miriam Baglioni 68cf0fd03f test input 2020-06-15 11:14:42 +02:00
Miriam Baglioni 0467145ae3 test for graph dump 2020-06-15 11:13:51 +02:00
Miriam Baglioni 20b9e67728 added new class funder 2020-06-15 11:06:18 +02:00
Claudio Atzori 0d52816244 WIP: graph cleaner implementation 2020-06-13 13:06:04 +02:00
Claudio Atzori 463489f59f code formatting 2020-06-12 12:03:25 +02:00
Claudio Atzori 97b1c4057c WIP: graph cleaner implementation 2020-06-12 10:45:18 +02:00
Miriam Baglioni e145972962 - 2020-06-11 13:08:39 +02:00
Miriam Baglioni 356dd582a3 map construction moved in class 2020-06-11 12:59:22 +02:00
Miriam Baglioni db27663750 - 2020-06-11 10:49:01 +02:00
Miriam Baglioni bb9f21d0e7 job test for class producing first step of results dump 2020-06-11 10:20:05 +02:00
Claudio Atzori 953da4a427 Merge branch 'master' into graph_cleaning 2020-06-10 21:36:56 +02:00
Michele Artini 7177a32d75 import of invisible stores 2020-06-10 10:04:00 +02:00
Claudio Atzori a2fdf85ba1 WIP: graph cleaner implementation 2020-06-09 19:52:53 +02:00
Claudio Atzori d9f33582c5 WIP: graph cleaner implementation 2020-06-09 17:20:40 +02:00
Michele Artini adb798faa5 import from db using is vocabularies 2020-05-29 12:03:51 +02:00
Michele Artini 3ceb2d2853 match terms with vocabularies 2020-05-27 11:34:13 +02:00
Claudio Atzori de108f54d6 code formatting 2020-05-23 10:21:19 +02:00
Claudio Atzori 6b56cae57d added mapping for bestaccessrights 2020-05-23 09:57:39 +02:00
Michele Artini dc4621b3cb filter ORCID and MAG identifiers 2020-05-22 12:25:01 +02:00
Michele Artini 9f2d0f1b08 filter ORCID and MAG identifiers 2020-05-22 11:00:27 +02:00
Claudio Atzori 7a89507ab1 code formatting 2020-05-15 15:16:54 +02:00
Claudio Atzori cfc8948717 fixed mapping OdfToGraph: pick the correct element to map author pids and author affiliations; extended mapping Oaf2Graph: added support for author pids 2020-05-15 12:26:16 +02:00
Miriam Baglioni 638a3c465b - 2020-04-30 11:05:17 +02:00
Miriam Baglioni f7695e833c resolved conflicts 2020-04-29 11:41:31 +02:00
Claudio Atzori 6f5b899038 reformatted code according to the updated style descriptor 2020-04-28 11:23:29 +02:00
Claudio Atzori a0bdbacdae switched automatic code formatting plugin to net.revelc.code.formatter:formatter-maven-plugin 2020-04-27 14:52:31 +02:00
Claudio Atzori 7a3f8085f7 switched automatic code formatting plugin to net.revelc.code.formatter:formatter-maven-plugin 2020-04-27 14:45:40 +02:00
Miriam Baglioni c093d764a3 - 2020-04-27 11:12:38 +02:00
Claudio Atzori 48157e0fc4 GraphHiveImporterJob moved in dedicated package 2020-04-24 14:32:28 +02:00
Michele Artini 072eae3803 fixed a problem with missing contexts 2020-04-23 16:42:49 +02:00
Michele Artini d920ce501e fixed a problem with missing instances 2020-04-23 16:18:40 +02:00
Claudio Atzori d772d967aa restored changes from master branch 2020-04-20 18:53:06 +02:00
miconis 4da13e4570 Revert "Merge branch 'master' into deduptesting"
This reverts commit 772f75d167, reversing
changes made to 5f45f2c77f.
2020-04-20 16:04:49 +02:00
Claudio Atzori d714bfb4d4 collectedfrom field moved in common parent class Oaf.java 2020-04-20 12:25:19 +02:00
Michele Artini 8ff7facfa3 fixed collectedFrom ID 2020-04-20 11:09:27 +02:00
Michele Artini d2058fdc47 tests 2020-04-20 09:31:14 +02:00
Michele Artini 478a958f09 tests 2020-04-20 09:15:27 +02:00
Claudio Atzori ad7a131b18 introduced common project code formatting plugin, works on the commit hook, based on https://github.com/Cosium/git-code-format-maven-plugin, applied to each java class in the project 2020-04-18 12:42:58 +02:00
Claudio Atzori 6b5f9ca9cb raw graph creation workflow moved under dhp-graph-mapper, claims integration is included 2020-04-10 17:53:07 +02:00
Claudio Atzori 47f3d9b757 unit test for GraphHiveImporterJob 2020-04-08 13:24:43 +02:00
Sandro La Bruzzo a4b6a51168 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-03-27 13:48:56 +01:00
Sandro La Bruzzo 15d9106b3f Fixed merge of dhp dedup 2020-03-27 13:48:44 +01:00
Claudio Atzori e196fff212 adjusted path for source resource in unit test 2020-03-27 13:45:10 +01:00
Sandro La Bruzzo 8c9a56a0c8 refactored package name 2020-03-27 13:19:33 +01:00
Sandro La Bruzzo a9935f80d4 refactor class name and workflow name for graph mapper, added javadoc 2020-03-27 13:16:24 +01:00
Claudio Atzori 673e744649 moved openaire specific implementations under dedicated package eu.dnetlib.dhp.oa 2020-03-27 10:42:17 +01:00
Claudio Atzori 098fabab3f reorganizing content under dhp-workflows/dhp-graph-mapper 2020-03-26 19:44:19 +01:00
Claudio Atzori 77c4294924 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-03-26 18:26:52 +01:00
Claudio Atzori 43cbcda7ef unit test for SparkGraphImporterJob 2020-03-26 18:26:40 +01:00
Sandro La Bruzzo 0cd022ad6a merge with master 2020-03-26 14:08:29 +01:00
Claudio Atzori abcd3f5bf5 added sample data for unit tests 2020-03-26 11:12:52 +01:00
Claudio Atzori 9dff4adbc3 dhp-graph-mapper workflow tests upgraded to junit5 2020-03-25 18:25:12 +01:00
Michele Artini ebe45003d9 fixed some junit packages 2020-03-25 16:45:03 +01:00
Sandro La Bruzzo addaaa091f migrate relation from RDD to Dataset 2020-03-13 09:13:20 +01:00
Sandro La Bruzzo 2b8675462f refactoring code 2020-02-19 10:07:08 +01:00
Sandro La Bruzzo 19a80e4638 implemented workflow for aggregation and generation of infospace graph 2020-01-24 09:58:55 +01:00
Sandro La Bruzzo aad0cb40b7 Added schema Scholexplorer 2019-11-14 10:34:09 +01:00
Claudio Atzori 1e7a2ac41d align parameter names, graph import procedure WIP 2019-11-04 17:41:01 +01:00
Claudio Atzori 439ad80d81 conversion utilities from protobuffer model to DHP model moved in dnet-mapreduce-jobs. Removed also the relative protobuf dependencies 2019-11-04 12:33:23 +01:00
Claudio Atzori 32ed4ae8d6 conversion utilities from protobuffer model to DHP model moved in dnet-mapreduce-jobs. Removed also the relative protobuf dependencies 2019-11-04 12:28:56 +01:00
Sandro La Bruzzo 18ec8e8147 moved protoutils function to dhp-schemas 2019-10-31 11:31:37 +01:00
Sandro La Bruzzo 997e57d45b Added entity filter to spark class 2019-10-30 12:19:03 +01:00
Sandro La Bruzzo fe62ccd6dd implemented oozie wf 2019-10-28 12:12:50 +01:00
Sandro La Bruzzo 9ee4e5a196 remove a bit of syntactic sugar on the object inheritance :( 2019-10-25 18:10:30 +02:00
Sandro La Bruzzo c74335ebc7 resolved conflict 2019-10-25 14:34:50 +02:00
Sandro La Bruzzo 8c902c500a minor fix 2019-10-25 14:33:54 +02:00
miconis 9fa5aebe9c minor changes 2019-10-25 12:52:28 +02:00
miconis 551eda1600 dataset, orp and software mapping implemented. addition of test resources for results. implementation of tests to check the result of the mapping 2019-10-25 12:48:25 +02:00
Sandro La Bruzzo eef14fade3 fixed conflict 2019-10-25 11:58:20 +02:00
Sandro La Bruzzo 0ea7e861ab added organizations test 2019-10-25 11:56:28 +02:00
miconis 4908165e05 implementation of the createPublication method to map publications 2019-10-25 11:54:14 +02:00
miconis df37bd6aaf placeholders for setters in createpublication 2019-10-25 10:57:19 +02:00
miconis b525b54130 starting implementing the createPublication class 2019-10-25 09:55:31 +02:00
Sandro La Bruzzo 09ffda03a2 removed circular dependencies 2019-10-25 09:24:18 +02:00
miconis 351d850ad3 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2019-10-24 17:29:07 +02:00
miconis b66a7e3030 publication test added 2019-10-24 17:29:01 +02:00
Sandro La Bruzzo 6c32d418ac added conversion of ExtraInfo 2019-10-24 17:26:55 +02:00
Sandro La Bruzzo d2965636e0 created test for convert json into new OAF data model 2019-10-24 17:02:35 +02:00
Sandro La Bruzzo 5744a64478 added module dhp-graph-mapper 2019-10-24 16:00:28 +02:00