Commit Graph

285 Commits

Author SHA1 Message Date
Claudio Atzori 2c407e775e GenerateEntitiesApplication can be configured to hash the id value or not 2020-11-30 12:00:38 +01:00
Miriam Baglioni 124591a7f3 refactoring 2020-11-25 18:23:28 +01:00
Miriam Baglioni 1a89f8211c #61 (comment) 2020-11-25 18:12:40 +01:00
Miriam Baglioni d4ddde2ef2 changed because of #61 (comment) 2020-11-25 18:01:01 +01:00
Miriam Baglioni 90d4369fd2 added test to verify the compression in writing community info on hdfs 2020-11-25 14:34:58 +01:00
Miriam Baglioni 1f130cdf92 changed the relation (produces -> isProducedBy) due to the change in the code 2020-11-25 14:04:26 +01:00
Miriam Baglioni 305e3d0c9c added resource file for relation with relClass = isProducedBy 2020-11-25 13:43:41 +01:00
Miriam Baglioni bde6d337dd test classes for dump of results related to funders 2020-11-25 13:42:01 +01:00
Miriam Baglioni b37b9352d7 added constant value for semantic relationship between projects and results 2020-11-25 13:41:08 +01:00
Claudio Atzori e1a1bb3ee4 moved class CleaningFunctions in the correct package. Remove newlines from titles, descriptions, subjects 2020-11-24 18:34:03 +01:00
Miriam Baglioni 54a309bb6b refactoring 2020-11-24 14:45:30 +01:00
Miriam Baglioni 35ecea8842 changed to consider the modification for the specification of the type of dump 2020-11-24 14:45:15 +01:00
Claudio Atzori 3f34757c63 merged from master 2020-11-19 14:34:54 +01:00
Claudio Atzori ede7fae6c8 Merge pull request 'XML record indexing test' (#58) from provision_indexing into master 2020-11-18 17:04:34 +01:00
Claudio Atzori 8177ce7939 test for XmlIndexingJob based on a local miniSolrCluster 2020-11-18 10:58:05 +01:00
Michele Artini 33da2e3d6c xpaths for dateOfCollection and dateOfTransformation 2020-11-18 09:26:20 +01:00
Alessia Bardi 7e0a76a8ac test fr TextGrid 2020-11-17 18:39:25 +01:00
Claudio Atzori 331d621800 added test resource 2020-11-14 12:16:15 +01:00
Claudio Atzori 768bc5304c Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-11-13 15:40:34 +01:00
Claudio Atzori 93f7b7974f Merge pull request 'trust truncated to 3 decimals' (#24) from trunc_trust into master
LGTM
2020-11-13 15:40:02 +01:00
Claudio Atzori 2bed29eb09 WIP: added oozie workflow for grouping graph entities by id 2020-11-13 10:05:12 +01:00
Claudio Atzori 13e36a4da0 WIP: added oozie workflow for grouping graph entities by id 2020-11-13 10:05:02 +01:00
Claudio Atzori 9b0fb9e958 merged from master 2020-11-12 09:27:12 +01:00
Sandro La Bruzzo 027ef2326c Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-11-06 17:12:42 +01:00
Sandro La Bruzzo cd27df91a1 fixed bug on missing relation in ANDS 2020-11-06 17:12:31 +01:00
Claudio Atzori d10447e747 re-packaged graph dump workflow sources 2020-11-05 17:38:18 +01:00
Miriam Baglioni 56150d7e5e removed all code related to the dump of pids graph 2020-11-04 17:13:12 +01:00
Miriam Baglioni c010a8442f fixed issue on test code 2020-11-03 17:26:51 +01:00
Miriam Baglioni 8ec7a61188 merge branch with master 2020-11-03 16:59:08 +01:00
Claudio Atzori 86d6fbe95b refactoring: CleaningFunctions and OafMapperUtils moved in dhp-commong 2020-11-03 12:19:46 +01:00
Claudio Atzori 3fcd669e99 result merge operation leverage on custom ResultTypeComparator in the aggregator graph construction 2020-11-03 10:53:23 +01:00
Claudio Atzori 09e44dabff Merge branch 'master' into stable_ids 2020-11-02 12:16:01 +01:00
Sandro La Bruzzo 754c86f33e fixed test to work on jenkins 2020-11-02 09:35:01 +01:00
Claudio Atzori 4ca75d6951 Merge pull request 'Dedup ID creation policy' (#48) from deduptesting into stable_ids 2020-10-30 15:15:32 +01:00
Miriam Baglioni 3241ec1777 added connection timeout and socket timeout 600 sec 2020-10-27 16:12:11 +01:00
Alessia Bardi 1425d810a8 testing mapping 2020-10-19 17:46:14 +02:00
Claudio Atzori 266bf1a221 common IdentifierFactory in use on the mapping from the aggregator data; merge the entities sharing the same id; code formatting 2020-10-16 17:02:10 +02:00
Claudio Atzori 34f1d0904b common IdentifierFactory in use on the mapping from the aggregator data 2020-10-16 16:00:19 +02:00
Sandro La Bruzzo fed711da80 Merge remote-tracking branch 'origin/master' into merge_record_to_common 2020-10-13 15:32:45 +02:00
Alessia Bardi 8775a64bc1 Merge pull request 'Merging different compatibility levels (pinocchio operator)' (#47) from merge_graph into master 2020-10-09 14:44:52 +02:00
Sandro La Bruzzo fe0a7870e6 Added test to check if merge authors works 2020-10-08 10:33:12 +02:00
Sandro La Bruzzo cd9c377d18 adpted scholexplorer Dump generation to the new Dataset definition 2020-10-08 10:10:13 +02:00
Claudio Atzori 8d85a2fced [BETA wf only] datasources involved in the merge operation doesn't obey to the infra precedence policy, but relies on a custom behaviour that, given two datasources from beta and prod returns the one from prod with the highest compatibility among the two 2020-10-07 16:28:52 +02:00
Claudio Atzori c2a6e2a9bf fixed mapping for datasource journal info (ISSNs) 2020-10-02 09:37:08 +02:00
Miriam Baglioni fcaedac980 merge branch with master 2020-10-01 16:46:59 +02:00
Claudio Atzori 2e9e13444d author pids made unique by value 2020-10-01 12:50:40 +02:00
Claudio Atzori e265c3e125 cleaning functions factored out in a dedicated class 2020-10-01 10:50:15 +02:00
Miriam Baglioni de6c4d46d8 fixed conflicts 2020-09-24 15:35:01 +02:00
Claudio Atzori 27df1cea6d code formatting 2020-09-24 12:16:00 +02:00
Claudio Atzori fb22f4d70b included values for projects fundedamount and totalcost fields in the mapping tests. Swapped expected and actual values in junit test assertions 2020-09-24 12:10:59 +02:00
Claudio Atzori 9e3e93c6b6 setting the correct issn type in the datasource.journal element 2020-09-24 10:39:16 +02:00
Miriam Baglioni c2b5c780ff - 2020-09-14 14:34:03 +02:00
Miriam Baglioni 1f893e63dc - 2020-09-14 14:33:10 +02:00
Claudio Atzori 8a523474b7 code formatting 2020-09-07 11:40:16 +02:00
Miriam Baglioni b72a7dad46 resuorce for pid graph dump 2020-08-24 17:09:01 +02:00
Miriam Baglioni da103c399a resources for the pid graph dump test 2020-08-24 16:52:07 +02:00
Miriam Baglioni 630a6a1fe7 first tests for the pid graph dump 2020-08-24 16:51:26 +02:00
Miriam Baglioni 2c783793ba removed the affiliation from the author to mirror the changes in the model 2020-08-19 11:48:12 +02:00
Miriam Baglioni f6bf888016 removed affiliation from author to mirror the changes in the model 2020-08-19 11:41:41 +02:00
Miriam Baglioni 66d0e0d3f2 - 2020-08-19 11:31:50 +02:00
Miriam Baglioni d407852ac2 changed to reflect the changed in the model 2020-08-19 11:15:05 +02:00
Miriam Baglioni 47c21a8961 refactoring due to compilation 2020-08-19 11:11:57 +02:00
Miriam Baglioni 96600ed04a modified test resource for mirroring the deletion of affiliation from author parameters 2020-08-14 20:41:49 +02:00
Miriam Baglioni d2a8a4961a refactoring 2020-08-13 18:50:33 +02:00
Miriam Baglioni fd48ae3b85 changed because of #40 (comment) 2020-08-13 12:19:15 +02:00
Miriam Baglioni 04a3e1ab38 disabled tests 2020-08-13 12:18:13 +02:00
Miriam Baglioni 2ede397933 Apply change because of #40 (comment) 2020-08-13 12:16:39 +02:00
Miriam Baglioni adf9f96a67 test for extraction of relation between organizations and context 2020-08-12 10:04:47 +02:00
Miriam Baglioni 25f4fbceea draft of test and resources 2020-08-11 17:37:22 +02:00
Miriam Baglioni 30a2b19b65 changed metadata for deposition od covid-19 dump in Zenodo 2020-08-11 17:36:56 +02:00
Miriam Baglioni 49788b532a changed to mirror changes in the schema 2020-08-11 16:05:03 +02:00
Miriam Baglioni b08511287b - 2020-08-11 16:01:36 +02:00
Miriam Baglioni 7e81a17068 changed the XQUERY to mirror the change in the code 2020-08-11 16:00:33 +02:00
Miriam Baglioni 37ad2f28e9 removed added | in prefix for datasource 2020-08-11 15:55:06 +02:00
Miriam Baglioni f31c2e9461 enabled test 2020-08-11 15:49:25 +02:00
Miriam Baglioni 2d67476417 merge branch with master 2020-08-11 15:46:04 +02:00
Miriam Baglioni 6d3804e24c - 2020-08-11 15:45:12 +02:00
Miriam Baglioni 0603ec4757 changed test to upload the dump for covid-19 community 2020-08-11 15:43:25 +02:00
Miriam Baglioni 7dfd56df9d - 2020-08-11 15:42:35 +02:00
Miriam Baglioni a169d7e7c1 added test file for the MakeTar class 2020-08-11 15:40:41 +02:00
Miriam Baglioni c378c38546 disabled test. The testing functionalities for hte upload in Zenode are moved to common 2020-08-10 12:41:11 +02:00
Miriam Baglioni 63ad0ed209 changed to use communityMapPath instead of IsLookUp 2020-08-10 12:40:19 +02:00
Miriam Baglioni cec795f2ea changed resources to mirror changes in the model 2020-08-10 12:39:35 +02:00
Miriam Baglioni f50e3e7333 changed the class for which to generate the schema 2020-08-10 12:03:49 +02:00
Miriam Baglioni b8c26f656c test using communityMapPath instead of isLookUp 2020-08-10 12:02:55 +02:00
Sandro La Bruzzo 0ade33ad15 updated mergeFrom function for DLI Unknown 2020-08-10 10:18:35 +02:00
Miriam Baglioni 545ea9f77e moved in common. Zenodo response model and APIClient to deposit in Zenodo 2020-08-07 16:44:51 +02:00
Miriam Baglioni adf0ca5aa7 test to send is from hdfs 2020-08-05 14:24:43 +02:00
Alessia Bardi a29565ff57 code formatting 2020-08-04 12:55:27 +02:00
Alessia Bardi 09a323d18d testing a dataset from Nakala 2020-08-04 12:50:52 +02:00
Alessia Bardi c35bf486cc added handle among the possible PIDs 2020-08-04 12:50:12 +02:00
Miriam Baglioni 5b651abf82 merge branch with master 2020-08-04 10:14:07 +02:00
Miriam Baglioni fa38cdb10b added resource 2020-08-03 18:11:12 +02:00
Miriam Baglioni e9fcc0b2f1 commented test unit - to decide change for mirroring the changed logics 2020-08-03 18:10:53 +02:00
Miriam Baglioni c892c7dfa7 changed to query for community map just once and save the result for remaining executions 2020-08-03 17:56:31 +02:00
Alessia Bardi 8cc067fe76 specific test for claims 2020-08-03 11:17:50 +02:00
Michele Artini 652b13abb6 Merge branch 'master' into nsprefix_blacklist 2020-07-31 07:58:37 +02:00
Claudio Atzori cd631bb5bc defaults fixed in the cleaning workflow forces result.publisher to NULL when result.publisher.value in empty 2020-07-30 17:03:53 +02:00
Miriam Baglioni 872d7783fc - 2020-07-30 16:50:36 +02:00
Claudio Atzori 4ff8007518 added function to set the missing vocabulary names, used in the cleaning workflow as a pre-cleaning step 2020-07-30 16:24:39 +02:00