Commit Graph

1739 Commits

Author SHA1 Message Date
Miriam Baglioni 25cbcf6114 changed to solve issues about names. context renamed communities_infrastructure.json and removed the double json.gz extension from the name of the part in the tar 2020-10-02 12:17:46 +02:00
Claudio Atzori 9db0f88fb8 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-10-02 09:43:35 +02:00
Claudio Atzori 49ae3450a9 code formatting 2020-10-02 09:43:24 +02:00
Claudio Atzori c2a6e2a9bf fixed mapping for datasource journal info (ISSNs) 2020-10-02 09:37:08 +02:00
Miriam Baglioni 01117a46e1 whole workflow activated 2020-10-01 17:19:21 +02:00
Miriam Baglioni cfb5766c6b removed double json.gz from names of files in the tar 2020-10-01 17:18:34 +02:00
Miriam Baglioni fcaedac980 merge branch with master 2020-10-01 16:46:59 +02:00
Miriam Baglioni c6e6ed1bd8 merge branch with master 2020-10-01 16:24:41 +02:00
Miriam Baglioni 4aec347351 refactoring 2020-10-01 16:23:52 +02:00
Miriam Baglioni 61946b4092 refactoring 2020-10-01 16:22:48 +02:00
Miriam Baglioni 7e6d35e56c added the link to the excel file related to topic 2020-10-01 15:53:31 +02:00
Sandro La Bruzzo 1a0a44e85a Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-10-01 15:46:53 +02:00
Sandro La Bruzzo c4a3c52e45 fixed Doiboost bug in the identifier 2020-10-01 15:46:44 +02:00
Miriam Baglioni 43cbd62c2b added classpath.first in the configuration 2020-10-01 15:46:34 +02:00
Miriam Baglioni cd69c6b023 added dependency for the topic file path 2020-10-01 15:45:59 +02:00
Miriam Baglioni 771cde3d05 moved the library version to global pom 2020-10-01 15:43:47 +02:00
Miriam Baglioni 632351c0da modified test resources to mirror the changes in the code 2020-10-01 15:43:02 +02:00
Miriam Baglioni ebc1c5513f modified test resources to mirror the changes in the code 2020-10-01 15:42:29 +02:00
Miriam Baglioni 3a374c34b6 fixed null pointer exception 2020-10-01 15:41:01 +02:00
Miriam Baglioni 83ea746163 added check to the test 2020-10-01 15:40:28 +02:00
Claudio Atzori 2e9e13444d author pids made unique by value 2020-10-01 12:50:40 +02:00
Miriam Baglioni 6e5db85b32 - 2020-10-01 11:51:11 +02:00
Miriam Baglioni a46179f61c refactoring 2020-10-01 11:22:01 +02:00
Miriam Baglioni b90bee124b removing rows that are empty from those imported 2020-10-01 11:16:49 +02:00
Miriam Baglioni c107f193c9 refactoring 2020-10-01 11:16:22 +02:00
Claudio Atzori e265c3e125 cleaning functions factored out in a dedicated class 2020-10-01 10:50:15 +02:00
Miriam Baglioni 706a80a29a added test to check that separator '-' (not hyphen) will be recognized 2020-10-01 10:38:31 +02:00
Miriam Baglioni 3dca586b3b refactoring 2020-10-01 10:34:48 +02:00
Miriam Baglioni 416bda6066 changed the programme.desxcription by using the same value used in the classification instead of the short title or the title 2020-10-01 10:31:33 +02:00
Miriam Baglioni f6587c91f3 added comparison to a char that seems - but it is not 2020-10-01 10:30:26 +02:00
Claudio Atzori 4287164aba include relevantdate field in the result view 2020-10-01 10:28:55 +02:00
Miriam Baglioni 7e73bb88b3 changed the logic to add the topic description to the project 2020-09-28 17:21:43 +02:00
Miriam Baglioni 0a035e3630 - 2020-09-28 17:20:49 +02:00
Miriam Baglioni 16bee2084d added the topic code to the project subset 2020-09-28 17:20:11 +02:00
Miriam Baglioni 0bf2d0db52 added to the workflow the download of the topic excel file and one property needed to get the input path of the topic file in the hdfs filesystem 2020-09-28 12:17:22 +02:00
Miriam Baglioni c2abde4d9f changed the implementation of Atomic Actions creation by exploiting the topic information get from the cordis excel file 2020-09-28 12:16:34 +02:00
Miriam Baglioni d930b8d3fc changed the query to get only the code of the project and not the optional1 (topic code) and optional2 (topic description) 2020-09-28 12:15:48 +02:00
Miriam Baglioni f8f5cfd5cc removed the part added to set the topic code and description in the step of project preparation 2020-09-28 12:13:33 +02:00
Miriam Baglioni 9e19c9a221 remove the topic description from the values in the CSVProject class 2020-09-28 12:11:03 +02:00
Miriam Baglioni 6d8b932e40 refactoring 2020-09-28 12:06:56 +02:00
Miriam Baglioni b77f166549 changed the package name from csvutils to utils 2020-09-28 12:05:47 +02:00
Miriam Baglioni e33e3277de added needed dependency to read the excel file 2020-09-28 12:03:14 +02:00
Miriam Baglioni f4739a371a code to get the information related to the topic association between code and description. 2020-09-28 12:02:48 +02:00
Miriam Baglioni 7b6a7333e6 merge branch with master 2020-09-25 16:42:07 +02:00
Miriam Baglioni 983a12ed15 temporary modification to allow the upload of files in the sandbox without the need to recreate the mapping from scratch 2020-09-25 16:41:51 +02:00
Miriam Baglioni 8b36d19182 added property depositionId and changed property newVersion, which became a string instead of a boolean to handle the three possible distinct values 2020-09-25 16:41:15 +02:00
Miriam Baglioni ed5239f9ec added new code to handle the new possibility to upload files to an already open deposition 2020-09-25 16:34:32 +02:00
Miriam Baglioni 3a8c524fce refactor 2020-09-25 16:34:02 +02:00
Miriam Baglioni 2ac2b537b6 merge branch with master 2020-09-25 14:40:47 +02:00
Miriam Baglioni 54800fb9b0 enabled only the step to upload in zenodo 2020-09-25 14:40:22 +02:00
Miriam Baglioni 12c2dfc268 modified the resource to consider the information added to the model 2020-09-25 14:17:23 +02:00
Miriam Baglioni 969fa8d96e fixed issue and changed the transformation of the programme file to consider the new model 2020-09-25 13:32:34 +02:00
Michele Artini c171fdebe1 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-09-25 09:03:09 +02:00
Michele Artini c96598aaa4 opendoar partition 2020-09-25 09:02:58 +02:00
Miriam Baglioni de6c4d46d8 fixed conflicts 2020-09-24 15:35:01 +02:00
Miriam Baglioni e917281822 - 2020-09-24 15:24:05 +02:00
Miriam Baglioni 9f54f69e6d added topic information 2020-09-24 15:23:35 +02:00
Miriam Baglioni d6206d6e63 add the topic description to the action set associated to the project 2020-09-24 15:22:40 +02:00
Miriam Baglioni 6b50226f3b added topic code and topic description 2020-09-24 15:21:49 +02:00
Miriam Baglioni 15af1f527e modified to consider the topic information 2020-09-24 15:20:56 +02:00
Miriam Baglioni 609ff17cfc now the commission give us the framework programme (FP7 - H2020) so use this information to filter out programmes not associated to H2020 2020-09-24 15:19:31 +02:00
Miriam Baglioni b66f930466 Added optional1 and optional2 information to the files read from the db. Optional1 contains the topic code and optional2 contains the topic description 2020-09-24 15:16:56 +02:00
Miriam Baglioni 860e6d38a6 added topic description to the CSV project variables 2020-09-24 15:15:26 +02:00
Claudio Atzori 044d3a0214 fixed query used to load datasources in the Graph 2020-09-24 13:48:58 +02:00
Claudio Atzori 27df1cea6d code formatting 2020-09-24 12:16:00 +02:00
Claudio Atzori fb22f4d70b included values for projects fundedamount and totalcost fields in the mapping tests. Swapped expected and actual values in junit test assertions 2020-09-24 12:10:59 +02:00
Claudio Atzori 42f55395c8 fixed order of the ISSNs returned by the SQL query 2020-09-24 12:09:58 +02:00
Claudio Atzori fadf5c7c69 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-09-24 10:42:52 +02:00
Claudio Atzori 9a7e72d528 using concat_ws to join textual columns from PSQL. When using || to perform the concatenation, Null columns makes the operation result to be Null 2020-09-24 10:42:47 +02:00
Claudio Atzori 9e3e93c6b6 setting the correct issn type in the datasource.journal element 2020-09-24 10:39:16 +02:00
Miriam Baglioni 0d83f47166 merge branch with master 2020-09-23 17:33:49 +02:00
Miriam Baglioni 39eb8ab25b changed the dump to move from h2020programme to h2020classification 2020-09-23 17:33:00 +02:00
Miriam Baglioni 1d84cf19a6 added new line to resource file 2020-09-23 17:32:22 +02:00
Miriam Baglioni f0c476b6c9 modification to the test classes to consider h2020classification 2020-09-23 17:31:49 +02:00
Miriam Baglioni 2cba3cb484 modification to the classes building the actionset to consider the h2020classification 2020-09-23 17:31:15 +02:00
Miriam Baglioni 1069cf243a modification to the schema to consider the H2020classification of the programme. The filed Programme has been moved inside the H2020classification that is now associated to the Project. Programme is no more associated directly to the Project but via H2020CLassification 2020-09-22 14:38:00 +02:00
Michele Artini 9e681609fd stats to sql file 2020-09-17 15:51:22 +02:00
Michele Artini 51321c2701 partition of events by opedoarId 2020-09-17 11:38:07 +02:00
Claudio Atzori cf2ce1a09b code formatting 2020-09-15 15:58:03 +02:00
Miriam Baglioni c2b5c780ff - 2020-09-14 14:34:03 +02:00
Miriam Baglioni e2ceefe9be - 2020-09-14 14:33:28 +02:00
Miriam Baglioni 1f893e63dc - 2020-09-14 14:33:10 +02:00
Michele Artini 9b0c12f5d3 send notifications 2020-09-11 12:06:16 +02:00
Michele Artini 028613b751 remove old notifications 2020-09-09 15:32:06 +02:00
Michele Artini 9cfc124ac5 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-09-08 16:39:54 +02:00
Michele Artini a597a218ab * forall topics 2020-09-08 16:39:40 +02:00
Claudio Atzori 8a523474b7 code formatting 2020-09-07 11:40:16 +02:00
Michele Artini bb459caf69 support for all topic subscriptions 2020-08-27 11:01:21 +02:00
Michele Artini 82ed8edafd notification indexing 2020-08-26 15:10:48 +02:00
Miriam Baglioni b72a7dad46 resource for pid graph dump 2020-08-24 17:09:01 +02:00
Miriam Baglioni 8694bb9b31 refactoring due to compilation 2020-08-24 17:07:34 +02:00
Miriam Baglioni 8a069a4fea - 2020-08-24 17:01:30 +02:00
Miriam Baglioni 34fa96f3b1 - 2020-08-24 17:00:20 +02:00
Miriam Baglioni 5fb2949cb8 added utils methods 2020-08-24 17:00:09 +02:00
Miriam Baglioni 2a540b6c01 added constants for the pid graph dump 2020-08-24 16:55:35 +02:00
Miriam Baglioni da103c399a resources for the pid graph dump test 2020-08-24 16:52:07 +02:00
Miriam Baglioni 630a6a1fe7 first tests for the pid graph dump 2020-08-24 16:51:26 +02:00
Miriam Baglioni 40c8d2de7b test resources for the dump of the pids graph 2020-08-24 16:50:39 +02:00
Miriam Baglioni bef79d3bdf first attempt to the dump of pids graph 2020-08-24 16:49:38 +02:00
Michele Artini da470422d3 deleting events 2020-08-21 14:52:48 +02:00
Michele Artini 6e60bf026a indexing only a subset of events 2020-08-19 12:39:22 +02:00
Miriam Baglioni 85203c16e3 merge branch with master 2020-08-19 11:49:03 +02:00
Miriam Baglioni 2c783793ba removed the affiliation from the author to mirror the changes in the model 2020-08-19 11:48:12 +02:00
Miriam Baglioni f6bf888016 removed affiliation from author to mirror the changes in the model 2020-08-19 11:41:41 +02:00
Miriam Baglioni 66d0e0d3f2 - 2020-08-19 11:31:50 +02:00
Miriam Baglioni 1c593a9cfe - 2020-08-19 11:29:51 +02:00
Miriam Baglioni e42b2f5ae2 - 2020-08-19 11:29:09 +02:00
Miriam Baglioni f81ee22418 changed to mirror the changes in the model (Instance, CommunityInstance, GraphResult) 2020-08-19 11:28:26 +02:00
Miriam Baglioni 387be43fd4 changed to discriminate if dumping all the results type together or each one in its own archive 2020-08-19 11:25:27 +02:00
Miriam Baglioni c5858afb88 added parameter to guide the dump for the result (resultAggregation). true if all the result types should be dumped together, false otherwise. 2020-08-19 11:24:14 +02:00
Miriam Baglioni d407852ac2 changed to reflect the changes in the model 2020-08-19 11:15:05 +02:00
Miriam Baglioni 47c21a8961 refactoring due to compilation 2020-08-19 11:11:57 +02:00
Miriam Baglioni 5570678c65 changed parameter name from hfdsNameNode to nameNode 2020-08-19 10:59:26 +02:00
Miriam Baglioni dc5096a327 refactoring due to compilation 2020-08-19 10:57:36 +02:00
Miriam Baglioni 55e24c2547 relclass for relation and corresponding values have been put to lower case (isSupplementedBy written as IsSupplementedBy - orcid propagation) 2020-08-18 16:42:08 +02:00
Miriam Baglioni f44dd5d886 changed in mapping the result semantic name as it will be visible in the relclass Relation: from IsSupplementedBy to isSupplementedBy 2020-08-17 17:15:09 +02:00
Miriam Baglioni bc6b5d5b34 removed leftover parameter 2020-08-15 11:22:35 +02:00
Miriam Baglioni 200cd5c730 removed leftover parameter 2020-08-15 11:22:19 +02:00
Miriam Baglioni 96600ed04a modified test resource for mirroring the deletion of affiliation from author parameters 2020-08-14 20:41:49 +02:00
Miriam Baglioni 09f5b92763 added specific reference to class 2020-08-14 20:00:09 +02:00
Miriam Baglioni 37e7c43652 changed parameter name from hdfsNaemNode to nameNode 2020-08-14 18:18:25 +02:00
Claudio Atzori 5b994d7ccf Merge branch 'dump' of https://code-repo.d4science.org/miriam.baglioni/dnet-hadoop into resolve_conflicts_pr40_dump 2020-08-14 15:32:29 +02:00
Miriam Baglioni de995970ea try again to solve clash with master 2020-08-14 15:24:36 +02:00
Miriam Baglioni 5040d72d5e changed to make it equal to master branch 2020-08-14 15:20:17 +02:00
Miriam Baglioni be8106c339 added space to avoid conflicts with master branch 2020-08-14 15:16:27 +02:00
Claudio Atzori 1871d1c6f6 solve error java.lang.NoSuchFieldError: INSTANCE when instantiating Solr client 2020-08-14 11:18:30 +02:00
Miriam Baglioni d2a8a4961a refactoring 2020-08-13 18:50:33 +02:00
Miriam Baglioni a5043de5da added method to get the mapped instance 2020-08-13 18:45:50 +02:00
Miriam Baglioni b7e49aee8d removed commented code 2020-08-13 18:44:07 +02:00
Miriam Baglioni f439a6231e added missing constraint in XQuery (verify the status of the RC/RI different from hidden) 2020-08-13 15:30:55 +02:00
Miriam Baglioni 0fe800b1c9 modified because of #40#issuecomment-1902 2020-08-13 15:17:12 +02:00
Miriam Baglioni 270c89489c fixed issue created while renaming subject to subjects in community configuration xml 2020-08-13 15:16:04 +02:00
Miriam Baglioni fcd10f452c changed because of #40 (comment) 2020-08-13 12:55:32 +02:00
Miriam Baglioni fd48ae3b85 changed because of #40 (comment) 2020-08-13 12:19:15 +02:00
Miriam Baglioni 04a3e1ab38 disabled tests 2020-08-13 12:18:13 +02:00
Miriam Baglioni 2ede397933 Apply change because of #40 (comment) 2020-08-13 12:16:39 +02:00
Miriam Baglioni bfd1fcde6d removed not useful method and changed because of #40 (comment) and #40 (comment) 2020-08-13 12:14:37 +02:00
Miriam Baglioni 7fd8397123 apply changes in #40 (comment) 2020-08-13 12:13:15 +02:00
Miriam Baglioni 753d448cc9 apply changes in #40 (comment) 2020-08-13 12:12:58 +02:00
Miriam Baglioni c0e071fa26 apply changes in #40 (comment) 2020-08-13 12:12:40 +02:00
Miriam Baglioni 526db915bc apply changes in #40 (comment) 2020-08-13 12:12:16 +02:00
Miriam Baglioni b0fab0d138 apply changes in #40 (comment) 2020-08-13 12:11:57 +02:00
Miriam Baglioni 1b6320b251 apply changes in #40 (comment) 2020-08-13 12:11:41 +02:00
Miriam Baglioni 743d31be22 apply changes in #40 (comment) 2020-08-13 12:11:22 +02:00
Miriam Baglioni 65b48df652 apply changes in #40 (comment) 2020-08-13 12:11:06 +02:00
Miriam Baglioni 90b54d3efb apply changes in #40 (comment) 2020-08-13 12:08:24 +02:00
Miriam Baglioni 69bbb9592a apply changes in #40 (comment) 2020-08-13 12:07:39 +02:00
Miriam Baglioni 945323299a apply changes in #40 (comment) 2020-08-13 12:07:24 +02:00
Miriam Baglioni e04c993247 apply changes in #40 (comment) 2020-08-13 12:07:07 +02:00
Miriam Baglioni ed0812d0ce apply changes in #40 (comment) 2020-08-13 12:06:49 +02:00
Miriam Baglioni d55cfe0ea5 apply changes in #40 (comment) 2020-08-13 12:06:20 +02:00
Miriam Baglioni 80866bec7d apply changes in #40 (comment) 2020-08-13 12:06:05 +02:00
Miriam Baglioni 1400978c0a apply changes in #40 (comment) 2020-08-13 12:05:44 +02:00
Miriam Baglioni 7b941a2e0a apply changes in #40 (comment) 2020-08-13 12:05:17 +02:00
Miriam Baglioni f7474f50fe apply changes in #40 (comment) 2020-08-13 12:04:52 +02:00
Miriam Baglioni 367203f412 apply changes in #40 (comment) 2020-08-13 12:04:33 +02:00
Miriam Baglioni 3ab4809d31 apply changes in #40 (comment) 2020-08-13 12:04:10 +02:00
Miriam Baglioni 02a4986e7b Applying changed from code reviews #40 (comment) and #40 (comment) and #40 (comment) 2020-08-13 11:53:01 +02:00
Miriam Baglioni 235d4e4d6e moved Context as relevant for Communities dump 2020-08-12 18:16:45 +02:00
Miriam Baglioni adf9f96a67 test for extraction of relation between organizations and context 2020-08-12 10:04:47 +02:00
Miriam Baglioni 7400cd019d removed not needed variable 2020-08-12 10:03:33 +02:00
Miriam Baglioni 98d28bab5c fixed missing _ in context nsprefix 2020-08-12 10:00:18 +02:00
Miriam Baglioni 8f48cb29f4 changed resource because of a change in the XQuery that returned the XML to be parsed. The main Zenodo community is no more a separate element, but part of the <zenodocommunities> element 2020-08-11 18:04:38 +02:00
Miriam Baglioni c3672b162b merge branch with master 2020-08-11 17:53:04 +02:00
Miriam Baglioni a16bbf3202 changed test resource to mirror change in the XQuery that produced data to be parsed. The main Zenodo community is no longer provided in a different element, but is part of the <zenodocommunities> element 2020-08-11 17:48:44 +02:00
Miriam Baglioni 25f4fbceea draft of test and resources 2020-08-11 17:37:22 +02:00
Miriam Baglioni 30a2b19b65 changed metadata for deposition of covid-19 dump in Zenodo 2020-08-11 17:36:56 +02:00
Claudio Atzori f7cc52ab02 Merge pull request 'enrichment_wfs' (#39) from enrichment_wfs into master (LGTM) 2020-08-11 17:26:13 +02:00
Miriam Baglioni 49788b532a changed to mirror changes in the schema 2020-08-11 16:05:03 +02:00
Miriam Baglioni b08511287b - 2020-08-11 16:01:36 +02:00
Miriam Baglioni 7e81a17068 changed the XQUERY to mirror the change in the code 2020-08-11 16:00:33 +02:00
Miriam Baglioni 37ad2f28e9 removed added | in prefix for datasource 2020-08-11 15:55:06 +02:00
Miriam Baglioni f31c2e9461 enabled test 2020-08-11 15:49:25 +02:00
Miriam Baglioni 2d67476417 merge branch with master 2020-08-11 15:46:04 +02:00
Miriam Baglioni 77a390878c merge upstream 2020-08-11 15:45:48 +02:00
Miriam Baglioni 6d3804e24c - 2020-08-11 15:45:12 +02:00
Miriam Baglioni 0603ec4757 changed test to upload the dump for covid-19 community 2020-08-11 15:43:25 +02:00
Miriam Baglioni 7dfd56df9d - 2020-08-11 15:42:35 +02:00
Miriam Baglioni a169d7e7c1 added test file for the MakeTar class 2020-08-11 15:40:41 +02:00
Miriam Baglioni acb0926b2e json schemas for the dumped entities and relation 2020-08-11 15:39:48 +02:00
Miriam Baglioni ff52c51f92 added the communityMapPath parameter and removed the isLookUpUrl parameter 2020-08-11 15:39:22 +02:00
Miriam Baglioni 6f43acda5e added the maketar and send to zenodo step. Adjusted wf parameters 2020-08-11 15:38:20 +02:00
Miriam Baglioni ddc19de2e9 removed the isLookUpUrl among the parameters 2020-08-11 15:37:47 +02:00
Miriam Baglioni 592a8ea573 added parameter file for maketar class 2020-08-11 15:37:14 +02:00
Miriam Baglioni 77a0951b32 added the make archive step in the workflow 2020-08-11 15:32:32 +02:00
Miriam Baglioni cf4d918787 added description, changed parameter name and added method 2020-08-11 15:27:31 +02:00
Miriam Baglioni dc5fc5366d Creation of an archive for each related dump part 2020-08-11 15:26:06 +02:00
Miriam Baglioni 0ce49049d6 added description 2020-08-11 15:25:11 +02:00
Miriam Baglioni 9bae991167 added description of the class 2020-08-11 11:20:43 +02:00
Miriam Baglioni 341dc59ead removed the repartition(1). Added code for the creation of an archive containing all the parts dumped for each community 2020-08-11 11:18:58 +02:00
Sandro La Bruzzo fe8d640aee fixed error on oozie workflow 2020-08-11 09:43:03 +02:00
Sandro La Bruzzo 304590e854 updated workflow of indexing to start from begin 2020-08-11 09:17:47 +02:00
Sandro La Bruzzo eaf0dc68a2 fixed indexing 2020-08-11 09:17:03 +02:00
Miriam Baglioni 1991a49f70 removed reference to isLookUp to get the communityMap 2020-08-10 18:02:56 +02:00
Miriam Baglioni c378c38546 disabled test. The testing functionalities for the upload in Zenodo are moved to common 2020-08-10 12:41:11 +02:00
Miriam Baglioni 63ad0ed209 changed to use communityMapPath instead of IsLookUp 2020-08-10 12:40:19 +02:00
Miriam Baglioni cec795f2ea changed resources to mirror changes in the model 2020-08-10 12:39:35 +02:00
Miriam Baglioni f50e3e7333 changed the class for which to generate the schema 2020-08-10 12:03:49 +02:00
Miriam Baglioni b8c26f656c test using communityMapPath instead of isLookUp 2020-08-10 12:02:55 +02:00
Miriam Baglioni fe88904df0 changed the wf definition 2020-08-10 12:01:14 +02:00