Commit Graph

5475 Commits

Author SHA1 Message Date
Claudio Atzori 97c239ee0d WIP: trying to find a way to build the records for the index 2020-01-16 12:02:28 +02:00
miconis 4955be0197 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-01-14 15:03:44 +02:00
miconis f61adfc2bb minor changes 2020-01-14 15:03:27 +02:00
miconis 9bdcb02179 minor changes and update of the configuration for publications 2020-01-14 15:01:03 +02:00
miconis 4dce785375 update in the implementation of the tree: addition of new logic aggregations and statistics 2020-01-14 11:42:43 +02:00
Michele Artini f7b9a7a9af entity migration (partial implementation) 2020-01-10 15:55:23 +01:00
Claudio Atzori 731f9b64e6 Merge branch 'master' of michele.artini/dnet-hadoop into master 2019-12-20 14:22:37 +01:00
Michele Artini 7229fecbcf fix warnings in poms 2019-12-20 13:41:08 +01:00
Sandro La Bruzzo dd21db7036 fixed stuff 2019-12-18 16:28:22 +01:00
miconis b3748b8d77 minor changes 2019-12-18 16:20:35 +01:00
miconis b21b1b8f61 implementation of new aggregation in the tree node processing 2019-12-18 16:19:36 +01:00
miconis 20fcfe6328 implementation of new aggregation in the tree node processing 2019-12-18 16:19:26 +01:00
Sandro La Bruzzo d924f28b93 fixed wrong use of jspath 2019-12-18 09:29:44 +01:00
Claudio Atzori 7ba586d2e5 oozie workflow aimed to build the adjacency lists representation of the graph, needed to build the records to be indexed 2019-12-17 16:24:49 +01:00
miconis 84aaa65501 implementation of new json comparator and update of the publication configuration 2019-12-17 09:16:26 +01:00
Sandro La Bruzzo 76efcde4fd using new branch decisionTreeDedup 2019-12-13 12:20:35 +01:00
Sandro La Bruzzo 5c01ae4c92 merged JqMapping branch into tree2 2019-12-13 11:30:02 +01:00
Sandro La Bruzzo b4392f9f43 implemented DedupRecord factory for missing entities 2019-12-13 09:40:02 +01:00
miconis 545e940007 implementation of the mergeFrom for the Datasources 2019-12-12 15:36:41 +01:00
Sandro La Bruzzo 39367676d7 implemented DedupRecord factory with the merge of project 2019-12-12 15:18:48 +01:00
Sandro La Bruzzo 6b45e37e22 implemented DedupRecord factory with the merge of organizations 2019-12-11 16:57:37 +01:00
Sandro La Bruzzo abd9034da0 implemented DedupRecord factory with the merge of publications 2019-12-11 15:43:24 +01:00
miconis 4b66b471a4 implementation of the sorting by trust mechanism and the merge of oaf entities 2019-12-10 14:57:16 +01:00
Sandro La Bruzzo 35008fdbf9 fix stuff 2019-12-06 15:28:30 +01:00
Sandro La Bruzzo cc63706347 Implemented deduplication on spark 2019-12-06 13:38:00 +01:00
Sandro La Bruzzo 16c670a5d5 Improved deduplication 2019-12-05 14:14:25 +01:00
miconis 49f9beb4a8 implementation of romansmatch and re-implementation of the getNumber function. New terms in the translation map and update of the configuration 2019-11-28 16:54:44 +01:00
miconis f791730330 addition of one term to the translation maps in the configurations 2019-11-27 15:48:37 +01:00
miconis d2278fe358 minor change in the citymatch 2019-11-21 10:54:02 +01:00
miconis 8c0d346005 the param map has been updated: now it accepts string parameters 2019-11-21 09:37:56 +01:00
miconis ddd40540aa jarowinklernormalizedname split into 3 different comparators: citymatch, keywordmatch and jarowinkler. Implementation of the TreeStatistic support functions 2019-11-20 10:45:00 +01:00
Claudio Atzori 6a7bee5e43 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2019-11-14 15:43:07 +01:00
Claudio Atzori 0c4b316f82 align Result model with the latest OpenAIRE schema changes introduced in the protobuf model 2019-11-14 15:42:52 +01:00
Sandro La Bruzzo aad0cb40b7 Added schema Scholexplorer 2019-11-14 10:34:09 +01:00
miconis c687956371 code cleaning and implementation of the TreeDedup + minor changes 2019-11-14 10:01:21 +01:00
Claudio Atzori 5711e75f67 use ${project.version} whenever possible 2019-11-08 17:41:51 +01:00
Claudio Atzori 245b4cbbb3 removed import limit 2019-11-08 17:41:01 +01:00
Claudio Atzori 7fe6835b47 [maven-release-plugin] prepare for next development iteration 2019-11-07 17:39:30 +01:00
Claudio Atzori 58918967d9 [maven-release-plugin] prepare release dhp-1.0.4 2019-11-07 17:39:27 +01:00
Claudio Atzori 2243089b78 Author PIDs include also provenance information 2019-11-07 17:38:37 +01:00
Claudio Atzori 5308f05a02 allow to specify the target hive DB name in the infospace import workflow 2019-11-07 17:38:09 +01:00
miconis 0973899865 code cleaning, distribution of the classes in packages and implementation of the new configuration 2019-11-07 12:47:12 +01:00
Claudio Atzori a52d5bde4f simplified import procedure, maps the infospace as hive tables 2019-11-06 17:45:52 +01:00
Claudio Atzori 1e7a2ac41d align parameter names, graph import procedure WIP 2019-11-04 17:41:01 +01:00
Claudio Atzori f39148dab8 [maven-release-plugin] prepare for next development iteration 2019-11-04 12:34:48 +01:00
Claudio Atzori 34b0e7b40a [maven-release-plugin] prepare release dhp-1.0.3 2019-11-04 12:34:46 +01:00
Claudio Atzori 439ad80d81 conversion utilities from protobuffer model to DHP model moved in dnet-mapreduce-jobs. Removed also the relative protobuf dependencies 2019-11-04 12:33:23 +01:00
Claudio Atzori 32ed4ae8d6 conversion utilities from protobuffer model to DHP model moved in dnet-mapreduce-jobs. Removed also the relative protobuf dependencies 2019-11-04 12:28:56 +01:00
Sandro La Bruzzo fd0ad82111 [maven-release-plugin] prepare for next development iteration 2019-10-31 12:08:51 +01:00
Sandro La Bruzzo f224613b40 [maven-release-plugin] prepare release dhp-1.0.2 2019-10-31 12:08:49 +01:00