Commit Graph

22 Commits

Author SHA1 Message Date
miconis 451114418d implementation of the instance type comparator and its tests 2021-11-04 15:20:57 +01:00
Sandro La Bruzzo b6c4f4acf3 upgraded maven version of commons-lang 2020-02-10 12:38:40 +01:00
miconis eeeb374480 minor changes in comparators 2020-01-24 10:01:11 +01:00
miconis 6a27fb14a8 update in the implementation of the tree: addition of new logic aggregations and statistics 2020-01-14 11:42:43 +02:00
miconis 5676e625bd implementation of romansmatch and re-implementation of the getNumber function. New terms in the translation map and update of the configuration 2019-11-28 16:54:44 +01:00
miconis 79e62787cf jarowinklernormalizedname split into 3 different comparators: citymatch, keywordmatch and jarowinkler. Implementation of the TreeStatistic support functions 2019-11-20 10:45:00 +01:00
miconis 5b3adb3e65 code cleaning, distribution of the classes in packages and implementation of the new configuration 2019-11-07 12:47:12 +01:00
miconis 1cbb48f77b minor changes 2019-10-08 16:49:07 +02:00
miconis 03c1b334d5 translation map moved into the json configuration, support for synonyms added in the configuration, now the configuration is an argument of conditions, distancealgos and clusteringfunctions 2019-10-08 14:53:52 +02:00
miconis f0b4c4cbd4 addition of a fixSpecial function to address the problem with special characters in organization names, addition of new terms in translation maps 2019-08-06 17:06:05 +02:00
miconis 84974dcdfa restyling of the JaroWinklerNormalizedName comparator, now it is optimized. Addition of some translations in the translation maps, addition of a clustering based on keywords in organizations legalnames 2019-07-19 17:10:29 +02:00
miconis 0509ea8d1e bug fixing in the keywordsclustering class 2019-07-08 11:01:49 +02:00
miconis 2b866cfbeb addition of doi normalization in PidMatch comparator, addition of keywordsclustering (clustering based on terms in the translation maps for the organizations), minor changes 2019-07-08 09:44:02 +02:00
Michele De Bonis f87790f701 update of the comparator for legalnames of organizations 2019-03-21 14:27:27 +01:00
Michele De Bonis 9ff83d6567 implementation of the decision tree for the deduplication of the authors, implementation of multiple comparators to be used in a tree node and definition of the proto for person entity 2018-12-20 09:54:41 +01:00
Michele De Bonis 0bd20c565a implementation of the decision tree, addition of the dnet-openaire-data-protos module, definition of the person proto, blockprocessor and paceconfig modified with addition of support for the tree processing 2018-12-12 16:30:03 +01:00
Michele De Bonis 23c5a16525 addition of cities check 2018-11-16 16:11:03 +01:00
Michele De Bonis 3a517a6551 Merge branch 'master' of https://github.com/dnet-team/dnet-dedup 2018-11-12 14:11:26 +01:00
Michele De Bonis 33387a3532 configuration file updated, addition of condition on domain 2018-11-12 14:11:15 +01:00
Claudio Atzori 925a437597 getting rid of spark libs from dnet-pace-core 2018-11-12 12:46:06 +01:00
Michele De Bonis 4337e83950 implementation of JaroWinklerNormalizedName, addition of various stopwords in different languages and configuration test 2018-11-05 17:22:59 +01:00
Sandro La Bruzzo a043d0c716 added d-net pace core module and ignored target folder 2018-10-02 10:37:54 +02:00