Commit Graph

  • 7cd63a37cf update of the strict configuration with new terms miconis 2019-11-29 14:13:46 +0100
  • 5676e625bd implementation of romansmatch and re-implementation of the getNumber function. New terms in the translation map and update of the configuration miconis 2019-11-28 16:54:44 +0100
  • 493b385b5b addition of one term to the translation maps in the configurations miconis 2019-11-27 15:48:37 +0100
  • c72f48fb33 minor change in the citymatch miconis 2019-11-21 10:54:02 +0100
  • 40808200f0 the param map has been updated: now it accepts string parameters miconis 2019-11-21 09:37:56 +0100
  • 79e62787cf jarowinklernormalizedname split into 3 different comparators: citymatch, keywordmatch and jarowinkler. Implementation of the TreeStatistic support functions miconis 2019-11-20 10:45:00 +0100
  • 676e9c8e37 code cleaning and implementation of the TreeDedup + minor changes miconis 2019-11-14 10:01:21 +0100
  • 1414badaaf Merge 393c1999f8 into bc7dd4bfa2 #1 dependabot[bot] 2019-11-13 08:27:18 +0000
  • 393c1999f8 Bump jackson-databind from 2.6.6 to 2.9.10.1 dependabot/maven/com.fasterxml.jackson.core-jackson-databind-2.9.10.1 dependabot[bot] 2019-11-13 08:27:16 +0000
  • 5b3adb3e65 code cleaning, distribution of the classes in packages and implementation of the new configuration miconis 2019-11-07 12:47:12 +0100
  • 3ff5be675b put the last modification of the master branch into the tree2. Addition of the configuration as parameter of the comparator. This is to allow the comparator to access it miconis 2019-10-29 16:38:42 +0100
  • 8564fdd19c minor changes miconis 2019-10-29 15:58:21 +0100
  • bc7dd4bfa2 [maven-release-plugin] prepare for next development iteration miconis 2019-10-24 11:34:19 +0200
  • 098c5e2f64 [maven-release-plugin] prepare release dnet-dedup-3.0.15 dnet-dedup-3.0.15 miconis 2019-10-24 11:34:12 +0200
  • 8dba7a04f8 dependency-reduced-pom deleted miconis 2019-10-24 11:28:20 +0200
  • 58f128d861 Revert "[maven-release-plugin] prepare release dnet-dedup-3.0.15" miconis 2019-10-24 11:23:01 +0200
  • 452ab7892d [maven-release-plugin] prepare release dnet-dedup-3.0.15 miconis 2019-10-24 11:17:07 +0200
  • 4712fef82f release rollback miconis 2019-10-24 11:11:07 +0200
  • 4874038f8e minor changes miconis 2019-10-23 16:37:20 +0200
  • 2ffaa235a2 minor changes and configuration updates (synonym field added) miconis 2019-10-23 16:31:45 +0200
  • 1cbb48f77b minor changes miconis 2019-10-08 16:49:07 +0200
  • 7998f37ce1 normalization of the term in the translation map added miconis 2019-10-08 15:13:45 +0200
  • 03c1b334d5 translation map moved in json configuration, support for synonyms added in the configuration, now the configuration is argument of conditions, distancealgos and clusteringfunctions miconis 2019-10-08 14:53:52 +0200
  • 42e3bff05f [maven-release-plugin] prepare for next development iteration Claudio Atzori 2019-09-25 10:39:46 +0200
  • 259d502d70 [maven-release-plugin] prepare release dnet-dedup-3.0.14 dnet-dedup-3.0.14 Claudio Atzori 2019-09-25 10:39:39 +0200
  • fda7f1ce93 updated translation map and some tests Claudio Atzori 2019-09-25 10:15:13 +0200
  • 93b332cbe5 translation map updated miconis 2019-09-25 09:53:06 +0200
  • 3a92456fd0 optimize imports miconis 2019-08-09 15:42:41 +0200
  • 4bcf353a72 implementation of the conditions in tree nodes. get rid of the conditions part of the configuration miconis 2019-08-09 15:41:49 +0200
  • 72b14ec36b implementation of the decision tree. It takes place of the distance algos, necessaryConditions and sufficientConditions are still there. The model contains only path, type and name of the field. ignoreMissing is still in the model because it is used by the conditions. miconis 2019-08-09 10:08:34 +0200
  • cb51e017aa code refactoring: useless module removed miconis 2019-08-07 15:16:59 +0200
  • f0b4c4cbd4 addition of a fixSpecial function to address the problem with special character in organization names, addition of new terms in translation maps miconis 2019-08-06 17:06:05 +0200
  • 85070ce3fe addition of the BlockUtils class for meta-blocking, implementation of a new local test with edge filtering example miconis 2019-08-06 12:09:34 +0200
  • 2472f2b1e8 Merge branch 'master' of https://github.com/dnet-team/dnet-dedup miconis 2019-07-19 17:10:53 +0200
  • 84974dcdfa restyling of the JaroWinklerNormalizedName comparator, now it is optimized. Addition of some translations in the translation maps, addition of a clustering based on keywords in organizations legalnames miconis 2019-07-19 17:10:29 +0200
  • 19468fa864 [maven-release-plugin] prepare for next development iteration Claudio Atzori 2019-07-08 11:12:52 +0200
  • 953b78ab9b [maven-release-plugin] prepare release dnet-dedup-3.0.13 dnet-dedup-3.0.13 Claudio Atzori 2019-07-08 11:12:45 +0200
  • d5d228aef3 Merge branch 'master' of https://github.com/dnet-team/dnet-dedup miconis 2019-07-08 11:02:29 +0200
  • 0509ea8d1e bug fixing in the keywordsclustering class miconis 2019-07-08 11:01:49 +0200
  • ceaf19c83c [maven-release-plugin] prepare for next development iteration Claudio Atzori 2019-07-08 10:11:24 +0200
  • 6314f896d1 [maven-release-plugin] prepare release dnet-dedup-3.0.12 dnet-dedup-3.0.12 Claudio Atzori 2019-07-08 10:11:17 +0200
  • 8f5bc52ab2 [maven-release-plugin] rollback the release of dnet-dedup-3.0.12 miconis 2019-07-08 10:00:48 +0200
  • 813778d647 [maven-release-plugin] prepare for next development iteration miconis 2019-07-08 09:48:10 +0200
  • b8fb3e46aa [maven-release-plugin] prepare release dnet-dedup-3.0.12 miconis 2019-07-08 09:48:03 +0200
  • 2b866cfbeb addition of doi normalization in PidMatch comparator, addition of keywordsclustering (clustering based on terms in the translation maps for the organizations), minor changes miconis 2019-07-08 09:44:02 +0200
  • 9f6fb0e030 [maven-release-plugin] prepare for next development iteration Claudio Atzori 2019-06-19 10:02:39 +0200
  • 07d1b7df15 [maven-release-plugin] prepare release dnet-dedup-3.0.11 dnet-dedup-3.0.11 Claudio Atzori 2019-06-19 10:02:32 +0200
  • c7963d5afc optimized classpath resolvers Claudio Atzori 2019-06-19 10:01:35 +0200
  • c9fc377712 [maven-release-plugin] prepare for next development iteration Claudio Atzori 2019-06-18 14:46:34 +0200
  • e1ee2d40b3 [maven-release-plugin] prepare release dnet-dedup-3.0.10 dnet-dedup-3.0.10 Claudio Atzori 2019-06-18 14:46:27 +0200
  • cbec51e922 avoid division by zero: in case of missing values, return an undefined response Claudio Atzori 2019-06-18 14:45:15 +0200
  • 7063d286e0 cleanup Claudio Atzori 2019-06-18 14:44:42 +0200
  • e6944249ca Merge branch 'master' of https://github.com/dnet-team/dnet-dedup Claudio Atzori 2019-06-18 14:06:41 +0200
  • e7d170d0eb exact match condition gives undefined if a field is missing, ignoremissing semantics changed: now performs the comparison in any case if =true, if false gives -1 in case of missing miconis 2019-06-18 14:05:31 +0200
  • a5526f6254 implementation of the integration test, addition of document blocks to group entities after clustering miconis 2019-05-21 16:38:26 +0200
  • 6dcbfd9755 added more ignores Claudio Atzori 2019-04-03 17:43:55 +0200
  • 3dfbf5fab7 [maven-release-plugin] prepare for next development iteration Claudio Atzori 2019-04-03 12:35:00 +0200
  • 6837b59c6e [maven-release-plugin] prepare release dnet-dedup-3.0.9 dnet-dedup-3.0.9 Claudio Atzori 2019-04-03 12:34:52 +0200
  • d4c5e293a6 [maven-release-plugin] rollback the release of dnet-dedup-3.0.9 miconis 2019-04-03 12:27:28 +0200
  • 4f4713c6aa [maven-release-plugin] prepare for next development iteration miconis 2019-04-03 12:26:05 +0200
  • bb072cec20 [maven-release-plugin] prepare release dnet-dedup-3.0.9 miconis 2019-04-03 12:25:56 +0200
  • 3018031621 branch cities merged into master miconis 2019-04-03 12:22:33 +0200
  • 14c3afba23 clean up miconis 2019-04-03 11:35:25 +0200
  • f738c2b641 addition of a sparktester test, implementation of 2 different classes for testing in dnet-dedup-test module, addition of new terms in the vocabulary and change in the implementation of the JaroWinklerNormalizedName comparator miconis 2019-04-03 09:40:14 +0200
  • e9894ed089 minor changes miconis 2019-03-26 15:48:21 +0100
  • 1dbb765343 minor changes miconis 2019-03-26 15:40:40 +0100
  • f87790f701 update of the comparator for legalnames of organizations Michele De Bonis 2019-03-21 14:27:27 +0100
  • 14a07ff400 [maven-release-plugin] prepare for next development iteration Claudio Atzori 2019-02-18 09:09:14 +0100
  • d722368780 [maven-release-plugin] prepare release dnet-dedup-3.0.8 dnet-dedup-3.0.8 Claudio Atzori 2019-02-18 09:09:07 +0100
  • 27eeeec1f3 default configuration includes configurationId Claudio Atzori 2019-02-18 09:07:23 +0100
  • 63e1607d5c [maven-release-plugin] prepare for next development iteration Claudio Atzori 2019-02-17 12:56:19 +0100
  • 1b8d257036 [maven-release-plugin] prepare release dnet-dedup-3.0.7 dnet-dedup-3.0.7 Claudio Atzori 2019-02-17 12:56:11 +0100
  • cabc2d21c2 replace existing attributes when loading default configuration Claudio Atzori 2019-02-17 12:48:25 +0100
  • b02aa08833 implementation of the test classes and minor changes Michele De Bonis 2019-02-08 12:56:47 +0100
  • babf67663b modification of the README Michele De Bonis 2018-12-20 11:05:08 +0100
  • d9372745f2 modification of the README Michele De Bonis 2018-12-20 10:59:22 +0100
  • f91220980a modification of the README Michele De Bonis 2018-12-20 10:57:17 +0100
  • 0e3ce0100c modification of the README Michele De Bonis 2018-12-20 10:53:32 +0100
  • 07315ed492 modification of the README Michele De Bonis 2018-12-20 10:51:53 +0100
  • 2be03ecce9 modification of the README Michele De Bonis 2018-12-20 10:50:05 +0100
  • 6e9bf11e2d modification of the README Michele De Bonis 2018-12-20 10:47:56 +0100
  • 9ff83d6567 implementation of the decision tree for the deduplication of the authors, implementation of multiple comparators to be used in a tree node and definition of the proto for person entity Michele De Bonis 2018-12-20 09:54:41 +0100
  • 0bd20c565a implementation of the decisional tree, addition of the dnet-openaire-data-protos module, definition of the person proto, blockprocessor and paceconfig modified with addition of support for the tree processing Michele De Bonis 2018-12-12 16:30:03 +0100
  • d72960f8b9 apply limits (length, size) to pace Fields Claudio Atzori 2018-11-20 10:51:38 +0100
  • 1ff5be3f04 [maven-release-plugin] prepare for next development iteration Claudio Atzori 2018-11-19 17:41:45 +0100
  • 31b228d38b [maven-release-plugin] prepare release dnet-dedup-3.0.6 dnet-dedup-3.0.6 Claudio Atzori 2018-11-19 17:41:37 +0100
  • 75c3daf38c using released mapping-utils module Claudio Atzori 2018-11-19 17:39:28 +0100
  • e5a77f0a53 added new properties to FieldDef (size, length) to limit the information mapped onto each MapDocument Claudio Atzori 2018-11-19 17:37:57 +0100
  • db37cce4a4 [maven-release-plugin] prepare for next development iteration Claudio Atzori 2018-11-17 09:13:16 +0100
  • 4deac3f1f3 [maven-release-plugin] prepare release dnet-dedup-3.0.5 dnet-dedup-3.0.5 Claudio Atzori 2018-11-17 09:13:09 +0100
  • a0e0df1cfd added distance function for software titles Claudio Atzori 2018-11-17 09:11:38 +0100
  • 23c5a16525 addition of cities check Michele De Bonis 2018-11-16 16:11:03 +0100
  • caf5ead565 [maven-release-plugin] prepare for next development iteration Claudio Atzori 2018-11-16 09:18:00 +0100
  • 4d139bbc18 [maven-release-plugin] prepare release dnet-dedup-3.0.4 dnet-dedup-3.0.4 Claudio Atzori 2018-11-16 09:17:53 +0100
  • fa657a05e6 default (empty) configuration should be aligned with the updated model Claudio Atzori 2018-11-15 16:52:56 +0100
  • e4ae7d426a less verbose logging Claudio Atzori 2018-11-13 09:07:45 +0100
  • 9a14b0ecbc propagate exceptions in case of serialization errors, removed configuration pretty printing, removed unused class ScoredResult Claudio Atzori 2018-11-12 15:52:18 +0100
  • 71fe456a62 [maven-release-plugin] prepare for next development iteration Claudio Atzori 2018-11-12 14:23:36 +0100
  • 690bfcef1e [maven-release-plugin] prepare release dnet-dedup-3.0.3 dnet-dedup-3.0.3 Claudio Atzori 2018-11-12 14:23:29 +0100
  • 4a5f13c8f5 added more ignores Claudio Atzori 2018-11-12 14:22:19 +0100