Commit Graph

168 Commits

Author SHA1 Message Date
miconis 2b866cfbeb addition of doi normalization in PidMatch comparator, addition of keywordsclustering (clustering based on terms in the translation maps for the organizations), minor changes 2019-07-08 09:44:02 +02:00
Claudio Atzori 9f6fb0e030 [maven-release-plugin] prepare for next development iteration 2019-06-19 10:02:39 +02:00
Claudio Atzori 07d1b7df15 [maven-release-plugin] prepare release dnet-dedup-3.0.11 2019-06-19 10:02:32 +02:00
Claudio Atzori c9fc377712 [maven-release-plugin] prepare for next development iteration 2019-06-18 14:46:34 +02:00
Claudio Atzori e1ee2d40b3 [maven-release-plugin] prepare release dnet-dedup-3.0.10 2019-06-18 14:46:27 +02:00
miconis e7d170d0eb exact match condition gives undefined if a field is missing, ignoremissing semantics changed: now performs the comparison in any case if =true, if false gives -1 in case of missing 2019-06-18 14:05:31 +02:00
miconis a5526f6254 implementation of the integration test, addition of document blocks to group entities after clustering 2019-05-21 16:38:26 +02:00
Claudio Atzori 3dfbf5fab7 [maven-release-plugin] prepare for next development iteration 2019-04-03 12:35:00 +02:00
Claudio Atzori 6837b59c6e [maven-release-plugin] prepare release dnet-dedup-3.0.9 2019-04-03 12:34:52 +02:00
miconis d4c5e293a6 [maven-release-plugin] rollback the release of dnet-dedup-3.0.9 2019-04-03 12:27:28 +02:00
miconis 4f4713c6aa [maven-release-plugin] prepare for next development iteration 2019-04-03 12:26:05 +02:00
miconis bb072cec20 [maven-release-plugin] prepare release dnet-dedup-3.0.9 2019-04-03 12:25:56 +02:00
miconis 3018031621 branch cities merged into master 2019-04-03 12:22:33 +02:00
miconis 14c3afba23 clean up 2019-04-03 11:35:25 +02:00
miconis f738c2b641 addition of a sparktester test, implementation of 2 different classes for testing in dnet-dedup-test module, addition of new terms in the vocabulary and change in the implementation of the JaroWinklerNormalizedName comparator 2019-04-03 09:40:14 +02:00
miconis e9894ed089 minor changes 2019-03-26 15:48:21 +01:00
miconis 1dbb765343 minor changes 2019-03-26 15:40:40 +01:00
Michele De Bonis f87790f701 update of the comparator for legalnames of organizations 2019-03-21 14:27:27 +01:00
Claudio Atzori 14a07ff400 [maven-release-plugin] prepare for next development iteration 2019-02-18 09:09:14 +01:00
Claudio Atzori d722368780 [maven-release-plugin] prepare release dnet-dedup-3.0.8 2019-02-18 09:09:07 +01:00
Claudio Atzori 63e1607d5c [maven-release-plugin] prepare for next development iteration 2019-02-17 12:56:19 +01:00
Claudio Atzori 1b8d257036 [maven-release-plugin] prepare release dnet-dedup-3.0.7 2019-02-17 12:56:11 +01:00
Michele De Bonis b02aa08833 implementation of the test classes and minor changes 2019-02-08 12:56:47 +01:00
Michele De Bonis 9ff83d6567 implementation of the decision tree for the deduplication of the authors, implementation of multiple comparators to be used in a tree node and definition of the proto for person entity 2018-12-20 09:54:41 +01:00
Michele De Bonis 0bd20c565a implementation of the decisional tree, addition of the dnet-openaire-data-protos module, definition of the person proto, blockprocessor and paceconfig modified with addition of support for the tree processing 2018-12-12 16:30:03 +01:00
Claudio Atzori d72960f8b9 apply limits (length, size) to pace Fields 2018-11-20 10:51:38 +01:00
Claudio Atzori 1ff5be3f04 [maven-release-plugin] prepare for next development iteration 2018-11-19 17:41:45 +01:00
Claudio Atzori 31b228d38b [maven-release-plugin] prepare release dnet-dedup-3.0.6 2018-11-19 17:41:37 +01:00
Claudio Atzori e5a77f0a53 added new properties to FieldDef (size, length) to limit the information mapped onto each MapDocument 2018-11-19 17:37:57 +01:00
Claudio Atzori db37cce4a4 [maven-release-plugin] prepare for next development iteration 2018-11-17 09:13:16 +01:00
Claudio Atzori 4deac3f1f3 [maven-release-plugin] prepare release dnet-dedup-3.0.5 2018-11-17 09:13:09 +01:00
Michele De Bonis 23c5a16525 addition of cities check 2018-11-16 16:11:03 +01:00
Claudio Atzori caf5ead565 [maven-release-plugin] prepare for next development iteration 2018-11-16 09:18:00 +01:00
Claudio Atzori 4d139bbc18 [maven-release-plugin] prepare release dnet-dedup-3.0.4 2018-11-16 09:17:53 +01:00
Claudio Atzori 71fe456a62 [maven-release-plugin] prepare for next development iteration 2018-11-12 14:23:36 +01:00
Claudio Atzori 690bfcef1e [maven-release-plugin] prepare release dnet-dedup-3.0.3 2018-11-12 14:23:29 +01:00
Michele De Bonis 3a517a6551 Merge branch 'master' of https://github.com/dnet-team/dnet-dedup 2018-11-12 14:11:26 +01:00
Michele De Bonis 33387a3532 configuration file updated, addition of condition on domain 2018-11-12 14:11:15 +01:00
Claudio Atzori 1f9b908d6c [maven-release-plugin] prepare for next development iteration 2018-11-12 12:46:50 +01:00
Claudio Atzori 99379e2505 [maven-release-plugin] prepare release dnet-dedup-3.0.2 2018-11-12 12:46:42 +01:00
Claudio Atzori c7d6b1a41a [maven-release-plugin] prepare for next development iteration 2018-11-12 11:40:42 +01:00
Claudio Atzori 4c69ddd384 [maven-release-plugin] prepare release dnet-dedup-3.0.1 2018-11-12 11:40:34 +01:00
Claudio Atzori d850ba26c1 [maven-release-plugin] rollback the release of dnet-dedup-3.0.1 2018-11-12 11:39:07 +01:00
Claudio Atzori 70f80334d8 [maven-release-plugin] prepare release dnet-dedup-3.0.1 2018-11-12 11:38:52 +01:00
Claudio Atzori 7943d4bb6b [maven-release-plugin] rollback the release of dnet-dedup-3.0.1 2018-11-12 11:28:28 +01:00
Claudio Atzori 18944f8b5f [maven-release-plugin] prepare for next development iteration 2018-11-12 11:24:06 +01:00
Claudio Atzori 5ec9e552fe [maven-release-plugin] prepare release dnet-dedup-3.0.1 2018-11-12 11:23:57 +01:00
Michele De Bonis c16d58e495 updated dnet-openaireplus-mapping-utils dependency 2018-11-06 12:09:35 +01:00
Michele De Bonis c84b5005e6 configuration files changed: dedupRun instead of run, assertion updated in tests 2018-11-06 11:02:00 +01:00
Michele De Bonis 5d81c04d0b deleted useless imports 2018-11-06 09:48:22 +01:00
Michele De Bonis 4337e83950 implementation of JaroWinklerNormalizedName, addition of various stopwords in different languages and configuration test 2018-11-05 17:22:59 +01:00
Claudio Atzori 9f513352fb added DiffPatchMatch utility. Resumed commented tests! 2018-10-31 10:49:11 +01:00
Michele De Bonis 1d678ddc9c update in the discovery of clustering, conditions and distance functions (annotated with custom annotations) 2018-10-24 12:09:41 +02:00
Claudio Atzori bc4505e0e6 revised PidMatch implementation, cleanup 2018-10-20 08:38:19 +02:00
Claudio Atzori 0bab8cf704 tests and relative resources migrated from openaire-mapping-utils 2018-10-18 15:30:51 +02:00
Claudio Atzori 8cc925f017 [maven-release-plugin] prepare for next development iteration 2018-10-18 12:17:34 +02:00
Claudio Atzori 69e3811dc8 [maven-release-plugin] prepare release dnet-dedup-3.0.0 2018-10-18 12:17:27 +02:00
Claudio Atzori b30cd0ccc3 [maven-release-plugin] rollback the release of dnet-dedup-3.0.0 2018-10-18 12:13:03 +02:00
Claudio Atzori 10b80a22ae [maven-release-plugin] prepare release dnet-dedup-3.0.0 2018-10-18 12:12:45 +02:00
Claudio Atzori 40f93612fe [maven-release-plugin] rollback the release of dnet-dedup-3.0.0 2018-10-18 12:00:45 +02:00
Claudio Atzori 9a60794c6f [maven-release-plugin] prepare for next development iteration 2018-10-18 11:58:43 +02:00
Claudio Atzori ff003d0fc0 [maven-release-plugin] prepare release dnet-dedup-3.0.0 2018-10-18 11:58:36 +02:00
Claudio Atzori f27655e96c updated maven project structure 2018-10-18 11:56:26 +02:00
Michele De Bonis 1f0eeaf7ab update of the spark test 2018-10-18 10:12:44 +02:00
Sandro La Bruzzo 674ea3909f Added First Spark Implementation of dedup 2018-10-12 12:53:47 +02:00
Sandro La Bruzzo 67e5f9858b Added FSpark Implementation of dedup 2018-10-11 15:19:20 +02:00
Sandro La Bruzzo d0edb7b773 Added First Implementation of Spark Test 2018-10-02 17:07:17 +02:00
Sandro La Bruzzo e65ecfcc3a added module to test dnet-pace-core 2018-10-02 10:55:29 +02:00