miconis
|
3a92456fd0
|
optimize imports
|
2019-08-09 15:42:41 +02:00 |
miconis
|
4bcf353a72
|
implementation of the conditions in tree nodes. get rid of the conditions part of the configuration
|
2019-08-09 15:41:49 +02:00 |
miconis
|
72b14ec36b
|
implementation of the decision tree. It takes the place of the distance algos; necessaryConditions and sufficientConditions are still there. The model contains only the path, type and name of the field. ignoreMissing is still in the model because it is used by the conditions.
|
2019-08-09 10:08:34 +02:00 |
miconis
|
f0b4c4cbd4
|
addition of a fixSpecial function to address the problem with special characters in organization names, addition of new terms in translation maps
|
2019-08-06 17:06:05 +02:00 |
miconis
|
85070ce3fe
|
addition of the BlockUtils class for meta-blocking, implementation of a new local test with edge filtering example
|
2019-08-06 12:09:34 +02:00 |
miconis
|
2472f2b1e8
|
Merge branch 'master' of https://github.com/dnet-team/dnet-dedup
|
2019-07-19 17:10:53 +02:00 |
miconis
|
84974dcdfa
|
restyling of the JaroWinklerNormalizedName comparator, now it is optimized. Addition of some translations in the translation maps, addition of a clustering based on keywords in organizations legalnames
|
2019-07-19 17:10:29 +02:00 |
Claudio Atzori
|
19468fa864
|
[maven-release-plugin] prepare for next development iteration
|
2019-07-08 11:12:52 +02:00 |
Claudio Atzori
|
953b78ab9b
|
[maven-release-plugin] prepare release dnet-dedup-3.0.13
|
2019-07-08 11:12:45 +02:00 |
miconis
|
d5d228aef3
|
Merge branch 'master' of https://github.com/dnet-team/dnet-dedup
|
2019-07-08 11:02:29 +02:00 |
miconis
|
0509ea8d1e
|
bug fixing in the keywordsclustering class
|
2019-07-08 11:01:49 +02:00 |
Claudio Atzori
|
ceaf19c83c
|
[maven-release-plugin] prepare for next development iteration
|
2019-07-08 10:11:24 +02:00 |
Claudio Atzori
|
6314f896d1
|
[maven-release-plugin] prepare release dnet-dedup-3.0.12
|
2019-07-08 10:11:17 +02:00 |
miconis
|
8f5bc52ab2
|
[maven-release-plugin] rollback the release of dnet-dedup-3.0.12
|
2019-07-08 10:00:48 +02:00 |
miconis
|
813778d647
|
[maven-release-plugin] prepare for next development iteration
|
2019-07-08 09:48:10 +02:00 |
miconis
|
b8fb3e46aa
|
[maven-release-plugin] prepare release dnet-dedup-3.0.12
|
2019-07-08 09:48:03 +02:00 |
miconis
|
2b866cfbeb
|
addition of doi normalization in PidMatch comparator, addition of keywordsclustering (clustering based on terms in the translation maps for the organizations), minor changes
|
2019-07-08 09:44:02 +02:00 |
Claudio Atzori
|
9f6fb0e030
|
[maven-release-plugin] prepare for next development iteration
|
2019-06-19 10:02:39 +02:00 |
Claudio Atzori
|
07d1b7df15
|
[maven-release-plugin] prepare release dnet-dedup-3.0.11
|
2019-06-19 10:02:32 +02:00 |
Claudio Atzori
|
c7963d5afc
|
optimized classpath resolvers
|
2019-06-19 10:01:35 +02:00 |
Claudio Atzori
|
c9fc377712
|
[maven-release-plugin] prepare for next development iteration
|
2019-06-18 14:46:34 +02:00 |
Claudio Atzori
|
e1ee2d40b3
|
[maven-release-plugin] prepare release dnet-dedup-3.0.10
|
2019-06-18 14:46:27 +02:00 |
Claudio Atzori
|
cbec51e922
|
avoid dividing by zero: in case of missing values, return an undefined response
|
2019-06-18 14:45:15 +02:00 |
Claudio Atzori
|
7063d286e0
|
cleanup
|
2019-06-18 14:44:42 +02:00 |
miconis
|
e7d170d0eb
|
exact match condition gives undefined if a field is missing; ignoremissing semantics changed: if true, the comparison is now performed in any case; if false, it gives -1 when a field is missing
|
2019-06-18 14:05:31 +02:00 |
miconis
|
a5526f6254
|
implementation of the integration test, addition of document blocks to group entities after clustering
|
2019-05-21 16:38:26 +02:00 |
Claudio Atzori
|
3dfbf5fab7
|
[maven-release-plugin] prepare for next development iteration
|
2019-04-03 12:35:00 +02:00 |
Claudio Atzori
|
6837b59c6e
|
[maven-release-plugin] prepare release dnet-dedup-3.0.9
|
2019-04-03 12:34:52 +02:00 |
miconis
|
d4c5e293a6
|
[maven-release-plugin] rollback the release of dnet-dedup-3.0.9
|
2019-04-03 12:27:28 +02:00 |
miconis
|
4f4713c6aa
|
[maven-release-plugin] prepare for next development iteration
|
2019-04-03 12:26:05 +02:00 |
miconis
|
bb072cec20
|
[maven-release-plugin] prepare release dnet-dedup-3.0.9
|
2019-04-03 12:25:56 +02:00 |
miconis
|
3018031621
|
branch cities merged into master
|
2019-04-03 12:22:33 +02:00 |
miconis
|
f738c2b641
|
addition of a sparktester test, implementation of 2 different classes for testing in dnet-dedup-test module, addition of new terms in the vocabulary and change in the implementation of the JaroWinklerNormalizedName comparator
|
2019-04-03 09:40:14 +02:00 |
miconis
|
e9894ed089
|
minor changes
|
2019-03-26 15:48:21 +01:00 |
Michele De Bonis
|
f87790f701
|
update of the comparator for legalnames of organizations
|
2019-03-21 14:27:27 +01:00 |
Claudio Atzori
|
14a07ff400
|
[maven-release-plugin] prepare for next development iteration
|
2019-02-18 09:09:14 +01:00 |
Claudio Atzori
|
d722368780
|
[maven-release-plugin] prepare release dnet-dedup-3.0.8
|
2019-02-18 09:09:07 +01:00 |
Claudio Atzori
|
27eeeec1f3
|
default configuration includes configurationId
|
2019-02-18 09:07:23 +01:00 |
Claudio Atzori
|
63e1607d5c
|
[maven-release-plugin] prepare for next development iteration
|
2019-02-17 12:56:19 +01:00 |
Claudio Atzori
|
1b8d257036
|
[maven-release-plugin] prepare release dnet-dedup-3.0.7
|
2019-02-17 12:56:11 +01:00 |
Claudio Atzori
|
cabc2d21c2
|
replace existing attributes when loading default configuration
|
2019-02-17 12:48:25 +01:00 |
Michele De Bonis
|
b02aa08833
|
implementation of the test classes and minor changes
|
2019-02-08 12:56:47 +01:00 |
Michele De Bonis
|
9ff83d6567
|
implementation of the decision tree for the deduplication of the authors, implementation of multiple comparators to be used in a tree node and definition of the proto for person entity
|
2018-12-20 09:54:41 +01:00 |
Michele De Bonis
|
0bd20c565a
|
implementation of the decision tree, addition of the dnet-openaire-data-protos module, definition of the person proto, blockprocessor and paceconfig modified with addition of support for the tree processing
|
2018-12-12 16:30:03 +01:00 |
Claudio Atzori
|
d72960f8b9
|
apply limits (length, size) to pace Fields
|
2018-11-20 10:51:38 +01:00 |
Claudio Atzori
|
1ff5be3f04
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-19 17:41:45 +01:00 |
Claudio Atzori
|
31b228d38b
|
[maven-release-plugin] prepare release dnet-dedup-3.0.6
|
2018-11-19 17:41:37 +01:00 |
Claudio Atzori
|
e5a77f0a53
|
added new properties to FieldDef (size, length) to limit the information mapped onto each MapDocument
|
2018-11-19 17:37:57 +01:00 |
Claudio Atzori
|
db37cce4a4
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-17 09:13:16 +01:00 |
Claudio Atzori
|
4deac3f1f3
|
[maven-release-plugin] prepare release dnet-dedup-3.0.5
|
2018-11-17 09:13:09 +01:00 |
Claudio Atzori
|
a0e0df1cfd
|
added distance function for software titles
|
2018-11-17 09:11:38 +01:00 |
Michele De Bonis
|
23c5a16525
|
addition of cities check
|
2018-11-16 16:11:03 +01:00 |
Claudio Atzori
|
caf5ead565
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-16 09:18:00 +01:00 |
Claudio Atzori
|
4d139bbc18
|
[maven-release-plugin] prepare release dnet-dedup-3.0.4
|
2018-11-16 09:17:53 +01:00 |
Claudio Atzori
|
fa657a05e6
|
default (empty) configuration should be aligned with the updated model
|
2018-11-15 16:52:56 +01:00 |
Claudio Atzori
|
e4ae7d426a
|
less verbose logging
|
2018-11-13 09:07:45 +01:00 |
Claudio Atzori
|
9a14b0ecbc
|
propagate exceptions in case of serialization errors, removed configuration pretty printing, removed unused class ScoredResult
|
2018-11-12 15:52:18 +01:00 |
Claudio Atzori
|
71fe456a62
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-12 14:23:36 +01:00 |
Claudio Atzori
|
690bfcef1e
|
[maven-release-plugin] prepare release dnet-dedup-3.0.3
|
2018-11-12 14:23:29 +01:00 |
Michele De Bonis
|
3a517a6551
|
Merge branch 'master' of https://github.com/dnet-team/dnet-dedup
|
2018-11-12 14:11:26 +01:00 |
Michele De Bonis
|
33387a3532
|
configuration file updated, addition of condition on domain
|
2018-11-12 14:11:15 +01:00 |
Claudio Atzori
|
1f9b908d6c
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-12 12:46:50 +01:00 |
Claudio Atzori
|
99379e2505
|
[maven-release-plugin] prepare release dnet-dedup-3.0.2
|
2018-11-12 12:46:42 +01:00 |
Claudio Atzori
|
925a437597
|
getting rid of spark libs from dnet-pace-core
|
2018-11-12 12:46:06 +01:00 |
Claudio Atzori
|
c7d6b1a41a
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-12 11:40:42 +01:00 |
Claudio Atzori
|
4c69ddd384
|
[maven-release-plugin] prepare release dnet-dedup-3.0.1
|
2018-11-12 11:40:34 +01:00 |
Claudio Atzori
|
d850ba26c1
|
[maven-release-plugin] rollback the release of dnet-dedup-3.0.1
|
2018-11-12 11:39:07 +01:00 |
Claudio Atzori
|
70f80334d8
|
[maven-release-plugin] prepare release dnet-dedup-3.0.1
|
2018-11-12 11:38:52 +01:00 |
Claudio Atzori
|
7943d4bb6b
|
[maven-release-plugin] rollback the release of dnet-dedup-3.0.1
|
2018-11-12 11:28:28 +01:00 |
Claudio Atzori
|
18944f8b5f
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-12 11:24:06 +01:00 |
Claudio Atzori
|
5ec9e552fe
|
[maven-release-plugin] prepare release dnet-dedup-3.0.1
|
2018-11-12 11:23:57 +01:00 |
Michele De Bonis
|
c84b5005e6
|
configuration files changed: dedupRun instead of run, assertion updated in tests
|
2018-11-06 11:02:00 +01:00 |
Michele De Bonis
|
5d81c04d0b
|
deleted useless imports
|
2018-11-06 09:48:22 +01:00 |
Michele De Bonis
|
4337e83950
|
implementation of JaroWinklerNormalizedName, addition of various stopwords in different languages and configuration test
|
2018-11-05 17:22:59 +01:00 |
Claudio Atzori
|
9f513352fb
|
added DiffPatchMatch utility. Resumed commented tests!
|
2018-10-31 10:49:11 +01:00 |
Michele De Bonis
|
7c59c3ebf0
|
serialization test added. useless getter methods ignored by json serialization
|
2018-10-29 16:16:11 +01:00 |
Michele De Bonis
|
0d03030694
|
DedupConf parsed using Jackson library
|
2018-10-29 11:13:55 +01:00 |
Michele De Bonis
|
0375f1cec9
|
implementation of the toString methods changed: from Gson to Jackson
|
2018-10-26 14:55:59 +02:00 |
Michele De Bonis
|
d059bf68b8
|
modification in the initialization of clustering functions, distance algos and conditions.
|
2018-10-25 15:15:40 +02:00 |
Michele De Bonis
|
1d678ddc9c
|
update in the discovery of clustering, conditions and distance functions (annotated with custom annotations)
|
2018-10-24 12:09:41 +02:00 |
Claudio Atzori
|
bc4505e0e6
|
revised PidMatch implementation, cleanup
|
2018-10-20 08:38:19 +02:00 |
Claudio Atzori
|
8cc925f017
|
[maven-release-plugin] prepare for next development iteration
|
2018-10-18 12:17:34 +02:00 |
Claudio Atzori
|
69e3811dc8
|
[maven-release-plugin] prepare release dnet-dedup-3.0.0
|
2018-10-18 12:17:27 +02:00 |
Claudio Atzori
|
b30cd0ccc3
|
[maven-release-plugin] rollback the release of dnet-dedup-3.0.0
|
2018-10-18 12:13:03 +02:00 |
Claudio Atzori
|
10b80a22ae
|
[maven-release-plugin] prepare release dnet-dedup-3.0.0
|
2018-10-18 12:12:45 +02:00 |
Claudio Atzori
|
40f93612fe
|
[maven-release-plugin] rollback the release of dnet-dedup-3.0.0
|
2018-10-18 12:00:45 +02:00 |
Claudio Atzori
|
9a60794c6f
|
[maven-release-plugin] prepare for next development iteration
|
2018-10-18 11:58:43 +02:00 |
Claudio Atzori
|
ff003d0fc0
|
[maven-release-plugin] prepare release dnet-dedup-3.0.0
|
2018-10-18 11:58:36 +02:00 |
Claudio Atzori
|
f27655e96c
|
updated maven project structure
|
2018-10-18 11:56:26 +02:00 |
Michele De Bonis
|
1f0eeaf7ab
|
update of the spark test
|
2018-10-18 10:12:44 +02:00 |
Sandro La Bruzzo
|
67e5f9858b
|
Added FSpark Implementation of dedup
|
2018-10-11 15:19:20 +02:00 |
Sandro La Bruzzo
|
d0edb7b773
|
Added First Implementation of Spark Test
|
2018-10-02 17:07:17 +02:00 |
Sandro La Bruzzo
|
a043d0c716
|
added d-net pace core module and ignored target folder
|
2018-10-02 10:37:54 +02:00 |