Sandro La Bruzzo
|
37b84b6afa
|
Added description of the module
|
2019-03-13 14:53:35 +01:00 |
luosolo
|
1eb0281b38
|
refactored structure of the project
|
2019-03-13 14:43:20 +01:00 |
luosolo
|
c10770cd3e
|
added module that allows collecting data into HDFS
|
2019-03-12 15:40:55 +01:00 |
Claudio Atzori
|
f2394fcd9f
|
[maven-release-plugin] prepare for next development iteration
|
2019-02-18 09:09:14 +01:00 |
Claudio Atzori
|
722431dde1
|
[maven-release-plugin] prepare release dnet-dedup-3.0.8
|
2019-02-18 09:09:07 +01:00 |
Claudio Atzori
|
470c4b0f20
|
default configuration includes configurationId
|
2019-02-18 09:07:23 +01:00 |
Claudio Atzori
|
ccb7e83196
|
[maven-release-plugin] prepare for next development iteration
|
2019-02-17 12:56:19 +01:00 |
Claudio Atzori
|
7d8e62d4cc
|
[maven-release-plugin] prepare release dnet-dedup-3.0.7
|
2019-02-17 12:56:11 +01:00 |
Claudio Atzori
|
968cd47436
|
replace existing attributes when loading default configuration
|
2019-02-17 12:48:25 +01:00 |
Michele De Bonis
|
0735f3a822
|
implementation of the test classes and minor changes
|
2019-02-08 12:56:47 +01:00 |
Michele De Bonis
|
7a8d28991f
|
implementation of the decision tree for the deduplication of the authors, implementation of multiple comparators to be used in a tree node and definition of the proto for person entity
|
2018-12-20 09:54:41 +01:00 |
Michele De Bonis
|
39613dbbd6
|
implementation of the decision tree, addition of the dnet-openaire-data-protos module, definition of the person proto, blockprocessor and paceconfig modified with addition of support for the tree processing
|
2018-12-12 16:30:03 +01:00 |
Claudio Atzori
|
f1c68d8ba3
|
apply limits (length, size) to pace Fields
|
2018-11-20 10:51:38 +01:00 |
Claudio Atzori
|
c5979ffe18
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-19 17:41:45 +01:00 |
Claudio Atzori
|
9869dff1d2
|
[maven-release-plugin] prepare release dnet-dedup-3.0.6
|
2018-11-19 17:41:37 +01:00 |
Claudio Atzori
|
c2d4cb3ba6
|
added new properties to FieldDef (size, length) to limit the information mapped onto each MapDocument
|
2018-11-19 17:37:57 +01:00 |
Claudio Atzori
|
394fcafd41
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-17 09:13:16 +01:00 |
Claudio Atzori
|
397554130c
|
[maven-release-plugin] prepare release dnet-dedup-3.0.5
|
2018-11-17 09:13:09 +01:00 |
Claudio Atzori
|
0dfb2ea600
|
added distance function for software titles
|
2018-11-17 09:11:38 +01:00 |
Michele De Bonis
|
3d4372ced9
|
addition of cities check
|
2018-11-16 16:11:03 +01:00 |
Claudio Atzori
|
55a9b4f501
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-16 09:18:00 +01:00 |
Claudio Atzori
|
35ab630493
|
[maven-release-plugin] prepare release dnet-dedup-3.0.4
|
2018-11-16 09:17:53 +01:00 |
Claudio Atzori
|
399e4bc80f
|
default (empty) configuration should be aligned with the updated model
|
2018-11-15 16:52:56 +01:00 |
Claudio Atzori
|
59bab8dba4
|
less verbose logging
|
2018-11-13 09:07:45 +01:00 |
Claudio Atzori
|
478ad72cb8
|
propagate exceptions in case of serialization errors, removed configuration pretty printing, removed unused class ScoredResult
|
2018-11-12 15:52:18 +01:00 |
Claudio Atzori
|
f7616c7a8a
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-12 14:23:36 +01:00 |
Claudio Atzori
|
df4b871c8b
|
[maven-release-plugin] prepare release dnet-dedup-3.0.3
|
2018-11-12 14:23:29 +01:00 |
Michele De Bonis
|
72a9b3139e
|
Merge branch 'master' of https://github.com/dnet-team/dnet-dedup
|
2018-11-12 14:11:26 +01:00 |
Michele De Bonis
|
b5062f5429
|
configuration file updated, addition of condition on domain
|
2018-11-12 14:11:15 +01:00 |
Claudio Atzori
|
2a509b18fa
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-12 12:46:50 +01:00 |
Claudio Atzori
|
e247218987
|
[maven-release-plugin] prepare release dnet-dedup-3.0.2
|
2018-11-12 12:46:42 +01:00 |
Claudio Atzori
|
b7bc7f0401
|
getting rid of spark libs from dnet-pace-core
|
2018-11-12 12:46:06 +01:00 |
Claudio Atzori
|
3dacba37ea
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-12 11:40:42 +01:00 |
Claudio Atzori
|
8cc2517f5d
|
[maven-release-plugin] prepare release dnet-dedup-3.0.1
|
2018-11-12 11:40:34 +01:00 |
Claudio Atzori
|
851ae5eec3
|
[maven-release-plugin] rollback the release of dnet-dedup-3.0.1
|
2018-11-12 11:39:07 +01:00 |
Claudio Atzori
|
f283d58a6e
|
[maven-release-plugin] prepare release dnet-dedup-3.0.1
|
2018-11-12 11:38:52 +01:00 |
Claudio Atzori
|
6d09041288
|
[maven-release-plugin] rollback the release of dnet-dedup-3.0.1
|
2018-11-12 11:28:28 +01:00 |
Claudio Atzori
|
46cee13596
|
[maven-release-plugin] prepare for next development iteration
|
2018-11-12 11:24:06 +01:00 |
Claudio Atzori
|
e1c69ad24e
|
[maven-release-plugin] prepare release dnet-dedup-3.0.1
|
2018-11-12 11:23:57 +01:00 |
Michele De Bonis
|
b247a86e69
|
configuration files changed: dedupRun instead of run, assertion updated in tests
|
2018-11-06 11:02:00 +01:00 |
Michele De Bonis
|
4c8485d0bb
|
deleted useless imports
|
2018-11-06 09:48:22 +01:00 |
Michele De Bonis
|
748189af10
|
implementation of JaroWinklerNormalizedName, addition of various stopwords in different languages and configuration test
|
2018-11-05 17:22:59 +01:00 |
Claudio Atzori
|
e296f7a81c
|
added DiffPatchMatch utility. Resumed commented tests!
|
2018-10-31 10:49:11 +01:00 |
Michele De Bonis
|
dc41b76643
|
serialization test added. useless getter methods ignored by json serialization
|
2018-10-29 16:16:11 +01:00 |
Michele De Bonis
|
ea36007d1f
|
DedupConf parsed using Jackson library
|
2018-10-29 11:13:55 +01:00 |
Michele De Bonis
|
8b4762bf54
|
implementation of the toString methods changed: from Gson to Jackson
|
2018-10-26 14:55:59 +02:00 |
Michele De Bonis
|
3cf3dc1934
|
modification in the initialization of clustering functions, distance algos and conditions.
|
2018-10-25 15:15:40 +02:00 |
Michele De Bonis
|
1cbbc3f15a
|
update in the discovery of clustering, conditions and distance functions (annotated with custom annotations)
|
2018-10-24 12:09:41 +02:00 |
Claudio Atzori
|
4d379c2227
|
revised PidMatch implementation, cleanup
|
2018-10-20 08:38:19 +02:00 |
Claudio Atzori
|
3197f26691
|
[maven-release-plugin] prepare for next development iteration
|
2018-10-18 12:17:34 +02:00 |