Giambattista Bloisi
b0ade43608
Precompile blacklist patterns before evaluating clustering criteria
Enable JUnit 5 tests in Maven builds
Make path comparisons platform-independent
Read String resource files assuming they are encoded in UTF-8
Fix a few test conditions
2023-06-16 09:41:11 +02:00
Michele De Bonis
cb595c87bb
implementation of the support for authors deduplication: cosinesimilarity comparator and double array json parser
2023-04-17 11:06:27 +02:00
Michele De Bonis
6a6c266dde
implementation of author dedup configuration and lnfi clustering function
2023-01-31 11:53:10 +01:00
Michele De Bonis
14f6346676
implementation of the new software configuration
2022-11-22 17:48:34 +01:00
Michele De Bonis
9fee2ed611
minor changes
2022-11-21 14:35:46 +01:00
miconis
9ddd24ba36
implementation of comparators and clustering function for the author deduplication
2022-04-19 10:18:09 +02:00
miconis
a965233dd0
bug fix in the normalization of a legalname, city map updated and transliteration support added
2022-03-15 14:59:13 +01:00
miconis
2f1ba56f61
bug fix in the authormatch comparator, implementation of tests
2022-01-13 11:58:28 +01:00
miconis
a224bf70a4
implementation of new comparators for publication dedup configuration update
2021-12-27 17:35:02 +01:00
miconis
8f1db32921
implementation of the instance type comparator and its tests
2021-11-04 15:20:57 +01:00
miconis
fbb1b66bfb
dedup test implementation & graph drawing tools
2021-09-13 14:53:19 +02:00
miconis
4bce4f2e8e
minor change: version updated
2021-05-03 16:05:39 +02:00
miconis
4988e9f80d
implementation of cross comparison for different fields, addition of clustering mechanism to collapse keys from different clustering functions on the same cluster
2021-05-03 15:37:41 +02:00
miconis
ed0d5d3e1d
implementation of the workflow to dedup entities, addition of the module to run the workflow on the cluster
2020-12-04 15:41:31 +01:00
miconis
07ab904d60
implementation of the clustering function for the suffixprefix chain
2020-07-16 18:57:55 +02:00
miconis
f933fd33e0
implemented new function for clustering
2020-07-02 17:04:17 +02:00
miconis
6e9b27f37d
implementation of the mechanism to truncate the string and the lists
2020-04-24 14:36:42 +02:00
miconis
5c8f6febee
minor changes in comparators
2020-01-24 10:01:11 +01:00
miconis
b3748b8d77
minor changes
2019-12-18 16:20:35 +01:00
miconis
b21b1b8f61
implementation of new aggregation in the tree node processing
2019-12-18 16:19:36 +01:00
miconis
20fcfe6328
implementation of new aggregation in the tree node processing
2019-12-18 16:19:26 +01:00
Sandro La Bruzzo
d924f28b93
fixed wrong use of jspath
2019-12-18 09:29:44 +01:00
miconis
84aaa65501
implementation of new json comparator and update of the publication configuration
2019-12-17 09:16:26 +01:00
Sandro La Bruzzo
5c01ae4c92
merged JqMapping branch into tree2
2019-12-13 11:30:02 +01:00
Sandro La Bruzzo
16c670a5d5
Improved deduplication
2019-12-05 14:14:25 +01:00
miconis
49f9beb4a8
implementation of romansmatch and re-implementation of the getNumber function. New terms in the translation map and update of the configuration
2019-11-28 16:54:44 +01:00
miconis
f791730330
addition of one term to the translation maps in the configurations
2019-11-27 15:48:37 +01:00
miconis
8c0d346005
the param map has been updated: now it accepts string parameters
2019-11-21 09:37:56 +01:00
miconis
ddd40540aa
jarowinklernormalizedname split into 3 different comparators: citymatch, keywordmatch and jarowinkler. Implementation of the TreeStatistic support functions
2019-11-20 10:45:00 +01:00
miconis
0973899865
code cleaning, distribution of the classes in packages and implementation of the new configuration
2019-11-07 12:47:12 +01:00
miconis
30a873265f
merged the latest changes from the master branch into tree2. Addition of the configuration as a parameter of the comparator, so that the comparator can access it
2019-10-29 16:38:42 +01:00
miconis
5f249fd56c
minor changes
2019-10-23 16:37:20 +02:00
miconis
c9863debfa
minor changes and configuration updates (synonym field added)
2019-10-23 16:31:45 +02:00
miconis
50b7a12b3f
normalization of the term in the translation map added
2019-10-08 15:13:45 +02:00
miconis
26b383fea2
translation map moved into the JSON configuration, support for synonyms added in the configuration, the configuration is now an argument of conditions, distancealgos and clusteringfunctions
2019-10-08 14:53:52 +02:00
Claudio Atzori
74c6462b49
updated translation map and some tests
2019-09-25 10:15:13 +02:00
miconis
d71dae5fd2
implementation of the conditions in tree nodes. get rid of the conditions part of the configuration
2019-08-09 15:41:49 +02:00
miconis
a5c5d2f01b
implementation of the decision tree. It takes the place of the distance algos; necessaryConditions and sufficientConditions are still there. The model contains only path, type and name of the field. ignoreMissing is still in the model because it is used by the conditions.
2019-08-09 10:08:34 +02:00
miconis
8c867101ef
addition of a fixSpecial function to address the problem with special characters in organization names, addition of new terms in translation maps
2019-08-06 17:06:05 +02:00
miconis
4502b44337
addition of the BlockUtils class for meta-blocking, implementation of a new local test with edge filtering example
2019-08-06 12:09:34 +02:00
miconis
a85576c27e
restyling of the JaroWinklerNormalizedName comparator, now it is optimized. Addition of some translations in the translation maps, addition of a clustering based on keywords in organizations legalnames
2019-07-19 17:10:29 +02:00
miconis
3c6f8d1e44
bug fixing in the keywordsclustering class
2019-07-08 11:01:49 +02:00
miconis
15bec5e876
addition of doi normalization in PidMatch comparator, addition of keywordsclustering (clustering based on terms in the translation maps for the organizations), minor changes
2019-07-08 09:44:02 +02:00
miconis
54e4d0af04
exact match condition gives undefined if a field is missing; ignoremissing semantics changed: if true, the comparison is performed in any case; if false, it gives -1 when a field is missing
2019-06-18 14:05:31 +02:00
miconis
7e7018c51f
addition of a sparktester test, implementation of 2 different classes for testing in dnet-dedup-test module, addition of new terms in the vocabulary and change in the implementation of the JaroWinklerNormalizedName comparator
2019-04-03 09:40:14 +02:00
miconis
4bd5a9beee
minor changes
2019-03-26 15:48:21 +01:00
Michele De Bonis
662448e584
update of the comparator for legalnames of organizations
2019-03-21 14:27:27 +01:00
Michele De Bonis
0735f3a822
implementation of the test classes and minor changes
2019-02-08 12:56:47 +01:00
Michele De Bonis
7a8d28991f
implementation of the decision tree for the deduplication of the authors, implementation of multiple comparators to be used in a tree node and definition of the proto for person entity
2018-12-20 09:54:41 +01:00
Michele De Bonis
3d4372ced9
addition of cities check
2018-11-16 16:11:03 +01:00