104 Commits

Author SHA1 Message Date
Giambattista Bloisi d2d173773e Precompile blacklists patterns before evaluating clustering criteria
Enable Junit 5 tests in maven builds
Make path comparisons platform-independent
Read String resource files assuming they are encoded in UTF-8
Fix a few test conditions
2023-06-16 09:41:11 +02:00
Michele De Bonis 7e2e7dcdcd implementation of the support for authors deduplication: cosinesimilarity comparator and double array json parser 2023-04-17 11:06:27 +02:00
Michele De Bonis b5584f084a minor change in the author match which now can compute count and percentage 2023-04-04 17:10:37 +02:00
Michele De Bonis 66472ce408 implementation of author dedup configuration and lnfi clustering function 2023-01-31 11:53:10 +01:00
Michele De Bonis 00466512ea implementation of the new software configuration 2022-11-22 17:48:34 +01:00
Michele De Bonis 42cff050e7 minor changes 2022-11-21 14:35:46 +01:00
miconis 6c47fb0e67 implementation of comparators and clustering function for the author deduplication 2022-04-19 10:18:09 +02:00
miconis 9618e889bd test implementation for the new fdup version 2022-04-13 09:48:56 +02:00
miconis b2cbc09fda bug fix in the normalization of a legalname, city map updated and transliteration support added 2022-03-15 14:59:13 +01:00
miconis cb72ce0a22 bug fix in the AuthorMatch, implementation of the concat function in the model creation with jpath query 2022-03-09 12:53:09 +01:00
miconis 613b32fcf7 implementation of the size threshold on authors list match 2022-03-08 16:49:28 +01:00
miconis de66199001 minor change 2022-01-13 17:20:20 +01:00
miconis e168d95ec0 bug fix in the authormatch comparator, implementation of tests 2022-01-13 11:58:28 +01:00
miconis 5e8757a457 implementation of new comparators for publication dedup configuration update 2021-12-27 17:35:02 +01:00
miconis 451114418d implementation of the instance type comparator and its tests 2021-11-04 15:20:57 +01:00
miconis 5a52aed8e1 dedup test implementation & graph drawing tools 2021-09-13 14:53:19 +02:00
miconis 67a0f965e7 minor change: version updated 2021-05-03 16:05:39 +02:00
miconis fad803bd46 implementation of cross comparison for different fields, addition of clustering mechanism to collapse keys from different clustering functions on the same cluster 2021-05-03 15:37:41 +02:00
miconis e65526848a implementation of the wf to dedup entities, addition of the module to run the wf on the cluster 2020-12-04 15:41:31 +01:00
miconis 5021e5048f fixed error in the treeprocessor. it used th=-1 as default value, now it use th=1 2020-09-29 12:01:25 +02:00
miconis 9e8ea8f6ee fixed error in the block processor: entities with orderField=null were not considered 2020-09-19 17:43:41 +02:00
Sandro La Bruzzo eea8e87b25 fixed NPE 2020-08-06 10:27:05 +02:00
miconis 7188648bdc implementation of the clustering function for the suffixprefix chain 2020-07-16 18:57:55 +02:00
Claudio Atzori 20848c1c6e reverted to 4.0.3-SNAPSHOT 2020-07-15 17:37:36 +02:00
Claudio Atzori 055adbb56d Revert "wordssuffixprefix: adjust the token length according to the number of words; removed maven release temporary files"
This reverts commit ecebdff026.
2020-07-15 17:35:56 +02:00
Claudio Atzori ecebdff026 wordssuffixprefix: adjust the token length according to the number of words; removed maven release temporary files 2020-07-15 17:13:45 +02:00
Claudio Atzori 1262f3dd8e Revert "wordssuffixprefix: adjust the token length according to the number of words; removed maven release temporary files"
This reverts commit b46be9c8ae.
2020-07-15 17:11:46 +02:00
Claudio Atzori b46be9c8ae wordssuffixprefix: adjust the token length according to the number of words; removed maven release temporary files 2020-07-15 16:49:47 +02:00
miconis 33eadb7c9c implemented new function for clustering 2020-07-02 17:04:17 +02:00
miconis 7bc00a3f5f implementation of the mechanism to truncate the string and the lists 2020-04-24 14:36:42 +02:00
Sandro La Bruzzo b6c4f4acf3 upgraded maven version of commons-lang 2020-02-10 12:38:40 +01:00
miconis eeeb374480 minor changes in comparators 2020-01-24 10:01:11 +01:00
miconis 6a27fb14a8 update in the implementation of the tree: addition of new logic aggregations and statistics 2020-01-14 11:42:43 +02:00
miconis 43404db44f minor changes 2019-12-18 16:20:35 +01:00
miconis 72ca3bb9ba implementation of new aggregation in the tree node processing 2019-12-18 16:19:36 +01:00
miconis 4af490221b implementation of new aggregation in the tree node processing 2019-12-18 16:19:26 +01:00
Sandro La Bruzzo 492049b8bc fixed wrong use of jspath 2019-12-18 09:29:44 +01:00
miconis 159cb2a493 implementation of new json comparator and update of the publication configuration 2019-12-17 09:16:26 +01:00
Sandro La Bruzzo d09193a094 merged JqMapping branch into tree2 2019-12-13 11:30:02 +01:00
Sandro La Bruzzo 42ffbec061 fix stuff 2019-12-06 15:28:30 +01:00
Sandro La Bruzzo bd79999fb8 Improved deduplication 2019-12-05 14:14:25 +01:00
miconis 5676e625bd implementation of romansmatch and re-implementation of the getNumber function. New terms in the translation map and update of the configuration 2019-11-28 16:54:44 +01:00
miconis 493b385b5b addition of one term to the translation maps in the configurations 2019-11-27 15:48:37 +01:00
miconis c72f48fb33 minor change in the citymatch 2019-11-21 10:54:02 +01:00
miconis 40808200f0 the param map has been updated: now it accepts string parameters 2019-11-21 09:37:56 +01:00
miconis 79e62787cf jarowinklernormalizedname splitted in 3 different comparators: citymatch, keywordmatch and jarowinkler. Implementation of the TreeStatistic support functions 2019-11-20 10:45:00 +01:00
miconis 676e9c8e37 code cleaning and implementation of the TreeDedup + minor changes 2019-11-14 10:01:21 +01:00
miconis 5b3adb3e65 code cleaning, distribution of the classes in packages and implementation of the new configuration 2019-11-07 12:47:12 +01:00
miconis 3ff5be675b put the last modification of the master branch into the tree2. Addition of the configuration as parameter of the comparator. This is to allow the comparator to access it 2019-10-29 16:38:42 +01:00
miconis 8564fdd19c minor changes 2019-10-29 15:58:21 +01:00