dnet-hadoop/dhp-pace-core/src/main/java/eu/dnetlib/pace/tree
Giambattista Bloisi e64c2854a3 Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface
JsonPath cache contention fixed by using a ConcurrentHashMap
Blacklist filtering performance improvement
Minor performance improvements when evaluating similarity
Sorting in clustered elements is deterministic (by ordering and identity field, instead of ordering field only)
2023-07-24 15:36:24 +02:00
..
support Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
AlwaysMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
AuthorsMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
CityMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
CosineSimilarity.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
DoiExactMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
DomainExactMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
ExactMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
ExactMatchIgnoreCase.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
InstanceTypeMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
JaroWinkler.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
JaroWinklerNormalizedName.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
JaroWinklerTitle.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
JsonListMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
KeywordMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
Level2JaroWinkler.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
Level2JaroWinklerTitle.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
Level2Levenstein.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
Levenstein.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
LevensteinTitle.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
LevensteinTitleIgnoreVersion.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
ListContainsMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
MustBeDifferent.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
NullDistanceAlgo.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
NumbersComparator.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
NumbersMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
RomansMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
SizeMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
SortedJaroWinkler.java New sources formatted by maven plugin 2023-07-06 10:28:53 +02:00
SortedLevel2JaroWinkler.java New sources formatted by maven plugin 2023-07-06 10:28:53 +02:00
StringContainsMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
StringListMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
SubStringLevenstein.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
TitleVersionMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
UrlMatcher.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00
YearMatch.java Refactor Dedup process to use Spark Dataframe API and intermediate representation with Row interface 2023-07-24 15:36:24 +02:00