Clustering functions
TODO
NgramPairs
It takes one ngram from the start of each word in the input string and produces a list of pairs, each formed by concatenating the ngrams of two consecutive words.
Example:
Input string: “Search for the Standard Model Higgs Boson”
Parameters: ngram length = 3
List of ngrams: “sea”, “sta”, “mod”, “hig”
Ngram pairs: “seasta”, “stamod”, “modhig”
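
The example above can be sketched in a few lines of Python. This is a minimal illustration, not the actual implementation: the function name `ngram_pairs` and its parameters are hypothetical. The example does not say why “for”, “the” and “Boson” contribute no ngram, so the sketch assumes a small stopword list and a cap on the number of ngrams (`max_ngrams`); both are assumptions chosen to reproduce the listed output.

```python
def ngram_pairs(text, ngram_len=3, max_ngrams=4,
                stopwords=frozenset({"for", "the"})):
    """Concatenate the leading ngrams of consecutive words.

    Assumptions (not confirmed by the source): stopwords are dropped,
    and at most `max_ngrams` ngrams are kept before pairing.
    """
    # Lowercase, split on whitespace, and drop the assumed stopwords.
    words = [w for w in text.lower().split() if w not in stopwords]
    # Take the first `ngram_len` characters of each sufficiently long word.
    ngrams = [w[:ngram_len] for w in words if len(w) >= ngram_len]
    ngrams = ngrams[:max_ngrams]
    # Pair each ngram with the one that follows it.
    return [left + right for left, right in zip(ngrams, ngrams[1:])]


print(ngram_pairs("Search for the Standard Model Higgs Boson"))
# ['seasta', 'stamod', 'modhig']
```

Two records whose titles share any of these pairs would land in the same cluster, which is why short, order-preserving keys like these are useful for blocking.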
SuffixPrefix
It produces ngram pairs in a particular way: it concatenates the suffix of a word with the prefix of the next word in the input string.
Example:
Input string: “Search for the Standard Model Higgs Boson”
Parameters: suffix and prefix length = 3
Output list: “ardmod” (suffix of the word “Standard” + prefix of the word “Model”), “rchsta” (suffix of the word “Search” + prefix of the word “Standard”)
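
A minimal Python sketch of the same idea follows. The function name `suffix_prefix` and its parameters are hypothetical, and the sketch is not the actual implementation. It assumes the same stopword list as a way to skip “for” and “the”, and a cap of two pairs (`max_pairs`) to match the two outputs listed above; neither assumption is stated in the source. The sketch returns pairs in input order, which differs from the order in which the example lists them.

```python
def suffix_prefix(text, length=3, max_pairs=2,
                  stopwords=frozenset({"for", "the"})):
    """Concatenate the suffix of each word with the prefix of the next word.

    Assumptions (not confirmed by the source): stopwords are dropped,
    and at most `max_pairs` pairs are returned.
    """
    # Keep only words long enough to supply a full suffix and prefix.
    words = [w for w in text.lower().split()
             if w not in stopwords and len(w) >= length]
    # Last `length` characters of a word + first `length` of its successor.
    pairs = [left[-length:] + right[:length]
             for left, right in zip(words, words[1:])]
    return pairs[:max_pairs]


print(suffix_prefix("Search for the Standard Model Higgs Boson"))
# ['rchsta', 'ardmod']
```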