Distance between two strings
An answer to this question on Stack Overflow.
Question
I don’t believe the standard library provides anything to compute the distance between two strings and I can’t seem to find anything in Boost StringAlgo. So, is there any other library I could use?
I am not too picky about the algorithm. Jaro-Winkler is fine, Levenshtein too, and I am open to suggestions. I just don’t want to code something that someone has already coded.
Answer
You could try SimString.
SimString is a simple library for fast approximate string retrieval. Approximate string retrieval finds strings in a database whose similarity with a query string is no smaller than a threshold. Finding not only identical but similar strings, approximate string retrieval has various applications including spelling correction, flexible dictionary matching, duplicate detection, and record linkage.
SimString supports the cosine, Jaccard, Dice, and overlap coefficients as similarity measures. SimString uses letter n-grams as features for computing string similarity.
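To make the n-gram idea concrete, here is a minimal standalone C++ sketch of the Jaccard and Dice coefficients over sets of character n-grams. It does not use SimString's API. The function names (ngrams, shared_count, jaccard, dice), the default n = 2, and the choice to treat two strings with no n-grams as identical are all assumptions made for illustration, and SimString's own feature extraction may differ (for example in how it pads strings).

```cpp
#include <cstddef>
#include <set>
#include <string>

// Collect the distinct character n-grams of s (no padding is applied).
// Hypothetical helper, not part of SimString.
std::set<std::string> ngrams(const std::string& s, std::size_t n) {
    std::set<std::string> grams;
    if (n == 0 || s.size() < n) return grams;
    for (std::size_t i = 0; i + n <= s.size(); ++i)
        grams.insert(s.substr(i, n));
    return grams;
}

// Count the n-grams that appear in both sets.
std::size_t shared_count(const std::set<std::string>& a,
                         const std::set<std::string>& b) {
    std::size_t count = 0;
    for (const auto& g : a) count += b.count(g);
    return count;
}

// Jaccard coefficient: |A ∩ B| / |A ∪ B|.
double jaccard(const std::string& x, const std::string& y, std::size_t n = 2) {
    const auto a = ngrams(x, n);
    const auto b = ngrams(y, n);
    const std::size_t inter = shared_count(a, b);
    const std::size_t uni = a.size() + b.size() - inter;
    // Assumption: two strings with no n-grams at all count as identical.
    return uni == 0 ? 1.0 : static_cast<double>(inter) / static_cast<double>(uni);
}

// Dice coefficient: 2 |A ∩ B| / (|A| + |B|).
double dice(const std::string& x, const std::string& y, std::size_t n = 2) {
    const auto a = ngrams(x, n);
    const auto b = ngrams(y, n);
    const std::size_t total = a.size() + b.size();
    return total == 0 ? 1.0
                      : 2.0 * static_cast<double>(shared_count(a, b)) /
                            static_cast<double>(total);
}
```

For example, "night" and "nacht" share only the bigram "ht" out of four bigrams each, so jaccard("night", "nacht") is 1/7 and dice("night", "nacht") is 0.25. Both functions work on bytes, so multi-byte UTF-8 characters would be split across n-grams.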
Or the SimMetrics library.
SimMetrics is a similarity metric library, with measures ranging from edit distances (Levenshtein, Gotoh, Jaro, etc.) to other metrics (e.g. Soundex, Chapman). The work was provided by the University of Sheffield (UK) and funded by AKT, an IRC sponsored by EPSRC, grant number GR/N15764/01.
Or the libdistance library, which has implementations of the Levenshtein, Damerau, Needleman-Wunsch, Hamming, Bloom filter, Jaccard, and Minkowski distances.
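If pulling in a library turns out to be more than you need, Levenshtein distance is short enough to write directly. The following is a minimal sketch of the standard two-row dynamic-programming formulation with unit costs for insertion, deletion, and substitution. It is not taken from any of the libraries above, and the function name levenshtein is just an illustrative choice.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Levenshtein distance with unit costs, keeping only two rows of the
// dynamic-programming table: O(|a| * |b|) time, O(|b|) memory.
std::size_t levenshtein(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);

    // Distance from the empty prefix of a to each prefix of b: 0, 1, 2, ...
    std::iota(prev.begin(), prev.end(), std::size_t{0});

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;  // deleting the first i characters of a
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,          // deletion
                                curr[j - 1] + 1,      // insertion
                                prev[j - 1] + cost}); // substitution or match
        }
        std::swap(prev, curr);
    }
    // After the final swap, prev holds the last completed row.
    return prev[b.size()];
}
```

With this sketch, levenshtein("kitten", "sitting") returns 3. It compares bytes rather than Unicode code points, so for UTF-8 text you would want to decode to code points first.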
Phonetic algorithms may also be of interest.