Soundex vs. Levenshtein


In fact, these two are not really alternatives. Soundex is a scheme for matching words that sound alike, i.e. that are similar in their phonemes, whereas Levenshtein is a distance measure between two strings.

Applying Soundex takes two steps: encoding each string, and then comparing the resulting codes for an exact match.
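
The two steps can be sketched in Python as follows. This is only a minimal sketch that assumes the common American Soundex variant (one letter plus three digits); the names `soundex` and `soundex_match` are illustrative and not taken from any particular library.

```python
def soundex(word: str) -> str:
    """Encode a word with American Soundex (assumed variant): letter + 3 digits."""
    # Consonant groups that share a digit; vowels, Y, H and W have no digit.
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit

    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""

    result = letters[0]                 # the first letter is kept as-is
    prev = codes.get(letters[0], "")    # digit of the previous coded letter
    for ch in letters[1:]:
        if ch in "HW":
            continue                    # H and W do not separate equal digits
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        prev = code                     # a vowel resets prev to ""
    return (result + "000")[:4]         # pad or truncate to four characters


def soundex_match(a: str, b: str) -> bool:
    """Step 2: two words match only if their codes are exactly equal."""
    return soundex(a) == soundex(b)
```

With this sketch, `soundex("Robert")` and `soundex("Rupert")` both give `R163`, so `soundex_match("Robert", "Rupert")` is true, while any word with a different code does not match at all.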


This method partitions all strings into equivalence classes, i.e. into disjoint sets of words. A search therefore finds only words in the same class as the query, and never a word in a "close" class. To be exact: Soundex has no concept of "proximity".


By contrast, the Levenshtein algorithm makes it possible to estimate how similar two words are.
Of course, the computational cost is higher than with the Soundex method, but a hash table is generated to speed up the calculation.
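
For illustration, the distance itself can be computed with the standard dynamic-programming approach. The sketch below is an assumption about a plain textbook implementation with unit costs for insertion, deletion and substitution; it does not include the hash-table acceleration mentioned above, and the name `levenshtein` is illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a                     # keep the shorter string as the row

    # previous[j] = distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]                   # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,        # delete ca
                current[j - 1] + 1,     # insert cb
                previous[j - 1] + cost  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]
```

Unlike a Soundex code comparison, the result is graded: `levenshtein("kitten", "sitting")` is 3, and identical words have distance 0. Each comparison costs time proportional to the product of the two word lengths, which is why comparing a query against a large word list is expensive without further acceleration.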

Today only the company Exorbyte has managed a high-performance implementation of the Levenshtein algorithm. Exorbyte's software can calculate the Levenshtein distance against millions of words within milliseconds. This even makes interactive search applications based on Ajax possible.