Soundex¤
This transformer plugin implements the Soundex phonetic algorithm for indexing names by their English sounds.
Description¤
This plugin is a linguistic transformer plugin. Specifically, it provides an implementation of the Soundex algorithm, also known as the Soundex Indexing System.
Soundex is the simplest and oldest of the phonetic algorithms. This plugin includes both the original Soundex and a variation thereof known as “Refined Soundex”.
Soundex¤
The (plain) Soundex algorithm encodes or indexes an English word such as a (sur)name into the pattern
initial letter + three digits, where the three digits represent given consonant groups. This mapping is the
following:
- b, f, p, v → 1
- c, g, j, k, q, s, x, z → 2
- d, t → 3
- l → 4
- m, n → 5
- r → 6
Notice that, in order to use the classical Soundex algorithm, the plugin parameter refined needs to be set to false.
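As an illustration of the encoding described in this section, here is a minimal Python sketch. It is not the plugin's own code. The function name `soundex` is hypothetical, and the handling of repeated codes (adjacent letters with the same digit are encoded once, vowels separate repeats, h and w do not) follows the common American Soundex convention, which this page does not spell out. The lowercase initial letter is chosen to match the examples on this page.

```python
# Consonant groups of classical Soundex, as listed in the mapping above.
SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def soundex(word: str) -> str:
    """Hypothetical helper: initial letter + three digits (sketch only)."""
    # Keep only the letters a-z; everything else is ignored.
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    # The code of the initial letter also suppresses an immediate repeat.
    previous = SOUNDEX_CODES.get(first, "")
    for ch in letters[1:]:
        code = SOUNDEX_CODES.get(ch, "")
        if code:
            if code != previous:
                digits.append(code)
            previous = code
        elif ch not in "hw":
            # Assumption: vowels (and y) separate repeated codes, h and w do not.
            previous = ""
    # Pad with zeros and cut to the initial letter plus three digits.
    return (first + "".join(digits) + "000")[:4]


print(soundex("robert"), soundex("rupert"))
```

Under these assumptions, names that differ only in their vowels or in consonants of the same group receive the same index.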
Refined Soundex¤
The Refined Soundex algorithm is an improvement of the Soundex algorithm that lifts two of its limitations: it does not drop
the vowels, and it does not restrict the output to a four-character encoding of the input. This variation of the Soundex
algorithm can be used by setting the plugin parameter refined to true (default). Its mapping is the following:
- a, e, h, i, o, u, w, y → 0
- b, p → 1
- f, v → 2
- c, k, s → 3
- g, j → 4
- q, x, z → 5
- d, t → 6
- l → 7
- m, n → 8
- r → 9
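The Refined Soundex mapping can be sketched in Python as follows. This is a hedged illustration, not the plugin's implementation. The function name `refined_soundex` is hypothetical, and the procedure (keep the initial letter, then append the code of every letter including the first, collapsing adjacent identical codes) is an assumption based on the commonly used form of the algorithm. It reproduces the Refined Soundex examples given on this page.

```python
# Letter groups of Refined Soundex, as listed in the mapping above.
REFINED_CODES = {
    **dict.fromkeys("aehiouwy", "0"),
    **dict.fromkeys("bp", "1"),
    **dict.fromkeys("fv", "2"),
    **dict.fromkeys("cks", "3"),
    **dict.fromkeys("gj", "4"),
    **dict.fromkeys("qxz", "5"),
    **dict.fromkeys("dt", "6"),
    "l": "7",
    **dict.fromkeys("mn", "8"),
    "r": "9",
}


def refined_soundex(word: str) -> str:
    """Hypothetical helper: initial letter + variable-length digit code."""
    # Keep only the letters that have a code (a-z); ignore everything else.
    letters = [ch for ch in word.lower() if ch in REFINED_CODES]
    if not letters:
        return ""
    encoded = [letters[0]]
    previous = None
    # Every letter is encoded, including the first one and the vowels.
    for ch in letters:
        code = REFINED_CODES[ch]
        if code != previous:  # collapse runs of the same code
            encoded.append(code)
        previous = code
    return "".join(encoded)


print(refined_soundex("braz"), refined_soundex("lambert"))
```

Because vowels are encoded as 0 and the output is not truncated, longer names keep more of their structure than with the classical four-character code.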
Examples¤
Soundex¤
We can get an idea of the output of the Soundex algorithm using an online Soundex Converter such as https://www.mainegenealogy.net/soundex_converter.asp.
- robert and rupert lead to the same Soundex index: r163.
- euler leads to e460, gauss is g200 and hilbert corresponds to h416.
Refined Soundex¤
- braz and broz lead to the same Refined Soundex index: b1905.
- caren, carren, coram, corran, curreen and curwen are all encoded with c30908.
- hairs, hark, hars, hayers, heers and hiers are all mapped to h093.
- All sorts of variations of lambard, such as lambart, lambert, lambird or lampaert, lead to l7081096.
Related plugins¤
Other phonetic algorithms usually associated with Soundex are NYSIIS and Metaphone, which are variations of or
improvements on it. The corresponding linguistic transformer plugins are named accordingly.
Parameter¤
Refined¤
If true (default), the Refined Soundex algorithm is used. If false, the classical Soundex algorithm is used.
- ID: refined
- Datatype: boolean
- Default Value: true
Advanced Parameter¤
None