Searching for Names with a 'sounds like' routine

Tony Marston - 24th March 2001

Have you ever had to search for a name in a database without knowing the exact spelling? Tricky, isn't it? This problem was solved many years ago with the creation of a 'sounds like' routine which takes a character string and converts it into something known as a SOUNDEX KEY. In essence it assigns a number to certain characters according to their sound, with similar-sounding characters sharing the same number. Thus a search on 'MARSTON' will also find 'MARSDON' and 'MARSDEN', because all three names produce the same key, 'M623'.

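The format of the Soundex Key is 'Xnnn' where:

   X   = the first character of the name, converted to uppercase
   nnn = three digits derived from the remaining characters, padded with zeros if fewer than three are produced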

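The rules for converting characters into numbers are as follows:

   B, F, P, V             = 1
   C, G, J, K, Q, S, X, Z = 2
   D, T                   = 3
   L                      = 4
   M, N                   = 5
   R                      = 6

All other characters, including A, E, I, O, U, H, W and Y, have no number and are ignored. A number is not added to the key if it is the same as the previous number.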

I had a version of this routine added to my COBOL development environment way back in 1989, but here it is converted for Uniface:

entry SOUNDEX
params
   string  pi_Name         : IN
   string  po_SoundexKey   : OUT
endparams
variables
   string  lv_LookUp, lv_Char
   numeric lv_Num, lv_PrevNum
endvariables

; establish list of letters and corresponding numbers
; (those letters not in the list do not have numbers)
lv_LookUp = "B=1;F=1;P=1;V=1;C=2;G=2;J=2;K=2;Q=2;S=2;X=2;Z=2;D=3;T=3;L=4;M=5;N=5;R=6"

uppercase pi_Name,pi_Name              ; must be uppercase

po_SoundexKey = pi_Name[1:1]           ; move first character
pi_Name = pi_Name[2]                   ; drop first character

while (pi_Name != "")                  ; until all chars have been examined
   lv_Char = pi_Name[1:1]              ; extract next character
   pi_Name = pi_Name[2]                ; drop it from input string
   ; convert this character (if it is in the list) into a number
   getitem/id lv_Num, lv_LookUp, lv_Char
   if ($status > 0)                    ; character found
       if (lv_Num != lv_PrevNum)       ; ignore if same as previous number
           po_SoundexKey = "%%po_SoundexKey%%lv_Num"   ; append to output
           lv_PrevNum    = lv_Num                      ; save number
       endif
       length po_SoundexKey
       if ($result = 4) break          ; stop here
   endif
endwhile

length po_SoundexKey                   ; $result is not set yet if no numbers were found
while ($result < 4)
   po_SoundexKey = "%%po_SoundexKey%%%0"   ; pad with zeros until length = 4
   length po_SoundexKey
endwhile

return(0)

end SOUNDEX
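
For readers who want to experiment without a Uniface installation, the routine can be transliterated into Python roughly as follows. This is only a sketch: the names CODES and soundex_key are illustrative and not part of the original, and the logic follows the routine as written above (a number is skipped whenever it equals the previous number), so it may give different keys from other Soundex variants.

```python
# Letter-to-number table copied from lv_LookUp in the Uniface routine.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex_key(name: str) -> str:
    """Build the four-character 'Xnnn' key the same way the routine does."""
    name = name.upper()                # must be uppercase
    key = name[:1]                     # move first character
    prev = None                        # last number appended to the key
    for char in name[1:]:
        digit = CODES.get(char)
        if digit is None:
            continue                   # character has no number, ignore it
        if digit != prev:              # ignore if same as previous number
            key += digit
            prev = digit
        if len(key) == 4:
            break                      # stop here
    return key.ljust(4, "0")           # pad with zeros until length = 4


for name in ("MARSTON", "MARSDON", "MARSDEN"):
    print(name, soundex_key(name))     # each line ends with M623
```
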

The best way to use this routine is to store the soundex key in the database alongside the name and make it an index. Not only is the key shorter than the name string, but it also allows the name to be stored in a mixture of upper and lower case, which is not usual for an indexed field. The search form must then be modified to convert the user's input into a soundex key so that the search is performed on this key and not on the original name.
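
The storage and search arrangement described above might look something like the following Python sketch, which uses an in-memory SQLite database as a stand-in for the real one. The table name person, the column soundex_key, the index name and the helper functions are all assumptions made for illustration, and soundex_key is a compact Python stand-in for the SOUNDEX routine, not the Uniface code itself.

```python
import sqlite3

# Letter-to-number table taken from lv_LookUp in the Uniface routine.
CODES = {letter: digit
         for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                                ("L", "4"), ("MN", "5"), ("R", "6"))
         for letter in letters}


def soundex_key(name):
    """Compact Python stand-in for the SOUNDEX routine."""
    name = name.upper()
    key, prev = name[:1], None
    for char in name[1:]:
        digit = CODES.get(char)
        if digit is not None and digit != prev:
            key += digit
            prev = digit
    return key[:4].ljust(4, "0")       # truncate or zero-pad to four characters


def build_demo_db(names):
    """Store each name in its original case together with its indexed key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT, soundex_key TEXT)")
    conn.execute("CREATE INDEX person_soundex ON person (soundex_key)")
    conn.executemany("INSERT INTO person VALUES (?, ?)",
                     [(n, soundex_key(n)) for n in names])
    return conn


def search(conn, user_input):
    """Convert the user's input to a key and search on the key, not the name."""
    rows = conn.execute(
        "SELECT name FROM person WHERE soundex_key = ? ORDER BY name",
        (soundex_key(user_input),))
    return [row[0] for row in rows]


conn = build_demo_db(["Marston", "Marsdon", "Marsden", "Smith"])
print(search(conn, "marston"))         # ['Marsden', 'Marsdon', 'Marston']
```

Because the comparison is made on the key, the case and exact spelling of the user's input do not matter, and the names themselves can stay in mixed case.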



mailto:tony@marston-home.demon.co.uk
mailto:TonyMarston@hotmail.com
http://www.tonymarston.net
