2005. február 14., hétfő

Searching Strings by the way they sound


Problem/Question/Abstract:

Did you ever want to find a string - But were not sure of it's spelling? A typical case would be names (Micael/Maical/Michael/Maichael) all sound same but differ in spelling!

Answer:

Most of you may already be familiar with the magical "Soundex" function which is present in many Db environments ranging from FoxPro to Oracle/SQL Server. Few of you may wonder how it works! Well, here is the implementation of the Soundex function in Pascal based on an algorithm that I found in a computer magazine long time back. The original program worked in Turbo Pascal, but I have modified it for Delphi (The only change being use of ShortString instead of String!)

The function seems to return the same values as does SQL Server for the little tests that I conducted. However, as you will have already guessed, I provide you no gurantee that it will provide same values for all strings.

Please save the code below in a file called Soundx.pas. You will need to include the file in your source (Uses Soundx) and then you will have access to the Soundex() function.

For the example given in the Question/Problem/Abstract, Soundex returns the same value (M240) for each of Micael/Maical/Michael/Maichael

Wishing you all a "Sound" search (Ha!)

{******************************************************}
{* Description: Implementation of Soundex function    *}
{******************************************************}
{* Last Modified : 12-Nov-2000                        *}
{* Author        : Paramjeet Singh Reen               *}
{* eMail         : Paramjeet.Reen@EudoraMail.com      *}
{******************************************************}
{* This program is based on the algorithm that I had  *}
{* found in a magazine. I do not gurantee the fitness *}
{* of this program. Please use it at your own risk.   *}
{******************************************************}
{* Category :Freeware.                                *}
{******************************************************}

unit Soundx;

interface

type
  SoundexStr = string[4];

  //Returns the Soundex code for the specified string.
function Soundex(const InpStr: ShortString): SoundexStr;

implementation

const
  Alphs: array['A'..'Z'] of Char = ('0', '1', '2', '3', '0', '1', '2', '0', '0', '2',
    '2',
    '4', '5', '5', '0', '1', '2', '6', '2', '3', '0', '1',
    '0', '2', '0', '2');

function Soundex(const InpStr: ShortString): SoundexStr;
var
  vStr: ShortString;
  vCh1: Char;
  i: Word;

begin
  //Store the given InpStr in local variable in uppercase
  vStr := '';
  for i := 1 to Length(InpStr) do
    vStr := vStr + UpCase(InpStr[i]);

  //Replace all occurances of "PH" with "F"
  i := Pos('PH', vStr);
  while (i > 0) do
  begin
    Delete(vStr, i, 2);
    Insert('F', vStr, i);
    i := Pos('PH', vStr);
  end;

  //Replace all occurances of "CHR" with "CR"
  i := Pos('CHR', vStr);
  while (i > 0) do
  begin
    Delete(vStr, i, 3);
    Insert('CR', vStr, i);
    i := Pos('CHR', vStr);
  end;

  //Replace all occurances of "Z" with "S"
  for i := 1 to Length(vStr) do
    if (vStr[i] = 'Z') then
      vStr[i] := 'S';

  //Replace all occurances of "X" with "KS"
  i := Pos('X', vStr);
  while (i > 0) do
  begin
    Delete(vStr, i, 1);
    Insert('KS', vStr, i);
    i := Pos('X', vStr);
  end;

  //Remove all adjacent duplicates
  i := 2;
  while (i <= Length(vStr)) do
    if (vStr[i] = vStr[i - 1]) then
      Delete(vStr, i, 1)
    else
      Inc(i);

  //Starting from 2nd char, remove all chars mapped to '0' in Alphs table
  i := 2;
  while (i <= Length(vStr)) do
    if (Alphs[vStr[i]] = '0') then
      Delete(vStr, i, 1)
    else
      Inc(i);

  //Assemble Soundex string from Alphs table
  vCh1 := vStr[1];
  for i := 1 to Length(vStr) do
    vStr[i] := Alphs[vStr[i]];

  //Remove all adjacent duplicates from assembled Soundex string
  i := 2;
  while (i <= Length(vStr)) do
    if (vStr[i] = vStr[i - 1]) then
      Delete(vStr, i, 1)
    else
      Inc(i);

  //Final assembly of Soundex string
  vStr := vCh1 + Copy(vStr, 2, 255);
  for i := Length(vStr) to 3 do
    vStr := vStr + '0';
  Soundex := vStr;
end;

end.

Nincsenek megjegyzések:

Megjegyzés küldése