Thursday, April 19, 2007

Parsing the Words in a Sentence


Problem/Question/Abstract:

How can I parse the words in a sentence?

Answer:

This week's tip is some code that accomplishes something very simple: parsing the words of a sentence. I've been hanging out in the newsgroups and in the CompuServe forums and ran across several questions about the best way to do this, so I came up with a simple procedure to do it. I've seen a lot of people use arrays and such, but the problem with static arrays is that they're of fixed size (though in a previous tip, I showed how to make runtime-resizable arrays). A better way to store the words of a string is to use a TStringList object.

A TStringList is essentially an array of strings (or objects) that can be resized dynamically at runtime. Since memory is allocated and deallocated in the background, you don't have to worry about those operations when using one. All you have to worry about is adding or deleting elements. Each item in a TStringList is accessed through its Strings property, much in the way you reference an array element. Let's say you want to know the value of the fifth element in a string list. You'd write something like the following:

x := MyStringList.Strings[4];
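
For context, here is a minimal sketch of that access in a small console program. The helper MakeSampleList and the sample words are made up for illustration; only Create, Add, Strings, Count, Delete and Free come from TStringList itself.

```pascal
program StringListBasics;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;

{ Hypothetical helper: builds a list with five sample words.
  The caller owns the returned list and must free it. }
function MakeSampleList: TStringList;
begin
  Result := TStringList.Create;
  Result.Add('alpha');
  Result.Add('beta');
  Result.Add('gamma');
  Result.Add('delta');
  Result.Add('epsilon');
end;

var
  MyStringList: TStringList;
  x: string;
begin
  MyStringList := MakeSampleList;
  try
    x := MyStringList.Strings[4];   { fifth element, index 4 }
    WriteLn(x);                     { prints epsilon }
    WriteLn(MyStringList.Count);    { prints 5 }

    MyStringList.Delete(0);         { remove the first element }
    WriteLn(MyStringList[0]);       { prints beta; [] is shorthand for Strings[] }
  finally
    MyStringList.Free;              { release the list when done }
  end;
end.
```

The list grows with each Add and shrinks with each Delete, so no size has to be declared up front.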

I forgot to mention that TStringLists are zero-based, so the first element in the TStringList is always numbered '0.' So how can you use it to parse a sentence? Well, let's look at the code below:

function FillList(sentnc: string; {Input string}
  var sList: TStringList; {String List to add values to}
  clearList: Boolean) {Clear list before adding?}
  : Boolean; {Return value}
var
  str, wrd: string;
begin

  {Initialize vars}
  Result := True;
  str := sentnc;
  wrd := '';

  {Check to see if the string passed is blank}
  if (Length(sentnc) = 0) then
  begin
    MessageDlg('Passed an empty string', mtError, [mbOk], 0);
    Result := False;
    Exit;
  end;

  {Clear the list if wanted}
  if clearList then
    sList.Clear;

  while (Pos(' ', str) > 0) do {Do this while you find spaces in the sentence}
  begin
    wrd := Copy(str, 1, Pos(' ', str) - 1); {Get the word from the string}
    if wrd <> '' then {Skip empty words caused by consecutive spaces}
      sList.Add(wrd); {Add the word to the TStringList}
    Delete(str, 1, Pos(' ', str)); {Cut the first word and its space off the sentence}
  end;

  if (Length(str) > 0) then {This is important, because you never}
    sList.Add(str); {know if there's anything left in the sentence.}
end;

The function above takes a string input called sentnc and uses the Pos and Copy functions to successively cut off the first word of the phrase and load it into a string list. You'll notice that I've added a couple of tests: 1) to test whether the input is blank; 2) to see if the program should empty the list before adding items to it. You'll also notice that I have the TStringList object passed by reference as a formal parameter of the function, so that any string list can be passed into the function to accept a phrase. (Object variables in Delphi are already references, so the var modifier is not strictly required here.) Besides the extra checking, the real workhorse of the function is the while loop. Follow the comments to the right of the code to see what's going on.
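
As a point of comparison, here is a sketch of a variant that scans the string character by character instead of cutting it up with Pos and Copy. It is not the function from this tip: SplitWords is a hypothetical helper, it shows no dialog, it takes any TStrings, and it returns the number of words it added. It assumes that only the space character separates words.

```pascal
program SplitWordsDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;

{ Hypothetical helper, not part of the original tip: appends every
  space-separated word of Sentence to List and returns how many were added. }
function SplitWords(const Sentence: string; List: TStrings): Integer;
var
  StartPos, I: Integer;
begin
  Result := 0;
  StartPos := 1;
  for I := 1 to Length(Sentence) + 1 do
    { The position just past the end acts as a final separator }
    if (I > Length(Sentence)) or (Sentence[I] = ' ') then
    begin
      if I > StartPos then { Non-empty run of characters: a word }
      begin
        List.Add(Copy(Sentence, StartPos, I - StartPos));
        Inc(Result);
      end;
      StartPos := I + 1; { Next word starts after this separator }
    end;
end;

var
  Words: TStringList;
  N: Integer;
begin
  Words := TStringList.Create;
  try
    { Double and trailing spaces do not produce empty entries }
    WriteLn(SplitWords('The  quick brown fox ', Words)); { prints 4 }
    for N := 0 to Words.Count - 1 do
      WriteLn(Words[N]);
  finally
    Words.Free;
  end;
end.
```

Because the input string is never modified, this version makes a single pass over the sentence. The rest of this tip continues with FillList itself.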

To employ the FillList function, you'd have to create a TStringList object and then call the function. Look at the code below:

procedure TForm1.FormCreate(Sender: TObject);
begin
  strList := TStringList.Create;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  I: Integer;
begin
  {Fill the list}
  if FillList(Edit1.Text, strList, True) then
  begin
    {Empty the list box before refilling it}
    ListBox1.Items.Clear;

    for I := 0 to strList.Count - 1 do
      ListBox1.Items.Add(strList.Strings[I]);
  end;
end;

procedure TForm1.FormClose(Sender: TObject; var Action: TCloseAction);
begin
  strList.Free;
end;

The code above was taken from a form I built to test the FillList function. In FormCreate, I create and initialize the TStringList object. In a pushbutton click event, I read the contents of a TEdit and then call the function. The resulting words are then copied into a list box that I dropped on the form. FormClose destroys the TStringList. Granted, this is a rather simple way of employing the string list, but there are numerous ways in which to use this neat little object.
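
As one small illustration of those other uses, the sketch below copies a list with Assign, sorts the copy, and reads it back with IndexOf and CommaText. SortedCopy is a hypothetical helper written for this example and the words are sample data; Assign, Sort, IndexOf and CommaText are standard TStringList members.

```pascal
program StringListUses;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;

{ Hypothetical helper: returns a sorted copy of Source.
  The caller owns the returned list and must free it. }
function SortedCopy(Source: TStrings): TStringList;
begin
  Result := TStringList.Create;
  try
    Result.Assign(Source); { Copy all strings in one call }
    Result.Sort;           { Sort the copy, leaving Source untouched }
  except
    Result.Free;           { Do not leak the list if something fails }
    raise;
  end;
end;

var
  Words, Sorted: TStringList;
begin
  Words := TStringList.Create;
  try
    Words.Add('quick');
    Words.Add('brown');
    Words.Add('fox');

    Sorted := SortedCopy(Words);
    try
      WriteLn(Sorted.CommaText);      { prints brown,fox,quick }
      WriteLn(Sorted.IndexOf('fox')); { prints 1 }
      WriteLn(Words[0]);              { prints quick; original order kept }
    finally
      Sorted.Free;
    end;
  finally
    Words.Free;
  end;
end.
```

Assign works between any two TStrings descendants, so the same call could stand in for the for loop that copies strList into ListBox1.Items.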
