Delphi Knowledge Base: július 2004

2004. július 30., péntek

WordBasic via OLE

Problem/Question/Abstract:

WordBasic via OLE

Answer:

Try the following:

uses OleAuto;

var
  MSWord: Variant;

begin
  MsWord := CreateOleObject('Word.Basic');
  MsWord.FileNewDefault;
  MsWord.TogglePortrait;
end;

2004. július 29., csütörtök

Waiting for Threads

Problem/Question/Abstract:

I've created an application that spawns a couple of threads at once when a user presses a button. The threads execute fairly quickly, but I don't want the user to move on to the next task until the threads are absolutely finished. I could move the thread code back into the main unit, but that defeats the whole purpose of unlocking my user interface while the processes take place. Is there a good way to do this?

Answer:

In past articles regarding threads, I've discussed Critical Sections and mutexes as ways of protecting shared resources, and having them wait until a resource is freed. But how do you wait for a thread or threads to finish from the user interface?

There are a couple of ways to approach this. First, you create a global Boolean variable and use it as a flag. You set its value when the thread starts, then set its value when the thread ends. Meanwhile, you have can set up a looping mechanism in the main unit to periodically check the completion status of the thread and use the Application.ProcessMessages call to keep your application alive. Here's some code:

unit wthread;

interface

uses
  Classes, Windows;

type
  TTestThr = class(TThread)
  private
    { Private declarations }
  protected
    procedure Execute; override;
  end;

implementation
uses Main;
{ TTestThr }

procedure TTestThr.Execute;
var
  I: Integer;
begin
  for I := 0 to 39 do
    Sleep(500);
  ImDone := True;
end;

end.

The code above is the thread code. All it does is perform a for loop and wait for half a second in between increments. Pretty basic stuff. Here's the OnClick method for a button on the main form that starts the thread:

procedure TForm1.Button1Click(Sender: TObject);
begin
  ImDone := False;
  Label1.Caption := 'Waiting';
  TTestThr.Create(False);
  while not ImDone do
    Application.ProcessMessages;
  Label1.Caption := 'Not Waiting';
end;

The var ImDone is a Boolean flag that gets set as soon as the button is pressed. As you can see, the while loop in the OnClick handler just performs a yield with Application.ProcessMessages to process user input. Pretty simple right? Okay, now, let me tell you one very important thing:

Disregard everything that I just wrote! It's the wrong way to do it!

Why? For one thing, it's totally inefficient. While Application.ProcessMessages allows a program to continue to receive and process messages it receives, it still eats up CPU time because all it's doing is yielding temporarily, then going back to its original state, meaning it constantly activates and deactives. Yikes! Not good. With a thread like the TTestThr that I just wrote, you won't notice much of a difference in performance by using this methodology. But that doesn't mean that it's right. Furthermore, if you're a real stickler for good structured programming methodologies, you should avoid global variables like the plague! But that's not even the worst of it. There's a real big catch here...

What do you do in the case of having to wait for multiple threads to finish? In the model described above, you would have to create a Boolean flag for EVERY thread that you create! Now that's bad.

The best way to handle this condition is to create what might be called a wait thread - an intermediary thread that will perform the waiting. In combination with a couple of Windows API calls, you can achieve a very efficient thread waiting process at little cost to CPU time and system resources.

Those couple of Windows API functions are called WaitForSingleObject and WaitForMultipleObjects. These two functions are used to wait for objects to enter a signaled state before they return. Okay, what's signaled mean anyway? This one took me awhile to understand, but for you, I'll be as clear as possible so you can understand it much quicker than I did. Essentially, when an object is created in Windows, it is given a system assigned state property of sorts. While it is active or in use, it is said to be non-signaled. When it is available, it is said to be signaled. I know, it seems kind of backwards. But that's it in a nutshell. Anyway, with respect to the functions above, they are designed to wait for an object or objects to enter a signaled state; that is, wait for the objects to become available to the system again.

The advantage of these two functions with respect to waiting for threads to finish is that they enter into an efficient sleep state that consumes very little CPU time. Contrast that with the Application.ProcessMessages methodology which has the potential for consuming CPU cycles, and you know why they'd be the choice make if you're going to wait for threads.

WaitForSingleObject takes two parameters, a THandle and a timeout value in milliseconds. If you're going to wait for a single thread, all you need to do is pass the thread's handle and the time amount of time to wait for the object and it'll do the waiting for you. I should mention that there's a special system constant called INFINITE that will make the function wait indefinitely. Typically, you'll use this constant as opposed to setting a specific time. But that also depends on your process. Here's an example:

WaitForSingleObject(MyThread.Handle, INFINITE);

On the other hand, WaitForMultipleObjects is a bit more complex. It takes for arguments for parameters. They're described in the table below (don't worry about them too much right now, we'll discuss them below):

Argument
Type
Description
cObject
DWORD
Number of handles in the object handle array
lphObjects
Pointer
Address of object handle array
fWaitAll
Boolean
True indicates that the function waits until all objects are signaled
dwTimeOut
DWORD
Number of milliseconds to wait (can be INFINITE)

What's this about a handle array? Well, in order for the function to track the states of all objects to be waited for, their handles have to be in an array. This is simple to create: just declare an array of THandle and set each element's value to a thread's handle. No big deal. But I think it's probably best to put all this perspective with some code that we can discuss. Here's the entire unit code that contains both the wait thread's declaration and the TTestThr declaration:

unit wthread;

interface

uses
  Classes, Windows;

//"Worker thread declaration"
type
  TTestThr = class(TThread)
  private
    { Private declarations }
  protected
    procedure Execute; override;
  end;

  //Wait thread declaration
type
  TWaitThr = class(TThread)
  private
    procedure UpdateLabel; //this just sets the label's caption
  protected
    procedure Execute; override;
  end;

implementation
uses Main;
{ TTestThr }

procedure TTestThr.Execute;
var
  I: Integer;
begin
  FreeOnTerminate := True;
  for I := 0 to 39 do
    Sleep(500);
  ImDone := True;
end;

procedure TWaitThr.UpdateLabel;
begin
  Form1.Label1.Caption := 'Not Waiting';
end;

procedure TWaitThr.Execute;
var
  hndlArr: array[0..4] of THandle;
  thrArr: array[0..4] of TTestThr;
  I: Integer;
begin
  FreeOnTerminate := True;
  for I := 0 to 4 do
  begin
    thrArr[I] := TTestThr.Create(False);
    hndlArr[I] := thrArr[I].Handle;
    Sleep(1000); //stagger creation of the threads
  end;
  WaitForMultipleObjects(5, @thrArr, True, INFINITE);
  Synchronize(UpdateLabel);
end;

end.

I put the Execute method for the TWaitThr in boldface so you can focus in on it. Notice that to fill in the hndlArr elements, I use a simple for loop. To make matters even simpler, I just declared an array of TTestThr to I could create the threads and immediately assign their respective handles to the handle array. The most important line in the code is:

WaitForMultipleObjects(5, @thrArr, True, INFINITE);

which is the call to WaitForMultipleObjects. Notice too that I pass the handle array's address to the function, as that is what the function calls for. Once the call is made, it won't allow the Execute method to continue until all the threads enter a signaled state. Once it's done waiting, UpdateLabel is called to change the text of a label on my main form. Here's the entire code listing for the main form. All it has on it are a TLabel and two TButtons.

unit main;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
  StdCtrls;

type
  TForm1 = class(TForm)
    Label1: TLabel;
    Button1: TButton;
    Button2: TButton;
    procedure Button1Click(Sender: TObject);
    procedure Button2Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;
  ImDone: Boolean;

implementation
uses wthread;
{$R *.DFM}

procedure TForm1.Button1Click(Sender: TObject);
begin
  ImDone := False;
  Label1.Caption := 'Waiting';
  TTestThr.Create(False);
  while not ImDone do
    Application.ProcessMessages;
  Label1.Caption := 'Not Waiting';
end;

procedure TForm1.Button2Click(Sender: TObject);
begin
  Label1.Caption := 'Waiting';
  TWaitThr.Create(False);
end;

end.

So why go to all this trouble? Why not move the WaitForMultipleObjects code to the main form? The reason for this is simple. Since both WaitForSingleObject and WaitForMultipleObjects don't return until the object(s) they're waiting for enter a signaled state, the main form would essentially become locked and unavailable until the function returns. Kind of defeats the whole purpose of writing multi- threaded programs don't you think?

So here's another thing that you can add to your arsenal of multi-threading techniques...

2004. július 28., szerda

Using the LocalSQL SubString() Function

Problem/Question/Abstract:

Is there a SQL equivalent to Delphi's Pos() function?

Answer:

You bet. It's called the SUBSTRING function. Like the Pos function, the SUBSTRING function in SQL will return a substring of a string based upon a range of characters you specify. It's a handy function to have because not only can it be used within the WHERE portion of a SQL statement to search a column based on a substring, it can also be used in the SELECT portion of the SQL statement to return a substring of a column.

Here's syntax for the SUBSTRING function:

SUBSTRING( FROM Starting Position FOR

Substring Length)

Here are definitions of the various values:

FieldName
This is the name of the column in your table that you will apply the SUBSTRING function to
Starting Position
This is the starting position of the Column's field value. For instance, if you want to start at the second character, the value here would be '2.'
SubString Length
This is the length of the Substring itself. It can be any value greater than 0.

To see how SUBSTRING can be employed in the SELECT and WHERE clauses, let's look at a couple of examples. First, let's see how we can use the SUBSTRING function to search a column based on a substring of that column.

Let's say I want to search a customer database for all names beginning with 'DEL' in the LastName field of my database. Here's some simple SQL that will accomplish that:

SELECT * FROM "CUSTOMER"

WHERE SUBSTRING(LastName FROM 1 FOR 3) = 'DEL'

This SQL statement will return all rows that start with 'DEL.'

The SUBSTRING Function's Secret Power

Now here's where I think the SUBSTRING function really shines. I have found that in many cases, I'm not interested in extracting the entire value of a particular field. For example, I work in health care analysis (specifically drug benefits). In our claims database, drugs are assigned specific identification numbers in string format, called an NDC. The identifiers are interesting in that they are hierarchical in nature. For example, the identifier is an 11-digit string. The first two characters of the string represent the drug manufacturer; the the first nine digits represent the manufacturer, brand, and drug classification. The full string gives all the information from the previous examples, plus the strength and dosage administered.

When I'm called upon to perform drug analysis, my users typically aren't interested in the strength and dosage of the drugs, so they request that I only include the nine-digit drug classification level in my analysis. For instance, they may request the costs associated with all drug classifications. This is easily accomplished with the following SQL statement:

SELECT D."Drug Cost", D."Amount Due", SUBSTRING(NDC FROM 1 FOR 9) AS NDC9DIGIT

FROM ":Customer:CLAIMS.DB" D

WHERE (D."Fill Date" >= '1/1/96')

Note: We're assuming the destination table to be :PRIV:Answer.db

Since the query above will create duplicate values in the NDC column and we want distinct NDCs reported, we do one more query to summarize the cost and amount due columns and aggregate them on the distinct NDCs.

SELECT DISTINCT NDC9DIGIT,

SUM(D."Drug Cost") AS D."Drug Cost",

SUM(D."Amount Due") AS D."Amount Due"

FROM ":PRIV:Answer.DB"

ORDER BY NDC9DIGIT

This query's answer table will now have the cost and amount due values rolled up to the distinct NDCs.

SUBSTRING can add a lot to your application by providing a means to look at your data in a lot of different ways. Especially where the column values you are applying SUBSTRING to are hierarchical or categorical in nature, SUBSTRING will prove to be an indispensable function.

One thing to note: Many server databases don't support the SUBSTRING function. In most cases, you have to use the LIKE operator to simulate SUBSTRING's functionality. In other cases, they have their own proprietary functions to handle substrings. You should check your server databases's documentation to see what the equivalent would be.

2004. július 27., kedd

Using SQL2 Built-in Functions

Problem/Question/Abstract:

How do I go about using SQL2 built-in functions

Answer:

A problem that's rampant with many software tools today, even one as complete and comprehensive as Delphi, is that documentation on many important subjects is either incomplete, difficult to locate or, altogether missing.

SQL2 built-in functions fall into the final category. But while Delphi lacks the documentation of these topics, a lot of books are missing the topics as well! I must have pored over 10 SQL reference books before I found anything discussing the built-in functions in any detail, and still what I found was incomplete. But I don't blame any specific party for the lack of documentation on these subjects. And from my estimation, there's a good reason why you won't find much material on them, and it has a lot to do with how standards are established.

Establishing standards in any industry is an evolutionary process. As soon as a standard is put in place, some company comes up with ways to extend and enhance the standard. More companies join the fray, and then a new standard is established that incorporates the most commonly shared features of the various companies' products into the standard. The process then repeats itself.

Look at HTML! Soon after HTML 2.0 was introduced by the W3 Consortium, Netscape came along and added a bunch of proprietary features such as tables and backgrounds, which are now part of HTML 3.0. And while you can now find pretty good documentation on the standard tag set for HTML 3.0, for a while decent documentation was pretty scarce. Now the W3 Consortium is furiously working on Cascading Style Sheets to accomodate the various disparate document publishing techniques employed by the different browser vendors. Here we go again...

Going back to the subject of SQL2 built-in functions, I believe they have followed a path similar to HTML. SQL89 (SQL1) was devoid of built-in functions, so database vendors created proprietary functions to extend SQL89's lack of them. And believe me, there are a lot. For instance, Oracle has a bunch of very useful built-in functions for converting and manipulating various values such as the TO_CHAR() function, which takes a date type value and a format specification and outputs a string. With respect to SQL2, ANSI collected the most useful built-in functions from the various vendors and created a standard built-in function set with standard syntax. I will not discuss all of them here. However, what I will include are the functions that I have found most useful in my own applications.

Before I go into detailed discussions of the functions, Table 1 lists the functions and their operations:

Function Name
Parameters
Description
CAST
(value AS data_type)
Cast a value as another data type (i.e., convert a date to a string value)
CURRENT_DATE
n/a
Returns the current system date
LOWER
(string)
Converts string to all lower case
UPPER
(string)
Converts string to all upper case
SUBSTRING
(value FROM n FOR len)
Returns a portion of a string beginning at n-th character, for a length of len
TRIM
(BOTH char FROM string)
Trims char from both ends of a string (could be a space)
TRIM
(LEADING char FROM string)
Trims leading char from string
TRIM
(TRAILING char FROM string)
Trims trailing char from string

Table 1 -- List of common SQL2 Built-in Functions

CAST

Cast is a function I've found highly useful, especially when doing column concatenations in SQL. For instance, in one of my programs I created a report table for which I would be using Crystal Reports © as the reporting tool. But rather than create indexes in code, I decided to concatenate the fields that would make a record unique and use Crystal to sort the records by the resultant field during print. Here's some example code:

sqlEpi := TQuery.Create(Application);
with sqlEpi do
begin
  SQL.Add('SELECT DISTINCT D.*, (((((CAST(D."Cluster" AS VARCHAR(5)) || ');
  SQL.Add('CAST(D."FDate" AS VARCHAR(8))) || CPT4) || ICDX1) || ');
  SQL.Add('ICDX2) || ProvID) AS ClustID,');
  SQL.Add('(CAST(D."Cluster" AS VARCHAR(5)) || ClustProv) As ClustProvID');
  SQL.Add('FROM ":PRIVATE:EPIINIT7" D');
  try
    Open;
  except
    Free;
    Abort;
  end;
end;

As you can see, I used cast on the Cluster and FDate columns to convert them from a numeric and date respectively, to VARCHAR's. Notice that there's no conversion to a STRING type. For strings, you either use CHAR(n) or VARCHAR(n), where n is the size of the output string. I normally use VARCHAR(n) because I'm sometimes I'm not sure exactly how long my string will be, but I usually know the longest length.

CURRENT_DATE, LOWER, and UPPER

These three are all pretty self-explanatory. CURRENT_DATE will get you the current date returned as a Date value. LOWER and UPPER are simple case conversion functions.

SUBSTRING

I'm probably asked how to use SUBSTRING more than any other SQL2 function. Its utility is obvious. But it goes way beyond just returning a substring from a value. SUBSTRING can be used in various ways in SQL. It's such a useful function, I've employed it wherever I can to cut off values. Here are a few examples:

Using SUBSTRING in an UPDATE query:

EpiSQL := TQuery.Create(Application);
with EpiSQL do
begin
  SQL.Clear;
  SQL.Add('UPDATE ":PRIVATE:EPIINIT1.DB"');
  SQL.Add('SET CPT4 = SUBSTRING(CPT4 FROM 1 FOR 4)');
  try
    ExecSQL;
  except
    Free;
    Abort;
  end;
end;

Using SUBSTRING in the SELECT portion of query:

EpiSQL := TQuery.Create(Application);
with EpiSQL do
begin
  SQL.Clear;
  SQL.Add('SELECT D."Ingredient Cost", D."Dispensing Fee", SUBSTRING(NDC FROM 1 FOR 9) AS NDC');
  SQL.Add('FROM "' + extractTable + '" D');
  SQL.Add('WHERE (D."Fill Date" > ''' + fDate + ''')');
  try
    Open;
  except
    Free;
    Abort;
  end;
end;

Using SUBSTRING in the WHERE portion of a query:

EpiSQL := TQuery.Create(Application);
with EpiSQL do
begin
  SQL.Clear;
  SQL.Add('SELECT * FROM "EPIWORK.DB"');
  SQL.Add('WHERE SUBSTRING(ProvId FROM 1 FOR 4) = ''9201''');
  try
    Open;
  except
    Free;
    Abort;
  end;
end;

As you can see, SUBSTRING can be employed in a variety of different ways. But here's something that I should mention: SUBSTRING is not recognized by the InterBase server. To simulate that, you will have to use the LIKE operator in the where clause. Unfortunately, that's the only place where LIKE can be used. A way around this, though, is to make an initial extract from an InterBase table and output to a Paradox or dBase file. SUBSTRING on these types of tables will work.

2004. július 26., hétfő

How to save a password to the registry

Problem/Question/Abstract:

How to save a password to the registry

Answer:

When saving the password for your application in the registry, make sure you implement some level of encryption on the password though. There are a number of algorithms out there. Below is a really easy way that I used:

This function is resposible for rotating the ASCII value of each character in the string. It takes in an integer that represents the direction to rotate the character's value.

function ROT(s: string; direction: integer): string;
var
  i: integer;
begin
  if length(s) < 1 then {if the string is empty then exit}
    exit;
  for i := 1 to length(s) do {iterate the number of characters in string}
    Inc(s[i], direction); {change the value of char by the value of direction}
  result := s; {result becomes altered string}
end;

This routine will read values from keys of the registry:

procedure ReadRegistryValues(Tci: TConnectInfo);
var
  Reg: TRegIniFile;
begin
  Reg := nil;
  {init the variable Reg}
  try
    {attempt to do the following}
    Reg := TRegIniFile.Create;
    {create blank instance}
    Reg.RootKey := HKEY_LOCAL_MACHINE;
    {set Root Key}
    if Reg.OpenKey('SOFTWARE\Futura International, Inc.\FTPUpdater', false) then
      {If the key opens successfully}
    begin
      {fill class with values from the registry}
      Tci.TC_url := ROT(Reg.ReadString(SECTION, 'Name of Key', 'string to be stored'),
        -1);
      Tci.TC_username := ROT(Reg.ReadString(SECTION, 'UserName',
        'string to be stored'), -1);
      Tci.TC_password := ROT(Reg.ReadString(SECTION, 'Password',
        'string to be stored'), -1);
      Tci.TC_username := ROT(Reg.ReadString(SECTION, 'UserName',
        'string to be stored'), -1);
      Tci.TC_passive := Reg.ReadBool(SECTION, 'string to be stored', false);
      Tci.TC_socks_add := Reg.ReadString(SECTION, 'string to be stored', '');
      Tci.TC_socks_password := ROT(Reg.ReadString(SECTION, 'string to be stored', ''),
        -1);
      Tci.TC_socks_port := Reg.ReadString(SECTION, 'string to be stored', '');
      Tci.TC_socks_usercode := Reg.ReadString(SECTION, 'string to be stored', '');
      Tci.TC_socks_version := Reg.ReadInteger(SECTION, 'string to be stored', 0);
    end;
  finally
    {when done with the above}
    Reg.Free;
    {free memory of instance of class}
  end;
end;

This routine will save values to the registry:

procedure SaveToRegistry(Tci: TConnectInfo);
var
  Reg: TRegIniFile;
begin
  Reg := nil;
  {initializing the variable}
  try
    {attempt to do the following}
    Reg := TRegIniFile.Create;
    {create an instance of TRegIniFile}
    Reg.RootKey := HKEY_LOCAL_MACHINE;
    {setting the root key}
    {The call below and in the routine above has a third param. This parameter is used
    to create the key if necessary. I set the value to true here just in case the key did not
    exist, this should only be in the first use of the program.}
    if Reg.OpenKey('SOFTWARE\Futura International, Inc.\FTPUpdater', true) then
      {this rtn will create the key}
    begin
      Reg.WriteString(SECTION, 'URL', ROT(Tci.TC_url, 1));
      {same as above routine except we are writing values instead reading them}
      Reg.WriteString(SECTION, 'UserName', ROT(Tci.TC_username, 1));
      Reg.WriteString(SECTION, 'Password', ROT(Tci.TC_password, 1));
      Reg.WriteBool(SECTION, 'Passive Connect', Tci.TC_passive);
      Reg.WriteString(SECTION, 'Socks Address', Tci.TC_socks_add);
      Reg.WriteString(SECTION, 'Socks Password', ROT(Tci.TC_socks_password, 1));
      Reg.WriteString(SECTION, 'Socks Port', Tci.TC_socks_port);
      Reg.WriteString(SECTION, 'Socks UserCode', Tci.TC_socks_usercode);
      Reg.WriteInteger(SECTION, 'Socks Version', Tci.TC_socks_version);
    end;
  finally
    Reg.Free;
    {free memory alloced to Reg}
  end;
end;

2004. július 25., vasárnap

Detect whether there is a sound card installed

Problem/Question/Abstract:

Detect whether there is a sound card installed

Answer:

Solve 1:

If you need to detect whether there is a sound card installed, your application may call the function 'waveOutGetNumDevs' from the multimedia DLL (winmm.dll, part of a standard installation).

// declare by a nicer functionname..

function SoundCardInstalled: longint; stdcall;
  external 'winmm.dll' name 'waveOutGetNumDevs';

// use like this..

if SoundCardInstalled > 0 then
  Showmessage('A sound card was found.');

Solve 2:

Add MMSystem in the uses of your form

procedure TForm1.Button1Click(Sender: TObject);
begin
  if waveOutGetNumDevs > 0 then
    ShowMessage('Hay tarjeta de sonido instalada' +
      #13 +
      'There is a soundcard installed')
  else
    ShowMessage('No Hay tarjeta de sonido instalada' + #13 +
      #13 +
      'There is not a soundcard installed');
end;

2004. július 24., szombat

Disable the functionality of a TForm to move

Problem/Question/Abstract:

How to disable the functionality of a TForm to move

Answer:

Solve 1:

unit FreezeForm;

interface

uses
  Windows, Messages, Classes, Graphics, Forms, Controls, Buttons,
  StdCtrls, ExtCtrls;

type
  TFrmFreeze = class(TForm)
    BtnValidate: TBitBtn;
    BtnSave: TBitBtn;
    BtnPreview: TBitBtn;
    BtnPrint: TBitBtn;
    BtnExit: TBitBtn;
    BtnHelp: TBitBtn;
    procedure BtnExitClick(Sender: TObject);
  private
    FOldWindowProc: TWndMethod; {Old WindowProc}
    {Window subclassing methods}
    procedure HookForm;
    procedure UnhookForm;
    procedure WndProcForm(var AMsg: TMessage);
  protected
    procedure CreateWnd; override;
  public
    destructor Destroy; override;
  end;

var
  FrmFreeze: TFrmFreeze;

implementation

{$R *.DFM}

procedure TFrmFreeze.CreateWnd;
begin
  inherited;
  if csDesigning in ComponentState then
    Exit; {Don't need to hook when designing}
  if Enabled then
  begin
    HookForm; {Hook the main form's Window}
  end;
end;

procedure TFrmFreeze.HookForm;
begin
  if csDesigning in ComponentState then
    Exit;
  FOldWindowProc := WindowProc;
  WindowProc := WndProcForm;
end;

procedure TFrmFreeze.UnhookForm;
begin
  if csDesigning in ComponentState then
    Exit;
  {If we are "hooked" then undo what Hookform did}
  if Assigned(FOldWindowProc) then
  begin
    if HandleAllocated then
    begin
      WindowProc := FOldWindowProc;
    end;
    FOldWindowProc := nil;
  end;
end;

{WndProcForm is our replacement for our WindowProc. We grab any Windows messages
that we need here}

procedure TFrmFreeze.WndProcForm(var AMsg: TMessage);
var
  cmdType: Word;
begin
  if Enabled then
  begin
    case AMsg.Msg of
      WM_SYSCOMMAND:
        begin
          cmdType := AMsg.WParam and $FFF0;
          case cmdType of
            SC_MINIMIZE, SC_MAXIMIZE, SC_MOVE, SC_SIZE:
              Exit;
          end;
        end;
      WM_GETMINMAXINFO:
        Exit;
    end;
  end;
  {Call the default windows procedure}
  FOldWindowProc(AMsg);
end;

destructor TFrmFreeze.Destroy;
begin
  if not (csDesigning in ComponentState) then
    UnhookForm; {Stop interfering ...}
  inherited Destroy;
end;

procedure TFrmFreeze.BtnExitClick(Sender: TObject);
begin
  Close;
end;

end.

Solve 2:

procedure TCoolHint2KForm.WMNCHitTest(var Message: TWMNCHitTest);
begin
  inherited;
  if Message.Result = htCaption then
    Message.Result := htNowhere;
end;

2004. július 23., péntek

Let them drag and drop files on your program

Problem/Question/Abstract:

Let them drag and drop files on your program

Answer:

If you want to let your users drag and drop files on your program from the File Manager and Windows Explorer, simply add the code inside //>>> and //<<< to your program as in the following example:

unit dropfile;

interface

uses
  Windows, Messages, SysUtils, Classes,
  Graphics, Controls, Forms, Dialogs;

type
  TForm1 = class(TForm)
    procedure FormCreate(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }

    //>>>

    // declare our DROPFILES message handler
    procedure AcceptFiles(var msg: TMessage);
      message WM_DROPFILES;
    //<<<
  end;

var
  Form1: TForm1;

implementation

uses
  //>>>
  //
  // this unit contains certain
  // functions that we'll be using
  //
  ShellAPI;
//<<<

{$R *.DFM}

//>>>

procedure TForm1.AcceptFiles(var msg: TMessage);
const
  cnMaxFileNameLen = 255;
var
  i,
    nCount: integer;
  acFileName: array[0..cnMaxFileNameLen] of char;
begin
  // find out how many files we're accepting
  nCount := DragQueryFile(msg.WParam,
    $FFFFFFFF,
    acFileName,
    cnMaxFileNameLen);

  // query Windows one at a time for the file name
  for i := 0 to nCount - 1 do
  begin
    DragQueryFile(msg.WParam, i,
      acFileName, cnMaxFileNameLen);

    // do your thing with the acFileName
    MessageBox(Handle, acFileName, '', MB_OK);
  end;

  // let Windows know that you're done
  DragFinish(msg.WParam);
end;
//<<<

procedure TForm1.FormCreate(Sender: TObject);
begin
  //>>>
  //
  // tell Windows that you're
  // accepting drag and drop files
  //
  DragAcceptFiles(Handle, True);
  //<<<
end;

end.

Now you can drag and drop files on the form that you registered as a recipient of dropped files by calling the "DragAcceptFiles()" function as in the above example.

2004. július 22., csütörtök

My Delphi help file does not show up

Problem/Question/Abstract:

When you hit F1 in your Delphi IDE or you select in the through the Help menu the item 'Index', do you experience an empty help window? Does the help file appear to you to be corrupted?

Answer:

No, you do not have to worry. There is a simple explanation for this situation. Most likely you have added too many help files to your help index.

The problem is that the WinHelp application has a very limited number of indexes that it can display and you may have just gone beyond its limit.

Unfortunately the only solution is that you will have to remove some "less important" help files from your index.
So open the Help menu, choose "Customize" (this is for Delphi version five, in earlier Delphi versions you need to use an external tool that comes with Delphi) and start removing items from the Index tab that you do not need until the index reappears.

2004. július 21., szerda

How to fill a text with a bitmap

Problem/Question/Abstract:

How to fill a text with a pattern or bitmap? I know that using fillpath(dc) will do it but how can I assign the bitmap or a gradient color to the font?

Answer:

Solve 1:

Here's one method. To make this work, add a TImage and load it with a bmp of about 256 x 256. Make the TImage Visible - False. Drop a TButton on the form. Hook up the OnClick for the button. Change the Form's Font to be something big enough, say around 80 for the size. Also, specify a True Type Font (I used Times New Roman).

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
  StdCtrls, ExtCtrls;

type
  TForm1 = class(TForm)
    Image1: TImage;
    Button1: TButton;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.DFM}

procedure TForm1.Button1Click(Sender: TObject);
var
  ClipRegion: HRGN;
  Bmp: TBitmap;
begin
  Bmp := TBitmap.Create;
  Bmp.Width := Image1.Width;
  Bmp.Height := Image1.Height;
  Bmp.Canvas.Brush.Color := clSilver;
  {You could use a different color, or another bitmap}
  Bmp.Canvas.FillRect(RECT(0, 0, Bmp.Width, Bmp.Height));
  BeginPath(Canvas.Handle);
  SetBkMode(Canvas.Handle, TRANSPARENT);
  TextOut(Canvas.Handle, 10, 30, 'Cool!', 5);
  EndPath(Canvas.Handle);
  ClipRegion := PathToRegion(Canvas.Handle);
  SelectClipRgn(Bmp.Canvas.Handle, ClipRegion);
  Bmp.Canvas.Draw(0, 0, Image1.Picture.Bitmap);
  SelectClipRgn(Bmp.Canvas.Handle, 0);
  Canvas.Draw(0, 0, Bmp);
  DeleteObject(ClipRegion);
  Bmp.Free;
end;

end.

Solve 2:

procedure TForm1.Button1Click(Sender: TObject);
var
  dc: hdc;
  SaveIndex: integer;
  bm: TBitmap;
begin
  bm := TBitmap.Create;
  bm.LoadFromFile('c:\download\test.bmp');
  Canvas.Font.Name := 'Arial';
  Canvas.Font.Height := 100;
  dc := Canvas.Handle;
  SaveIndex := SaveDc(Dc);
  SetBkMode(dc, TRANSPARENT);
  BeginPath(dc);
  Canvas.TextOut(0, 0, 'Many TeamB guys ignore me');
  EndPath(dc);
  SelectClipPath(dc, RGN_COPY);
  Canvas.Draw(0, 0, bm);
  RestoreDc(dc, SaveIndex);
  bm.Free;
end;

2004. július 20., kedd

Create a round form

Problem/Question/Abstract:

How can I make a transparent or solid round form, without caption and border?

Answer:

This is a complete example of how to make a round form. Do not forget to create a TButton to close the window

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls,
  Forms, Dialogs, ExtCtrls, Buttons, StdCtrls;

type
  TForm1 = class(TForm)
    Button1: TButton;
    procedure FormCreate(Sender: TObject);
    procedure Button1Click(Sender: TObject);
  private
    { Private-Deklarationen}
    procedure CreateParams(var Params: TCreateParams); override;
  public
    { Public-Deklarationen}
  end;

var
  Form1: TForm1;

implementation

{$R *.DFM}

{ TForm1 }

procedure TForm1.CreateParams(var Params: TCreateParams);
begin
  inherited CreateParams(Params);

  {Remove caption and border}
  Params.Style := Params.Style or ws_popup xor ws_dlgframe;
end;

procedure TForm1.FormCreate(Sender: TObject);
var
  FormRgn: hRgn;
begin
  {clear form}
  Form1.Brush.Style := bsSolid; //bsclear;
  {make form round}
  GetWindowRgn(Form1.Handle, FormRgn);

  { delete the old object }
  DeleteObject(FormRgn);
  { make the form rectangular }
  Form1.Height := 500;
  Form1.Width := Form1.Height;
  { create the round form }
  FormRgn := CreateRoundRectRgn(1, 1, Form1.Width - 1,
    Form1.height - 1, Form1.width, Form1.height);

  { set the new round window }
  SetWindowRgn(Form1.Handle, FormRgn, TRUE);
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  Form1.close;
end;

end.

2004. július 19., hétfő

How to load a file into a TMemoryStream and set the size of a string to contain the whole file

Problem/Question/Abstract:

I've come across a problem with my file type. I have a record which has an array of bytes. However, the size of this array varies, or can vary depending how big the array needs to be. How can I load the record knowing the size of the array?

Answer:

Here's a code fragment that shows how to load a file into a TMemoryStream and then set the size of a string (an array element in the example below) to contain the whole file. The variable s[i] then contains all the bytes of the file:

procedure TForm1.ButtonCombineClick(Sender: TObject);
var
  i: Integer;
  s: array[1..5] of string;
  size: Integer;
  Stream: TMemoryStream;
begin
  for i := Low(FileList) to High(FileList) do
  begin
    {Load files into strings}
    if FileExists(FileList[i]) then
    begin
      Stream := TMemoryStream.Create;
      try
        Stream.LoadFromFile(FileList[i]);
        SetLength(s[i], Stream.Size);
        Stream.Read(s[i][1], Stream.Size)
      finally
        Stream.Free
      end
    end
    else
      s[i] := '';
  end;
end;
{ ... }

2004. július 18., vasárnap

How to add items to a TComboBox upon an [ENTER] key press (2)

Problem/Question/Abstract:

I wish to display a list. I can easily create the list by adding to Items. Optionally the user can add an item to the list. When I tried it, the OnChange event fired as soon as I pressed one key. I tried using the OnExit - this crashed the app. How can the user type in a string and press Enter and get this added to the list and selected?

Answer:

By unassigning and re-assigning the OnChange event handler for the combobox. Example:

procedure TForm1.Button1Click(Sender: TObject);
begin
  if Trim(Edit1.Text) <> EmptyStr then
    if ComboBox1.Items.IndexOf(Edit1.Text) = -1 then
    begin
      ComboBox1.OnChange := nil;
      ComboBox1.Items.Add(Edit1.Text);
      ComboBox1.ItemIndex := ComboBox1.Items.IndexOf(Edit1.Text);
      ComboBox1.OnChange := ComboBox1Change;
    end;
end;

2004. július 17., szombat

How to detect if a point lies within a polygon (2)

Problem/Question/Abstract:

I have an array of polygons and I would like to check in what polygon my last mouse click occured. How can I do that?

Answer:

This routine checks for being on a line first. Polygon is set for an arbitrary 100 points. You change to your needs. MyPoint is mouse location. You do not need to move off the point with this routine, and thus it works for floats.

{ ... }
var
  Form1: TForm1;
  MyPolygon: array[1..100] of TPoint;
  MyPoint: TPoint;

implementation

{$R *.DFM}

function IsPointInPolygon: Boolean;
var
  i, j, k: Integer;
  LoY, HiY: Integer;
  LoX, HiX: Integer;
  IntersectY: Integer;
  Slope, Intercept: double;
  UseSegment, NotDone: Boolean;
  PointCount: Integer = 1;
  CrossingCount: Integer;
  Remainder: Integer;
begin
  Result := false;
  CrossingCount := 0;
  for i := 1 to PointCount - 1 do
  begin
    {Use only segments whose x values encompass point x}
    LoX := Min(MyPolygon[i].x, MyPolygon[i + 1].x);
    HiX := Max(MyPolygon[i].x, MyPolygon[i + 1].x);
    if ((MyPoint.x >= LoX) and (MyPoint.x <= HiX)) then
    begin
      {See if the x values line up with either endpoint
      a) see if we are on the endpoint exactly if yes inside the polygon and we
                                are done
      b) if lined up with lo indexed point see if line is vertical and if we are
                                on the line - then exit ignore line otherwise
      c) if lined up with hi indexed point see if line is vertical and if we are
                                on the line - then exit ignore line otherwise
         see if next segment creates a parity counting problem. If yes, skip
                                the segment}
      if (MyPoint.x = MyPolygon[i].x) then
      begin {matching lo index}
        if (MyPoint.y = MyPolygon[i].y) then
        begin {is it same as endpoint}
          Result := true;
          Exit;
        end
        else if (MyPolygon[i].x = MyPolygon[i + 1].x) then
        begin {vertical}
          LoY := Min(MyPolygon[i].y, MyPolygon[i + 1].y);
          HiY := Max(MyPolygon[i].y, MyPolygon[i + 1].y);
          {see if y coord is within the segment}
          if ((MyPoint.y >= LoY) and (MyPoint.y <= HiY)) then
          begin
            Result := true;
            Exit;
          end {if on line segment}
          else {vertical, not on line segment}
            UseSegment := false;
        end {if x's match}
        else // not a vertical line but endpoint matches drop it
          UseSegment := false;
      end {if point x matches lower index x}
      else if (MyPoint.x = MyPolygon[i + 1].x) then
      begin {matching hi index}
        {check same stuff as low point first}
        if (MyPoint.y = MyPolygon[i + 1].y) then
        begin {is it same as endpoint}
          Result := true;
          Exit;
        end
        else if (MyPolygon[i].x = MyPolygon[i + 1].x) then
        begin {vertical}
          LoY := Min(MyPolygon[i].y, MyPolygon[i + 1].y);
          HiY := Max(MyPolygon[i].y, MyPolygon[i + 1].y);
          {See if y coord is within the segment}
          if ((MyPoint.y >= LoY) and (MyPoint.y <= HiY)) then
          begin
            Result := true;
            Exit;
          end {if on line segment}
          else {vertical, not on line segment}
            UseSegment := false;
        end {if x's match}
        else
        begin {not a vertical line - but on endpoint}
          {Check the next non vertical segment to handle counting error}
          NotDone := true;
          j := i + 1;
          k := i + 2;
          if k > PointCount then
            k := 1;
          while NotDone do
          begin
            if (MyPolygon[j].x = MyPolygon[k].x) then
            begin {vertical}
              j := j + 1;
              if j > PointCount then
                j := 1;
              k := k + 1;
              if k > PointCount then
                k := 1;
            end
              {not vertical - see if we include it in the count}
            else
            begin
              NotDone := false;
              if (((MyPolygon[i].x < MyPolygon[j].x) and (MyPolygon[k].x >
                MyPolygon[j].x))
                or ((MyPolygon[i].x > MyPolygon[j].x) and (MyPolygon[k].x <
                  MyPolygon[j].x))) then
                UseSegment := true
              else
                UseSegment := false;
            end;
          end; {if not vertical}
        end; {while not done}
      end {if point x matches hi endpoint x}
      else {no endpoint matches - use the segment}
        UseSegment := true;
      if UseSegment then
      begin
        {compute the slope and intercept of non vertical segments that
                                pass the parity test}
        Slope := (MyPolygon[i].y - MyPolygon[i + 1].y) / (MyPolygon[i].x - MyPolygon[i
          + 1].x);
        Intercept := MyPolygon[i].y - (Slope * MyPolygon[i].x);
        {Compute the y value at the point of intersection of a line dropped
                                from MyPoint to the x axis and the segment}
        IntersectY := Trunc((Slope * MyPoint.x) + Intercept);
        {if the intersection is at or below count it}
        if IntersectY <= MyPoint.Y then
        begin
          {debug}
          Form1.Image1.Canvas.Pen.Color := clRed;
          CrossingCount := CrossingCount + 1;
        end;
      end; {if segment is a good candidate}
    end; {if segment x values qualify}
  end; {for all segments in the polygon}
  {last step - see if we crossed an odd number of times}
  Remainder := CrossingCount mod 2;
  if Remainder = 1 then
    Result := true; {odd crossings gets us in}
end;

2004. július 16., péntek

How to implement smart indentation in a TRichEdit

Problem/Question/Abstract:

Is it possible to get the same indenting behaviour as in the Delphi editor into a TRichEdit or a TMemo? When I press enter for a new row I want the cursor to be positioned on the same column as the row above.

Answer:

unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
  StdCtrls, ComCtrls;

type
  TForm1 = class(TForm)
    Button1: TButton;
    Button2: TButton;
    RichEdit1: TRichEdit;
    procedure RichEdit1KeyPress(Sender: TObject; var Key: Char);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

uses richedit;

{$R *.DFM}

procedure TForm1.RichEdit1KeyPress(Sender: TObject; var Key: Char);
var
  line, col, indent: integer;
  S: string;
begin
  if key = #13 then
  begin
    key := #0;
    with sender as TRichEdit do
    begin
      {figure out line and column position of caret}
      line := PerForm(EM_EXLINEFROMCHAR, 0, SelStart);
      Col := SelStart - Perform(EM_LINEINDEX, line, 0);
      {get part of current line in front of caret}
      S := Copy(lines[line], 1, col);
      {count blanks and tabs in this string}
      indent := 0;
      while (indent < length(S)) and (S[indent + 1] in [' ', #9]) do
        Inc(indent);
      {insert a linebreak followed by the substring of blanks and tabs}
      SelText := #13#10 + Copy(S, 1, indent);
    end;
  end;
end;

end.

2004. július 15., csütörtök

How to search an Excel worksheet for cells with a particular value

Problem/Question/Abstract:

How to search an Excel worksheet for cells with a particular value

Answer:

If WS is your worksheet:

{ ... }
var
  Rnge: OleVariant;
  { ... }

Rnge := WS.Cells;
Rnge := Rnge.Find('Is this text on the sheet?');
if Pointer(IDispatch(Rnge)) <> nil then
  {The text was found somewhere, so colour it pink}
  Rnge.Interior.Color := clFuchsia;

2004. július 14., szerda

Show the Windows Hierarchy

Problem/Question/Abstract:

There are times when you could do with knowing what different window Handles are for testing messaging apps, etc. Without Loading WinSight (with all it's overheads), or getting WinSpy++, here is a simple digest of Handles, Class Names, and Window Captions.

Answer:

On a Form, place a TreeView Control, and a Button.

Paste in the following 3 procedures/ functions into the implementation Code:

//---

function GetWinInfo(h: HWND): string;
var
  tmp: PChar;
begin
  //Get the HWND value in hex and Decimal
  result := inttohex(h, 8);
  result := result + ' (' + inttostr(h) + ')';
  //Get ClassName, and Window Caption
  //Allow upto 255 Characters
  GetMem(tmp, 255);
  GetClassName(h, tmp, 255);
  result := result + ': ' + tmp;
  tmp[0] := #0;
  GetWindowText(h, tmp, 255);
  result := result + ' - ' + tmp;
  FreeMem(tmp);
end;

procedure GetChildren(h: HWND; n: TTreeNode; T: TTreeview);
var
  Childhw: HWND;
  ChildNode: TTreeNode;
begin
  //Get any Children
  ChildHw := GetWindow(h, GW_CHILD);
  while Childhw > 32 do
  begin
    //Add this Handle
    ChildNode := T.Items.AddChild(n, GetWinInfo(Childhw));
    //Get any Children - Recursive call...
    GetChildren(Childhw, ChildNode, T);
    //Get the next window
    Childhw := GetWindow(Childhw, GW_HWNDNEXT);
  end;
end;

procedure GetWinTree(T: TTreeview);
var
  hw: HWND;
  node: TTreeNode;
begin
  //Loop through all Top Level Windows
  hw := FindWindow(nil, nil);
  while hw > 32 do
  begin
    //Add this Handle
    node := t.items.Add(nil, GetWinInfo(hw));
    //Get any Children
    GetChildren(hw, Node, T);
    //Get the next window
    hw := GetWindow(hw, GW_HWNDNEXT);
  end;
end;

//---

Then put something like this on the ButtonClick Event Handler...

procedure TForm1.Button1Click(Sender: TObject);
begin
  TreeView1.Items.clear;
  GetWinTree(TreeView1);
end;

You will then have a List of All current Window Handles, with all Child Windows listed with then in their correct places.

This could be expanded with searching/ grouping of like classes, etc. But I leave that to you, here is a starting place.

I have used this at various times to get M$ Class names. For instance, if you are using DAO to automatically configure an Access DB to point it's linked tables at a particular SQL Server, I used this to get the Class name of the SQL Server Login form, so that I could search for it and click the OK button before the user gets to it...

2004. július 13., kedd

Prevent a minimized window from restoring until a password is given

Problem/Question/Abstract:

I would like to have a password for windows I minimize. How can I write an application that will prevent any minimized window from restoring until a password is given?

Answer:

interface part:

bPassWordDialog: Boolean;
procedure WMSYSCOMMAND(var Msg: TWMSysCommand); message WM_SYSCOMMAND;
procedure WMActivate(var Msg: TWMActivate); message WM_ACTIVATE;

implementation part:

procedure TForm1.WMActivate(var Msg: TWMActivate);
begin
  if (Msg.Active = WA_Active) and (bPassWordDialog) then
  begin
    bPassWordDialog := False;
    ShowMessage('Show password dialog here');
  end;
  inherited;
end;

procedure TForm1.WMSYSCOMMAND(var Msg: TWMSysCommand);
begin
  bPassWordDialog := False;
  if Msg.CmdType = SC_MINIMIZE then
    bPassWordDialog := True;
  inherited;
end;

2004. július 12., hétfő

Find the contrasting colour

Problem/Question/Abstract:

Is there some kind of function that - given a background color - would return the most appropriate text and highlighted text colors? For example, suppose I have a cell in a stringgrid with some text in > it. Some of the text will be highlighted with a different color. When a user changes the background color for the cell I need a function to change to the most appropriate highlight and text color.

Answer:

It's easy to find a contrasting color for the text; imagine the background color resides inside a three-dimensional color cube with axes labeled red, green, and blue. The corner of the cube farthest away from the background color will always be contrasting. Untested:

function FindContrastingColor(Color: TColor): TColor;
var
  R, G, B: Byte;
begin
  R := GetRValue(Color);
  G := GetGValue(Color);
  B := GetBValue(Color);
  if (R < 128) then
    R := 255
  else
    R := 0;
  if (G < 128) then
    G := 255
  else
    G := 0;
  if (B < 128) then
    B := 255
  else
    B := 0;
  Result := RGB(R, G, B);
end;

As for the highlight color, you might try a hue 180 degrees distant from the background. You'll need to transform the RGB value into HSV, adjust the H value, then transform back to RGB.

2004. július 11., vasárnap

How to set all events of an object to NIL at runtime (2)

Problem/Question/Abstract:

I need to change the 'OnChange' event of all of my components. Can someone give me a starting point as to how I can go about changing this event via RTTI. I want to be able to pass an event name or set the event to nil.

Answer:

uses
  TypInfo;

procedure TFrmRTTIOnChange.Button1Click(Sender: TObject);
var
  propInfo: PPropInfo;
  thisEvent: TNotifyEvent;
begin
  propInfo := GetPropInfo(Memo1.ClassInfo, 'OnChange');
  if propInfo <> nil then
  begin
    thisEvent := Memo1AltChange;
    SetOrdProp(Memo1, PropInfo, integer(@thisEvent));
  end;
end;

procedure TFrmRTTIOnChange.Memo1Change(Sender: TObject);
begin
  Caption := 'Normal On Change';
end;

procedure TFrmRTTIOnChange.Memo1AltChange(Sender: TObject);
begin
  Caption := 'Alternate On Change';
end;

2004. július 10., szombat

How to remove the icon of an application from the Windows Taskbar

Problem/Question/Abstract:

Is there any way to remove the icon of my application from the Windows Taskbar?

Answer:

In the main form's OnShow event add:

ShowWindow(Application.Handle, SW_HIDE);

This will keep the taskbar button from showing. Now when the application is minimized, the button will appear. If you want to remove the button again on restore add to the MainForm's OnCreate event:

Application.OnRestore = Form1OnShow; {This should be the OnShow event}

and the button will rehide on restore.

2004. július 9., péntek

From what should you descend your component?

Problem/Question/Abstract:

From what should you descend your component?

Answer:

The easiest way is to decend from an existing component that has most of the characteristics you are looking for.
If you have to create a custom component from "scratch" then use one of the classes listed below.

TComponent                - The base starting point for non-visual components.
TWinControl                - The base starting point for components that need to have window handles.
TGraphicControl        - A good starting point for visual components that don't need the overhead of a window handle. This class has a Paint method, that should be overridden, but no canvas.
TCustomControl        - The most common starting point for visual components. This class has a Window handle, common events and properties, and most importantly a canvas with a Paint() method.

2004. július 8., csütörtök

Building an Easy-to-Use Parser/Parsing Framework (Part II)

Problem/Question/Abstract:

How to create a simple parsing framework to parse any kind of data?

Answer:

Welcome to the second part of my article "Building an Easy-to-Use Parser/Parsing Framework". This time, I want to show you how to create a real working dtd parser as exemplified in the first part. If you don't read my first article, please make up for this now:

Building an Easy-to-Use Parser/Parsing Framework (Part I)

As mentioned earlier, we need a dtd document which holds up all our parsed informations in an easy-to-access object model. Take a look at the following interface section:

type
  { TDTDAttributeTyp }

  TDTDAttributeTyp =
    (atData, atID, atIDRef, atEnumeration);

  { TDTDAttributeStatus }

  TDTDAttributeStatus =
    (asDefault, asImplied, asRequired, asFixed);

  { TDTDChildTyp }

  TDTDChildTyp =
    (ctElement, ctChoice, ctSequence);

  { TDTDElementTyp }

  TDTDElementTyp =
    (etAny, etEmpty, etData, etContainer);

  { TDTDElementStatus }

  TDTDElementStatus =
    (esRequired, esRequiredSeq, esOptional, esOptionalSeq);

  { TDTDItem }

  TDTDItem = class(TCollectionItem)
  private
    { Private declarations }
    FName: string;
  public
    { Public declarations }
    procedure Assign(Source: TPersistent); override;
  published
    { Published declarations }
    property Name: string read FName write FName;
  end;

  { TDTDItems }

  TDTDItems = class(TCollection)
  private
    { Private declarations }
    function GetItem(Index: Integer): TDTDItem;
    procedure SetItem(Index: Integer; Value: TDTDItem);
  public
    { Public declarations }
    function Add: TDTDItem;
    function Find(const Name: string): TDTDItem;
    property Items[Index: Integer]: TDTDItem read GetItem write SetItem;
    default;
  end;

  { TDTDEntity }

  TDTDEntity = class(TDTDItem)
  private
    { Private declarations }
  public
    { Public declarations }
    procedure Assign(Source: TPersistent); override;
  published
    { Published declarations }
  end;

  { TDTDEntities }

  TDTDEntities = class(TDTDItems)
  private
    { Private declarations }
    function GetItem(Index: Integer): TDTDEntity;
    procedure SetItem(Index: Integer; Value: TDTDEntity);
  public
    { Public declarations }
    function Add: TDTDEntity;
    function Find(const Name: string): TDTDEntity;
    property Items[Index: Integer]: TDTDEntity read GetItem write SetItem;
    default;
  end;

  { TDTDEnum }

  TDTDEnum = class(TDTDItem)
  private
    { Private declarations }
  public
    { Public declarations }
    procedure Assign(Source: TPersistent); override;
  published
    { Published declarations }
  end;

  { TDTDEnums }

  TDTDEnums = class(TDTDItems)
  private
    { Private declarations }
    function GetItem(Index: Integer): TDTDEnum;
    procedure SetItem(Index: Integer; Value: TDTDEnum);
  public
    { Public declarations }
    function Add: TDTDEnum;
    function Find(const Name: string): TDTDEnum;
    property Items[Index: Integer]: TDTDEnum read GetItem write SetItem;
    default;
  end;

  { TDTDAttribute }

  TDTDAttribute = class(TDTDItem)
  private
    { Private declarations }
    FTyp: TDTDAttributeTyp;
    FStatus: TDTDAttributeStatus;
    FDefault: string;
    FEnums: TDTDEnums;
    procedure SetEnums(Value: TDTDEnums);
  public
    { Public declarations }
    constructor Create(Collection: TCollection); override;
    destructor Destroy; override;
    procedure Assign(Source: TPersistent); override;
  published
    { Published declarations }
    property Typ: TDTDAttributeTyp read FTyp write FTyp;
    property Status: TDTDAttributeStatus read FStatus write FStatus;
    property Default: string read FDefault write FDefault;
    property Enums: TDTDEnums read FEnums write SetEnums;
  end;

  { TDTDAttributes }

  TDTDAttributes = class(TDTDItems)
  private
    { Private declarations }
    function GetItem(Index: Integer): TDTDAttribute;
    procedure SetItem(Index: Integer; Value: TDTDAttribute);
  public
    { Public declarations }
    function Add: TDTDAttribute;
    function Find(const Name: string): TDTDAttribute;
    property Items[Index: Integer]: TDTDAttribute read GetItem write
    SetItem; default;
  end;

  { TDTDProperty }

  TDTDProperty = class(TDTDItem)
  private
    { Private declarations }
    FStatus: TDTDElementStatus;
  public
    { Public declarations }
    procedure Assign(Source: TPersistent); override;
  published
    { Published declarations }
    property Status: TDTDElementStatus read FStatus write FStatus;
  end;

  { TDTDProperties}

  TDTDProperties = class(TDTDItems)
  private
    { Private declarations }
    function GetItem(Index: Integer): TDTDProperty;
    procedure SetItem(Index: Integer; Value: TDTDProperty);
  public
    { Public declarations }
    function Add: TDTDProperty;
    function Find(const Name: string): TDTDProperty;
    property Items[Index: Integer]: TDTDProperty read GetItem write
    SetItem; default;
  end;

  { TDTDChild }

  TDTDChilds = class;

  TDTDChild = class(TDTDProperty)
  private
    { Private declarations }
    FTyp: TDTDChildTyp;
    FChilds: TDTDChilds;
    procedure SetChilds(const Value: TDTDChilds);
  public
    { Public declarations }
    constructor Create(Collection: TCollection); override;
    destructor Destroy; override;
    procedure Assign(Source: TPersistent); override;
  published
    { Published declarations }
    property Typ: TDTDChildTyp read FTyp write FTyp;
    property Childs: TDTDChilds read FChilds write SetChilds;
  end;

  { TDTDChilds}

  TDTDChilds = class(TDTDProperties)
  private
    { Private declarations }
    function GetItem(Index: Integer): TDTDChild;
    procedure SetItem(Index: Integer; Value: TDTDChild);
  public
    { Public declarations }
    function Add: TDTDChild;
    function Find(const Name: string): TDTDChild;
    property Items[Index: Integer]: TDTDChild read GetItem write SetItem;
    default;
  end;

  { TDTDElement }

  TDTDElement = class(TDTDProperty)
  private
    { Private declarations }
    FTyp: TDTDElementTyp;
    FAttributes: TDTDAttributes;
    FChilds: TDTDChilds;
    procedure SetAttributes(Value: TDTDAttributes);
    procedure SetChilds(Value: TDTDChilds);
  public
    { Public declarations }
    constructor Create(Collection: TCollection); override;
    destructor Destroy; override;
    procedure Assign(Source: TPersistent); override;
  published
    { Published declarations }
    property Typ: TDTDElementTyp read FTyp write FTyp;
    property Attributes: TDTDAttributes read FAttributes write
      SetAttributes;
    property Childs: TDTDChilds read FChilds write SetChilds;
  end;

  { TDTDElements }

  TDTDElements = class(TDTDProperties)
  private
    { Private declarations }
    function GetItem(Index: Integer): TDTDElement;
    procedure SetItem(Index: Integer; Value: TDTDElement);
  public
    { Public declarations }
    function Add: TDTDElement;
    function Find(const Name: string): TDTDElement;
    property Items[Index: Integer]: TDTDElement read GetItem write
    SetItem; default;
  end;

  { TDTDDocument }

  TDTDDocument = class(TPersistent)
  private
    { Private declarations }
    FEntities: TDTDEntities;
    FElements: TDTDElements;
    procedure SetEntities(Value: TDTDEntities);
    procedure SetElements(Value: TDTDElements);
  public
    { Public declarations }
    constructor Create;
    destructor Destroy; override;
    procedure Assign(Source: TPersistent); override;
  published
    { Published declarations }
    property Entities: TDTDEntities read FEntities write SetEntities;
    property Elements: TDTDElements read FElements write SetElements;
  end;

This model implements all needed objects to parse a dtd file. Notice, that not all dtd grammars are reflected in this model, it's up to you to improve my work - but it's enough to parse all standard dtd files.

Next, we need to create our dtd parser, which will be inherited by TValidationParser as professed in Part I:

type
  { EDTDParser }

  EDTDParser = class(Exception);

  { TDTDParser }

  TDTDParser = class(TValidationParser)
  private
    { Private declarations }
    procedure ParseElement(Parser: TStringParser; Document: TDTDDocument;
      const Pass: Integer);
    procedure ParseAttlist(Parser: TStringParser; Document: TDTDDocument);
    procedure ParseFile(const FileName: string; Document: TDTDDocument;
      const Pass: Integer = 0);
  public
    { Public declarations }
    procedure Parse(const FileName: string; var Document: TDTDDocument);
  end;

The new exception class EDTDParser will be raised, if the passed filename is physical not available. One of the weightily methods is Parse. The first parameter must be an existing filename of the dtd file to be parsed. The second parameter is the document which holds our object model and must be pre-initialized. The implementation of this  method is as followed:

01. procedure TDTDParser.Parse(const FileName: string; var Document: TDTDDocument);
02. var
03.   TmpDocument: TDTDDocument;
04. begin
05.   if not assigned(Document) then
06.     raise EDTDParser.Create('Document not assigned!');
07.   TmpDocument := TDTDDocument.Create;
08.   try
09.     ParseFile(FileName, TmpDocument);
10.     if Errors.Count = 0 then
11.       Document.Assign(TmpDocument);
12.   finally
13.     TmpDocument.Free;
14.   end;
15. end;

In Line 5 we're looking if the passed document was successfully initialized; if not, an exception (EDTDParser) will be raised. After comparing that, we create a new temporary instance of a dtd document (Line 7) and parse the passed filename (Line 9). If no errors occured (Line 10) we make a copy of the filled dtd document by assigning it to the passed one (Line 11).

Consecutively we take a look to the ParseFile procedure, which initializes the main parsing process and looks for the basic keywords: (Note: The italic lines are not part of the sourcecode - they are used to explain the unique sections)

procedure TDTDParser.ParseFile(const FileName: string;
  Document: TDTDDocument; const Pass: Integer = 0);
var
  Parser: TStringParser;
begin
  {Create a new instance of the TStringParser.}
  Parser := TStringParser.Create;
  try
    {Check, if the passed filename already exists.}
    if not Parser.LoadFromFile(FileName) then
    begin
      AddErrorFmt('File "%s" not found', [FileName], Parser);
      Exit;
    end;
    {Initialize an endless loop.}
    while True do
    begin
      {Skip to the next valid Tag-Begin-Token "<" or EOF.}
      while not (Parser.Token in [toEOF, '<']) do
        Parser.SkipToken;
      {Break look, if current Token is EOF - End of File.}
      if Parser.Token = toEOF then
        Break;
      {Get the next Token - after Tag-Begin "<".}
      Parser.SkipToken;
      {Check for valid identification Tag "!" or "?".}
      if Parser.Token <> '!' then
      begin
        {Only add an error if the current Pass is one "1".}
        if not (Parser.Token in ['?']) and (Pass = 1) then
          AddError('InvalidToken', Parser);
        Continue;
      end;
      {Check for valid Symbol or Comment Line.}
      if Parser.SkipToken <> toSymbol then
      begin
        if (Parser.Token <> '-') and (Pass = 1) then
          AddError('InvalidToken', Parser);
        Continue;
      end;
      {Check for "Entity" Tag.}
      if UpperCase(Parser.TokenString) = 'ENTITY' then
        Continue;
      {Check for "Element" Tag.}
      if UpperCase(Parser.TokenString) = 'ELEMENT' then
        ParseElement(Parser, Document, Pass)
      else
        {Check for "Attribute" Tag.} if UpperCase(Parser.TokenString) = 'ATTLIST' then
        begin
          if Pass = 1 then
            ParseAttlist(Parser, Document);
        end
          {Add an error on invalid Symbols.}
        else if Pass = 1 then
          AddErrorFmt('Invalid Symbol "%s"', [Parser.TokenString], Parser);
    end;
    {Initialize Pass 2 - if currently finished Pass 1.}
    if Pass = 0 then
      ParseFile(FileName, Document, 1);
  finally
    Parser.Free;
  end;
end;

The ParseFile method simply starts parsing the main structure of a dtd file and tries to extract some basic keywords like Entity, Element or Attribute. If one of the last two keywords were found, a special (ParseElement or ParseAttlist) method is called to create the corresponding object and to extract additional informations. If the parser founds any syntax or grammar errors, respectively items are created.

The method ParseElement includes the functionality to parse and extract further informations, like Type or Rule:
(Note: The italic lines are not part of the sourcecode - they are used to explain the unique sections)

procedure TDTDParser.ParseElement(Parser: TStringParser;
  Document: TDTDDocument; const Pass: Integer);
var
  Element: TDTDElement;
  Child: TDTDChild;
  Rule: string;
begin
  {Get the next Token.}
  Parser.SkipToken;
  {On first pass, create a new element.}
  if Pass = 0 then
    Element := Document.Elements.Add
      {On second pass, find previous created element.}
  else
    Element := Document.Elements.Find(Parser.TokenString);
  {Set the new element name.}
  Element.Name := Parser.TokenString;
  try
    {Add an error if the current Token isn't a symbol.}
    if Parser.Token <> toSymbol then
      Abort;
    {Check for element rule, like "any", "empty" or "sequence"...}
    Rule := UpperCase(Parser.SkipTokenString);
    {...Found Rule: "ANY".}
    if (Rule = 'ANY') and (Parser.SkipToken = '>') then
    begin
      Element.Typ := etAny;
      Exit;
    end;
    {...Found Rule: "EMPTY".}
    if (Rule = 'EMPTY') and (Parser.SkipToken = '>') then
    begin
      Element.Typ := etEmpty;
      Exit;
    end;
    if (Rule = '(') then
    begin
      {...Found Rule: "PCDATA".}
      if Parser.SkipToken in [toEOF, '>'] then
        Abort;
      if Parser.Token = '#' then
      begin
        if UpperCase(Parser.SkipToToken('>')) = 'PCDATA)' then
        begin
          Element.Typ := etData;
          Exit;
        end;
        Abort;
      end;
      {...Found Rule: "sequence/container".}
      Element.Typ := etContainer;
      repeat
        {Create Child objects, if pass = 1.}
        Child := nil;
        if not (Parser.Token in ['|', ',', ')']) then
        begin
          if Pass = 0 then
          begin
            Child := Element.Childs.Add;
            Child.Name := Parser.TokenString;
            Child.Typ := ctElement;
          end
          else if Document.Elements.Find(Parser.TokenString) = nil then
            AddErrorFmt('Invalid Element Target "%s"', [Parser.TokenString], Parser);
        end;
        Parser.SkipToken;
        {Check Child Status (=sequence style).}
        if Parser.Token in ['+', '?', '*'] then
        begin
          if Child <> nil then
            case Parser.Token of
              '+':
                Child.Status := esRequiredSeq;
              '?':
                Child.Status := esOptional;
              '*':
                Child.Status := esOptionalSeq;
            end;
          Parser.SkipToken;
        end;
      until Parser.SkipToken in [toEOF, '>'];
      Exit;
    end;
    {Add an error only on pass 1.}
    if Pass = 1 then
      AddErrorFmt('Invalid Element Rule "%s"', [Rule], Parser);
  except
    {Add an error only on pass 1.}
    if Pass = 1 then
      AddError('InvalidElementFormat', Parser);
  end;
end;

The method ParseAttlist includes the functionality to parse and extract further informations, like Type or Enumerations: (Note: The italic lines are not part of the sourcecode - they are used to explain the unique sections)

procedure TDTDParser.ParseAttlist(Parser: TStringParser; Document: TDTDDocument);
var
  Attribute: TDTDAttribute;
  Element: TDTDElement;
  Target, Typ: string;
begin
  {Get the next Token.}
  Target := Parser.SkipTokenString;
  try
    {Add an error if the current Token isn't a symbol.}
    if Parser.Token <> toSymbol then
      Abort;
    {Try to find the element target.}
    Element := Document.Elements.Find(Target);
    {Add an error if no element was found.}
    if Element = nil then
    begin
      AddErrorFmt('Invalid Element Target "%s"', [Target], Parser);
      Exit;
    end;
    {Get the next Token.}
    Parser.SkipToken;
    repeat
      {Add an error if the current Token isn't a symbol.}
      if Parser.Token <> toSymbol then
        Abort;
      {Create a new Attribute under the located element.}
      Attribute := Element.Attributes.Add;
      {Set the new name.}
      Attribute.Name := Parser.TokenString;
      {Check for Attribute Type...}
      Typ := Parser.SkipTokenString;
      {...Found Type "CDDATA".}
      if UpperCase(Typ) = 'CDATA' then
        Attribute.Typ := atData
      else
        {...Found Type "ID".} if UpperCase(Typ) = 'ID' then
          Attribute.Typ := atID
        else
          {...Found Type "IDREF".} if UpperCase(Typ) = 'IDREF' then
            Attribute.Typ := atIDRef
          else
            {...Found Type "enumeration".} if Typ = '(' then
            begin
              Attribute.Typ := atEnumeration;
              {Seperate enumeration parts and attach them}
              {to the parent attribute.}
              repeat
                Parser.SkipToken;
                if not (Parser.Token in ['|', ')']) then
                  Attribute.Enums.Add.Name := Parser.TokenString;
              until Parser.Token in [toEOF, ')'];
              {Add an error, if current token is "EOF".}
              if Parser.Token = toEOF then
              begin
                AddErrorFmt('Invalid Enumeration End in Attribute "%s"',
                  [Attribute.Name], Parser);
                Exit;
              end;
            end
            else
            begin
              AddErrorFmt('Invalid Attribute Typ "%s"', [Typ], Parser);
              Exit;
            end;
      {Check for Restrictions...}
      Parser.SkipToken;
      if Parser.Token = '#' then
      begin
        {...Found Restriction "IMPLIED".}
        Typ := UpperCase(Parser.SkipTokenString);
        if Typ = 'IMPLIED' then
        begin
          Attribute.Status := asImplied;
          Parser.SkipToken;
        end;
        {...Found Restriction "REQUIRED".}
        if Typ = 'REQUIRED' then
        begin
          Attribute.Status := asRequired;
          Parser.SkipToken;
        end;
        {...Found Restriction "FIXED".}
        if Typ = 'FIXED' then
        begin
          Attribute.Status := asFixed;
          Parser.SkipToken;
        end;
      end;
      {Extract an optional default value.}
      if Parser.Token = '"' then
      begin
        if Attribute.Status = asImplied then
          Abort;
        Attribute.Default := Trim(Parser.SkipToToken('"'));
        Parser.SkipToken;
      end;
    until Parser.Token = '>';
  except
    AddErrorFmt('Invalid Attribute Format "%s"', [Target], Parser);
  end;
end;

Note: The above methods only detects simple dtd grammas. To parse all possible tags and additional grammars you had to include a more complex algorithm to do that - for our purposes (and this article) it's enough. If you are not familiar with the dtd syntax, check out the site W3Schools.

Okay, at this point we have finished our object-model and parser implementation. All we need now is an example application which will take use of this units. Our demo application will parse a dtd file, detects the structure and creates a simple xml output with a given startup node. Take a look at the following dtd:

<!ELEMENT Extension EMPTY>
<!ATTLIST Extension
name CDATA #REQUIRED
value CDATA #REQUIRED
>
<!ELEMENT Code (#PCDATA)>
<!ELEMENT Message (#PCDATA)>
<!ELEMENT Status (Code, Message?)>
<!ATTLIST Status
Type (Error | Warning | Information) #REQUIRED
>
<!ELEMENT BekoId (#PCDATA)>
<!ELEMENT BeraBeratungID (#PCDATA)>
<!ELEMENT BeratungsKontextResp (BekoId, BeraBeratungID, Status, Extension*)>

Our demo application will create the following xml output:

<?xml version='1.0'?>
<!DOCTYPE BeratungsKontextResp SYSTEM 'sample.dtd'>

<BeratungsKontextResp>
  <BekoId></BekoId>
  <BeraBeratungID></BeraBeratungID>
  <Status Type="">
    <Code></Code>
    <Message></Message>
  </Status>
  <Extension name="" value=""></Extension>
</BeratungsKontextResp>

In this case, the startup node is BeratungsKontextResp which will be used as the root node for all other nodes. Our example is implemented as a console application as followed:

program dtd2xml;

{$APPTYPE CONSOLE}

uses
  SysUtils,
  DTD_Parser in 'DTD_Parser.pas',
  DTD_Document in 'DTD_Document.pas',
  StringParser in 'StringParser.pas',
  PrivateParser in 'PrivateParser.pas';

var
  FileName: string;
  Switch_XMLRoot: string;
  Switch_XMLData: Boolean;
  Switch_RootLst: Boolean;
  DTDDocument: TDTDDocument;
  DTDParser: TDTDParser;
  RootElement: TDTDElement;
  i: Integer;

  {-----------------------------------------------------------------------------
    Procedure: FindCmdSwitch
    Author:    mh
    Date:      23-Jan-2002
    Arguments: const Switch: string; const Default: string = ''
    Result:    string
  -----------------------------------------------------------------------------}

function FindCmdSwitch(const Switch: string; const Default: string = ''): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to ParamCount do
    if UpperCase(Copy(ParamStr(i), 1, Length(Switch))) = UpperCase(Switch) then
    begin
      Result := Copy(ParamStr(i), Length(Switch) + 1, MAXINT);
      Exit;
    end;
  if Result = '' then
    Result := Default;
end;

{-----------------------------------------------------------------------------
  Procedure: WriteXML
  Author:    mh
  Date:      23-Jan-2002
  Arguments: const AElement: TDTDElement; const AStatus: TDTDElementStatus; Indent: Integer = 0
  Result:    None
-----------------------------------------------------------------------------}

procedure WriteXML(const AElement: TDTDElement; const AStatus: TDTDElementStatus;
  Indent: Integer = 0);
var
  i: Integer;
  Spacer, Def: string;
begin
  for i := 1 to Indent * 2 do
    Spacer := Spacer + #32;
  Write(Spacer + '<' + AElement.Name);
  for i := 0 to AElement.Attributes.Count - 1 do
    with AElement.Attributes[i] do
    begin
      Def := Default;
      if (Switch_XMLData) and (Def = '') then
      begin
        if Typ = atEnumeration then
        begin
          if Enums.Count > 0 then
            Def := Enums[0].Name
          else
            Def := '???';
        end
        else
          Def := Name;
      end;
      Write(Format(' %s="%s"', [Name, Def]));
    end;
  if AElement.Typ <> etContainer then
  begin
    Def := '';
    if (Switch_XMLData) and (AElement.Typ <> etEmpty) then
      Def := AElement.Name;
    WriteLn(Format('>%s', [Def, AElement.Name]));
  end
  else
    WriteLn('>');
  for i := 0 to AElement.Childs.Count - 1 do
    WriteXML(DTDDocument.Elements.Find(AElement.Childs[i].Name),
      AElement.Childs[i].Status, Indent + 1);
  if AElement.Typ = etContainer then
    WriteLn(Spacer + Format('', [AElement.Name]));
end;

{-----------------------------------------------------------------------------
  Procedure: main
  Author:    mh
  Date:      23-Jan-2002
  Arguments: None
  Result:    None
-----------------------------------------------------------------------------}
begin
  // display usage.
  if (ParamCount = 0) or (FindCmdSwitch('-?', '?') <> '?') then
  begin
    WriteLn('');
    WriteLn('dtd2xml (parser framework example) version 1.0');
    WriteLn('(w)ritten 2002 by Marc Hoffmann. GNU License');
    WriteLn('');
    WriteLn('Usage: dtd2xml [options] [-?]');
    WriteLn('');
    WriteLn('Options:');
    WriteLn('-xmlroot=           XML root element (? = possible elements)');
    WriteLn('-xmldata=yes|no           Include XML Example data (default = yes)');
    WriteLn('');
    Exit;
  end;

  // exract filename.
  FileName := ParamStr(1);

  // append default extenstion,
  if ExtractFileExt(FileName) = '' then
    FileName := ChangeFileExt(FileName, '.dtd');

  // file exists?
  if not FileExists(FileName) then
  begin
    WriteLn(Format('Fatal: File not found ''%s''.', [FileName]));
    Exit;
  end;

  // extract command-line switches.
  Switch_RootLst := FindCmdSwitch('-xmlroot=') = '?';
  Switch_XMLRoot := FindCmdSwitch('-xmlroot=');
  Switch_XMLData := UpperCase(FindCmdSwitch('-xmldata=')) <> 'NO';

  // create new dtd-document.
  DTDDocument := TDTDDocument.Create;
  try
    // create new dtd-parser.
    DTDParser := TDTDParser.Create;
    try
      // parse file.
      DTDParser.Parse(FileName, DTDDocument);

      // display possible errors.
      if DTDParser.Errors.Count > 0 then
      begin
        for i := 0 to DTDParser.Errors.Count - 1 do
          with DTDParser.Errors[i] do
            WriteLn(Format('Error in Line %d, Pos %d: %s...', [Line, Position,
              Message]));
        Exit;
      end;

      // search rootelement.
      RootElement := DTDDocument.Elements.Find(Switch_XMLRoot);

      // display rootelements & assign possible object.
      for i := 0 to DTDDocument.Elements.Count - 1 do
        if DTDDocument.Elements[i].Typ = etContainer then
        begin
          if Switch_RootLst then
            WriteLn(DTDDocument.Elements[i].Name)
          else if (Switch_XMLRoot = '') and ((RootElement = nil) or ((RootElement <>
            nil)
            and (RootElement.Childs.Count < DTDDocument.Elements[i].Childs.Count)))
              then
            RootElement := DTDDocument.Elements[i];
        end;

      // exit app if rootlist-switch was set.
      if Switch_RootLst then
        Exit;

      // exit app if rootelement is NIL.
      if RootElement = nil then
      begin
        WriteLn(Format('Fatal: Root Element ''%s'' not found.', [Switch_XMLRoot]));
        Exit;
      end;

      // exit app if rootelement is invalid.
      if RootElement.Typ <> etContainer then
      begin
        WriteLn(Format('Fatal: ''%s'' is not a valid Root Element.',
          [Switch_XMLRoot]));
        Exit;
      end;

      // write xml output.
      WriteLn(Format('' + #13 + '', [RootElement.Name, ExtractFileName(FileName)]));
      WriteLn('');
      WriteXML(RootElement, RootElement.Status);

      // free dtd-parser.
    finally
      DTDParser.Free;
    end;

    // free dtd-document.
  finally
    DTDDocument.Free;
  end;
end.

2004. július 7., szerda

Paradox limits

Problem/Question/Abstract:

Paradox limits

Answer:

Table and Index Files
127
Tables open per system
64
Record locks on one table (16Bit) per session
255
Record locks on one table (32Bit) per session
255
Records in transactions on a table (32 Bit)
512
Open physical files
(DB, PX, MB, X??, Y??, VAL, TV)
300
Users in one PDOXUSRS.NET file
255
Number of fields per table
255
Size of character fields
2
Billion records in a table
2
Billion bytes in .DB (Table) file
10800
Bytes per record for indexed tables
32750
Bytes per record for non-indexed tables
127
Number of secondary indexes per table
16
Number of fields in an index
255
Concurrent users per table
256
Megabytes of data per BLOB field
100
Passwords per session
15
Password length
63
Passwords per table
159
Fields with validity checks (32 Bit)
63
Fields with validity checks (16 Bit)

2004. július 6., kedd

Versioning: The InterBase Advantage

Problem/Question/Abstract:

Versioning: The InterBase Advantage

Answer:

An Examination of Database Concurrency Models

As you rush headlong into the world of client/server computing, one of the first things you must do is select a database server. The architectures of database servers vary widely, and as a result, their behavior in a given situation also varies widely.

This means that to select the appropriate server for your Delphi database application you must understand two things:

how data will be accessed and modified in your application, and
how the server will behave in each data access or update situation.

In this article we'll explore the issues that affect concurrent access to data, as well as how they may impact your application.

Locking Schemes

The oldest and most common method of controlling concurrent access to data by several users is locking. When a user locks an object in a database, he or she restricts other users' ability to access that object.

How much a lock affects concurrency depends on the lock's granularity. For example, a lock placed on an entire table will restrict other users' access to all the records in the table. Therefore, a table-level lock has very low granularity. A lock placed on a single page in a table limits access to all the records on that page. A page-level lock is more granular than a table-level lock. By contrast, a lock placed on a single row is very granular, and provides the minimum restriction to concurrent data access.

Most database servers support either row- or page-level locking. The problem with page-level locks is easy to understand by looking at an example. Suppose a page's size is 2KB (2048 bytes) and a row's size is 100 bytes. Thus, each page can hold 20 rows, and each time a page is locked, access is restricted to all 20 rows. With row-level locking only a single row would be locked, and other users could freely access other records on the page. Thus, row-level locking provides better concurrency.

Pessimistic Locking. If your background is in desktop databases (e.g. Paradox), you're probably familiar with pessimistic locking. This scheme is so named because it assumes there's a high probability that another user will try to modify the same object in the database you're changing. In a pessimistic locking environment, the object you want to change is locked before you begin changing it, and the object remains locked until your change is committed. The advantage of pessimistic locking is that you're guaranteed the ability to commit the changes you make.

Let's say you need to change a customer's address. Using pessimistic locking, you first lock the customer information at either the page or row level. You can then read the customer's record, change it, and be guaranteed that you can write your changes to the database. Once you commit your changes, your lock is released and others are free to change the customer's record. Locks can persist for a long time when pessimistic locking is used. For example, you could begin a change and then take a lunch break. After returning, you can then commit the change and release the lock.

Clearly, you want to use locks with high granularity if you're going to use pessimistic locking in a multi-user environment. If you must lock an entire page of customer records while changing a single row, then no other user can change any other customer record on that page. Row-level locks are best when pessimistic locking is used. This is because they impose the least restriction on access by other users. Page-level locks are much less satisfactory, because as long as they persist, they restrict access to many rows.

Optimistic Locking. The most common locking scheme found in database servers (e.g. Oracle, Sybase, SQL Server) is optimistic locking. The locking mechanism is optimistic in that it assumes it's unlikely another user will try to change the same row you're changing. An optimistic lock is not placed until you're done committing your changes.

To understand optimistic locking consider two users, Fred and Ethel, who are trying to change a customer's record. First, Fred reads the record and begins to make changes. Next Ethel reads the same record and begins to make changes. This is possible because in the optimistic locking scheme no lock is placed when a user reads a record and begins changing it.

Then Fred completes his changes and attempts to commit them. The database locks the record, commits the changes, and releases the lock. When Ethel tries to commit her changes, the software detects that the record has been changed since she'd read it. Ethel's change is rejected, and she must re-read the record and begin again.

Optimistic locking has a clear advantage because locks are only held for a brief period while the data is updated. This means that with an optimistic locking scheme you can achieve adequate concurrency with less lock granularity. Therefore, databases that use optimistic locking may lock at the page level and not at the row level. Conversely, optimistic locking does not fare well in an environment where there's a high probability that two users will simultaneously try to update the same row.

From the database vendor's point of view, page-level locking is advantageous because fewer locks must be placed - particularly during batch operations that affect many rows. This means the resource requirements of the lock manager module in the database management system are lower, and this can help improve the performance of the database server. However, users are invariably the slowest part of any database application, so you'll usually get better overall performance in an environment where one user cannot block another.

Why You Should Care

Understanding how your database manages locks can be critically important. Consider an Orders table. New records are added continuously as new orders are received. Because the Order data does not include a field (or fields) that would form a natural primary key, you decide to use an artificially generated Order Number as a surrogate key. Order Numbers will be assigned sequentially as orders are received.

Because your application must frequently select groups of orders, you create a clustered index on the Order Number column. A clustered index provides superior performance when retrieving adjacent records. This is because the records are physically stored in key order within the database pages.

Unfortunately, this design will probably produce poor performance if the database uses page-level locking. Because sequential adjacent keys are being assigned and a clustered index is being used, each new record added will probably be placed on the same page as the preceding record. Because the database locks at the page level, two users cannot add new orders to the same page simultaneously. Each new order must wait until the page lock placed by the preceding order is released. In this case you would get much better performance by randomly assigning the keys. This will reduce the chance that successive records will be added to the same page.

Transactions

Database servers also require the ability to group changes to the database into transactions. Transactions consist of one or more changes to one or more tables in the database that must be treated as a single unit. This is so that either all or none of the changes that comprise the transaction occur.
Transaction processing occurs in three steps:

First, tell the database you want to begin a transaction. This informs the database that all changes - until further notice - are to be treated as a single unit.
Next, the changes are made to the tables in the database.
Finally, notify the database system that you want to either commit or rollback the transaction. If you commit the transaction, the changes become permanent. All the changes are "undone" with a rollback.

Transaction processing is vital to ensure the database's logical integrity. Let's say that Fred transfers $100 from his savings account to his checking account. This transaction would proceed as follows:

Start a new transaction.
Update the savings account balance to show a withdrawal of $100.
Update the checking account balance to reflect an increase of $100.
Either commit or rollback the transaction.

Suppose the system crashes after step 2, but before step 3. Without transaction control, Fred would have lost $100. With transaction control, when the system is restarted, the database management system (DBMS) will automatically rollback any transactions not committed at the time of the system's crash. This guarantees that the database will be left in a consistent state.

You also need transaction control for read transactions that will read more than a single record. This is to ensure that the read returns a consistent view of the data. We'll discuss this requirement in more detail in the next section.

Transaction Isolation

Transaction isolation governs how simultaneously executing transactions interact with each other. Many of today's database servers were originally designed to process short update transactions intermixed with single row reads.

The perfect example of this is an automated teller machine (ATM). An ATM reads the balance in a single account, or updates the balance in one or more accounts. In this environment, transactions are short, and reads involve a single row at a time, so transaction isolation is not a serious concern. However, many of today's database applications do not fit this model.

Short update transactions are still the norm. However, the advent of executive information systems has introduced long running read transactions that span entire tables - sometimes entire databases.

Let's consider the following scenario. An executive requests the total value of the company's inventory by warehouse. While the query is scanning the inventory table, a user moves a pallet of platinum bars from warehouse A to warehouse B and commits the transaction. It's possible for the query to count the platinum in both warehouses, thus producing an erroneous inventory valuation report.

The question becomes, "Which updates should a read transaction see, and when should it see them?" This is what the transaction isolation level controls. There are three basic isolation levels:

Dirty Read - This isolation level allows any record in the database to be read whether or not it has been committed.
Read Committed - This level allows read transactions to see only those changes that were committed.
Repeatable Read - A repeatable read allows the read transaction to immediately see a snapshot of the database when the transaction began. Neither committed nor uncommitted updates that occur after the read transaction starts will be seen.

Note that the TransIsolation property of Delphi's TDatabase component allows you to set all three of these isolation levels. However, this doesn't mean that your server supports the isolation level you have selected. In addition, if you're using an ODBC driver, the driver must also support the isolation level you set. Search on "Transactions | Transaction Isolation Levels" in the Delphi online Help to view a table showing what each of these isolation levels maps to on your server.

In the example above, you need a repeatable read isolation to ensure the accuracy of your inventory valuation report. The problem is the price you must pay to get repeatable read in a database with a locking architecture. With the locking model, the only way to ensure that data does not change during a long read transaction is to prevent any updates from occurring until the read transaction ends. In many situations, the effect on users of stopping all updates for the duration of a long read transaction is unacceptable.

Versioning

Versioning is another model for concurrency control. It overcomes the problems that locking model databases have when the environment consists of a mixture of update and long read transactions. This model is called the versioning model. To date, InterBase is the only DBMS to use the versioning model.

Let's reconsider the preceding example. The read transaction to produce the inventory valuation report begins. When the update transaction to move the pallet of platinum from warehouse A to warehouse B is committed, a new version of each updated record is created. However, the old versions still exists in the database.

In a versioning database, each transaction is assigned a sequential transaction number. In addition, the DBMS maintains an inventory of all active transactions. The transaction inventory pages show whether the transaction is active, committed, or rolled back.

When an update transaction commits, the DBMS checks if there are transactions with lower transaction numbers that are still active. If so, a new version of the record is created that contains the updated values. Each version also contains the transaction number of the transaction that created it.

When a read transaction begins, it retrieves the next transaction number and a copy of the transaction inventory pages that show the status of all uncommitted transactions. As a read transaction requests each row in a table, the DBMS checks if the transaction number for the latest version of the row is greater than the transaction number of the transaction that's requesting it. The software also checks if the transaction was committed when the read transaction started.

Let's say the transaction number of the row's latest version is greater than the requesting transaction's number; or, the transaction which created the latest version was active when the read transaction started. With either scenario, the DBMS looks back through the chain of prior versions. The software continues until it encounters a version with a transaction number that is less than the transaction number of the transaction that is trying to read the row and whose transaction status was committed when the read transaction started.

When the DBMS finds the most recent version that meets these criteria, it returns that version. The result is repeatable read transaction isolation without preventing updates during the life of the read transaction.

Consider the following example of a row for which four versions exist:

Tran=100 (status=committed)
   Tran=80 (status=active when read started)
      Tran=60 (status=rolled back)
         Tran=40 (status=committed when read started)

Assume that a read transaction with transaction number 90 attempts to read this row. The read transaction will not see the version of the row created by transaction 100 because the update that created this version took place after transaction 90 began. Also, transaction 90 cannot read the version created by transaction 80, even though it has a lower transaction number. This is because transaction 80 isn't yet committed. Although the version for transaction 60 still exists on disk, transaction 60 has rolled back - and rolled back versions are always ignored. Therefore, the version that transaction 90 will read is the version created by transaction 40.

Note that in this example, transaction 80 is not allowed to commit. When transaction 80 attempts to commit, the DBMS will discover that transaction 100 has committed, and transaction 80 will be rolled back.

Advantages of Versioning

For a more complete understanding of how the locking and versioning models compare you must examine two things:

the types of concurrency conflicts that can occur in a multi-user database, and
how each model behaves in each case.

The following examples assume that the locking model uses a shared read lock and an exclusive write lock to implement optimistic locking. Multiple users can place read locks, but no user can place a write lock if another user has either a read or write lock. If one user has a write lock, another user can neither read nor write the row. This is typical of databases that use locking architecture.

Consider the case where a husband and wife go to different ATMs at the same time to withdraw money from their joint checking account. Without concurrency control, the following sequence of events occurs:

Fred reads the account's balance as $1,000.
Ethel reads the account's balance as $1,000.
Fred posts a $700 withdrawal.
Ethel posts a $500 withdrawal.

At this point, the account balance is -$200 and the bank is not happy. This happened because without a concurrency control mechanism, Fred's update is lost as far as Ethel is concerned. She never sees the change in the account balance. However, under the locking model:

Fred reads the account's balance causing a read lock.
Ethel reads the account's balance, also causing a read lock.
Fred posts his withdrawal attempting a write lock that fails because of Ethel's read lock.
Ethel posts her withdrawal attempting a write lock that fails because of Fred's read lock.

A deadlock now exists. Hopefully, the DBMS will detect the deadlock and rollback one of the transactions.

Under the versioning model, Fred reads the account's balance and Ethel reads the account's balance. Then, Fred posts his withdrawal, which causes a new version with a new balance to be written. When Ethel posts her withdrawal, it's rolled back when the newer version is detected.

A different problem occurs if a user does not commit a transaction. Let's say Fred withdraws money from the account and this updates the balance. Ethel reads the balance and Fred cancels the transaction before committing. Now Ethel has seen the wrong balance. In this case, a dependency exists between the two transactions. Ethel's transaction produces the correct results only if Fred's transaction commits. This illustrates the danger of reading uncommitted data.

Using locking, Fred reads the balance that places a read lock, and then commits his withdrawal that places a write lock during the update. Ethel reads the balance, which attempts a read lock but must wait because of Fred's write lock. Fred cancels the transaction before committing. This rolls back and releases the write lock. Ethel can now read and get the correct balance.

Under versioning, Fred withdraws the money. This updates the balance and creates a new uncommitted version. At her machine, Ethel reads the balance, but it does not reflect Fred's uncommitted withdrawal. Fred rolls back, so the version showing the withdrawal is marked rolled back. This illustrates a performance advantage of versioning because Ethel does not have to wait to read the balance.

The following is a different example, but it's the same as our earlier scenario of moving platinum from one warehouse to another:

Fred requests the total of all accounts.
Ethel transfers money from savings to checking while Fred's transaction is running.
Fred receives the wrong total. The analysis of the data is inconsistent because the data's state was not preserved throughout the life of the read transaction.

Under locking, Fred requests a total of all accounts, thereby placing a read lock. Ethel transfers money but cannot place a write lock to commit the transfer because of Fred's read lock. Ethel must wait until the read transaction finishes. Finally, Fred gets the right total and releases the read lock and Ethel's transaction can proceed.

Under versioning, Fred requests the total. At her ATM, Ethel transfers money from savings to checking, resulting in new versions which Fred's transaction does not see. Fred gets the correct total and Ethel's update is not delayed.

Another variation of the repeatable read problem occurs if you must reread the data in the course of the transaction. For example:

A query is started for all rows meeting certain criteria.
Another user inserts a new row that meets the criteria.
Repeat the query and you will get one additional row. The appearance of this "phantom row" is not consistent within the transaction.

With a database that uses the locking model, the only way to prevent this inconsistency is to read lock the whole table for the duration of the transaction. Thus the sequence of events is:

Place a read lock on the table.
Query for all records meeting certain criteria.
Another user attempts to insert a record, but is blocked by the table-level read lock.
Repeat the query and you'll get the same results because other users cannot commit changes.

Under versioning there's no problem, because the newly inserted record has a higher transaction number than the read transaction. Therefore, it's ignored on the second and subsequent reads that are part of the same transaction.

Disadvantages of Versioning

So far it looks as if the versioning model handles most concurrency conflicts better than the locking model. However, this is not always the case. In this example, Fred and Ethel are both told to make their salaries equal:

Fred reads his salary.
Ethel reads her salary.
Fred sets Ethel's salary equal to his.
Ethel sets Fred's salary equal to hers.

Under versioning, the result is that their salaries are simply swapped. Using locking you can prevent this by locking both records. For example, both Fred and Ethel read their own salaries and place read locks. Fred sets Ethel's salary equal to his, but cannot commit because of Ethel's read lock. Likewise, Ethel sets Fred's salary equal to hers, but cannot commit because of Fred's read lock.

Once again, you have a deadlock that the database system should resolve by rolling back one transaction. Another solution using locking is to write lock the entire table. For example, Fred write locks the table and reads his salary. Ethel then tries to read her salary, but is blocked by Fred's table-level write lock. Fred sets Ethel's salary equal to his and releases the write lock. Ethel's transaction is now free to proceed.

Under versioning, Fred reads his salary and Ethel reads hers. Fred sets Ethel's salary equal to his and commits. Then Ethel sets Fred's salary equal to hers and commits. Once again the salaries are swapped, because versioning allows both transactions to process concurrently. The only way to solve this problem with the versioning model is as follows:

Fred reads his salary.
Ethel reads her salary.
Fred sets Ethel's salary equal to his.
Fred sets his salary to itself, creating a newer version.
Ethel sets Fred's salary equal to hers, but it rolls back because a newer version exists.

Here the problem is solved by setting Fred's salary equal to itself. This forces the creation of a new record version for Fred's salary. Versioning architecture will not allow a change to be committed when a version of the record to be updated exists (which was created after the start of the current transaction). Therefore, Ethel's update rolls back.

Recovery

One very important issue in any database application is recovery time when the server crashes. No matter how robust your hardware and software and/or how reliable your electric power supply, there's always a possibility the server will fail. Both locking and versioning databases will recover automatically when the server is restarted. However, there's a significant difference in the recovery time.

Locking-model databases write each transaction to a log file. To recover after a crash, the DBMS must read the log file and rollback all the transactions that were active at the time of the crash by copying information from the log to the database.

A versioning database does not have a log file. The record versions in the database already provide all the information required to recover. No data needs to be copied from one place to another. Instead, when the DBMS comes back on line, it simply goes through the transaction inventory pages and changes the status of all active transactions to rolled back. At most this will take a few seconds, even on a large database or one with a large number of active transactions. Thus, crash recovery is another area where the versioning model excels.

Other Issues

At first it may appear that a versioning database has a significant disadvantage. This is because the multiple record versions will cause the database size to temporarily increase rapidly compared to a locking database. While this is true, don't forget that other databases also grow as their log files expand.

However, versioning databases will certainly grow rapidly if something is not done to control the proliferation of record versions. The DBMS performs some of the housekeeping for you automatically. Each time a record is accessed, the DBMS checks if any prior versions of that record are no longer needed. A version is obsolete if its transaction rolled back, or if there is a later committed version of the record and there are no active transactions with a transaction number less than the transaction number of the newer committed version. Versions that are obsolete are automatically deleted and the space they occupied in the database pages is reused.

Many rows in many databases are visited infrequently. To remove unnecessary versions of these rows, the database must be periodically "swept." A sweep operation visits every row in every table in the database and deletes outdated versions. You can run the sweep while the database is in use, but the sweep will impact performance while it's running.

InterBase, by default, will automatically start a sweep after 20,000 transactions. This isn't the best way to manage sweeping because you have no control over when the sweep will start. In addition, the user who starts the transaction that triggers the sweep is locked until the sweep finishes. It's better to periodically start a sweep manually when database use is low.

Conclusion

Selecting the correct database for your application requires a clear understanding of the types of transactions the system must process. Many applications today require a mixture of multi-row read transactions and updates. In this environment, versioning has a clear advantage because it can process read and write transactions concurrently while still providing repeatable read to ensure accuracy.

Versioning also provides rapid crash recovery because there's no log file to process. When a versioning database restarts, it simply marks all open but uncommitted transactions as rolled back, and it's ready to go.

As stated earlier, InterBase is the only DBMS to use the versioning model. In addition to the advantages of the versioning model, InterBase has the smallest disk and memory footprint (it ships on two diskettes), is self tuning, and runs on NetWare, Windows NT, and a wide variety of UNIX platforms. Therefore, InterBase is highly scalable.