Saturday, June 11, 2011

Internet Explorer Automation


Problem/Question/Abstract:

Internet Explorer comes with Windows, so it is available on nearly every client machine of your users. Its many capabilities can be used from your Delphi application. This article contains an introduction to this subject.

Answer:

Microsoft ships its Windows product with its browser, Internet Explorer. This browser, like many MS products, is COM based, so we can use it as a component through its interfaces. The component holds all core functionality of the browser, so this functionality is available from Delphi as well. Even better, the browser can be put into edit mode, so you can use it to edit html pages as well.

Using Internet Explorer in your application may enhance its functionality considerably. Recently I wanted to use IE automation on two separate occasions. The first was at the office, where an email application needed to be enhanced with html display, as an increasing number of mails had no plain text, just html. The second was when some of the senior members of our church had difficulty maintaining the church website: they were used to MS Word only and had never bothered with things like directories, so learning to download with ftp, edit html and upload again was just too much to master in a short time. An integrated ftp / html-editing program seemed like the ideal solution.

Development environment

The first thing we will have to do is set up our development environment. This article was written with Delphi 5 Enterprise, and tested with Delphi 7 Personal edition.

Start Delphi and select "Component - Import ActiveX Control". In the list, select "Microsoft Internet Controls (version 1.1)" and add it to a new or existing package. Delphi will generate a ShDocVw_TLB.pas file. In some instances the file will be called ShDocVw.pas, for reasons which are not entirely clear. Use Windows Explorer to locate this file on your hard disk. Installing this component will also add the WebBrowser component to the Internet tab of the component palette (some friends reported finding it on their ActiveX tab). If you don't have 'Microsoft Internet Controls' in your list of ActiveX controls, import it from ShDocVw.dll.

Another thing you will need is the mshtml type library. Search your pc for files named mshtml.pas or mshtml_tlb.pas. If you don't have them, import the type library (Project - Import type library - Microsoft html object library). If you don't see this one, search your pc for mshtml.tlb, and add this file to your project. If you cannot find it (Delphi 5 Enterprise edition seems to come with the IE component already installed), go to "Component - Import ActiveX Control" again, select "Microsoft Html Object Library" and click 'create unit'. This type library is fairly large, so generating the unit may take a while if you have an old CPU.

Loading a page

The first thing you will probably do is load an html page. Nothing is simpler than that. Create a new application, go to the Internet tab of your component palette and drop a WebBrowser component on the form. Add a button, and in its OnClick event enter the code:
  
WebBrowser1.Navigate('c:\webdemo\demo1.html');

Now create this demo1.html with something like:

<html><body>hi</body></html>


Just to show it supports more than plain text, create a page demo2 like:

<html>
<body>
  <p>hi</p>
  <u>Underlined</u><b>bold</b><i>italic</i>
</body>
</html>

This is truly MS Internet Explorer, so you can load not only a page from your hard disk but also a page from the web. As an illustration, drop a TEdit component on the form and name it 'edtWebAddress'. Create a second button, label it 'Load web page', and in the OnClick event enter:

WebBrowser1.Navigate(edtWebAddress.text);
  
Run your app, and enter 'www.google.com' in the edit field. You will notice that you don't need to enter the http:// before the url; inserting it is part of the behaviour of the component, not of the shell app you know as Internet Explorer. All pages will be loaded and displayed as they would be in Internet Explorer itself. This includes forms and javascript.

Navigating to a page is just one way of loading a page. You can also load it from a stream. Add a new button to your form, label it 'Load from stream', and add:

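var
  ms: TMemoryStream;
  Adapter: IStream;
begin
  // Tekst is assumed to be a TStrings (e.g. Memo1.Lines) holding the html
  if WebBrowser1.Document = nil then
    Exit; // no document loaded yet, so there is nothing to load into
  ms := TMemoryStream.Create;
  try
    Tekst.SaveToStream(ms);
    ms.Seek(0, soFromBeginning);
    // the adapter only references ms, so we still free ms ourselves
    Adapter := TStreamAdapter.Create(ms);
    // Load returns an HResult, which is ignored here
    (WebBrowser1.Document as IPersistStreamInit).Load(Adapter);
  finally
    ms.Free;
  end;
end;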

You will have to add the ActiveX unit to your uses clause for the IPersistStreamInit declaration. The first time you start your app, no document will have been loaded, so WebBrowser1.Document will be nil. This is exactly the reason for the nil check: load a page first, then run this code. There is a slight problem: under some circumstances (especially when you load a double byte coded page), loading from a stream will show the html source instead of the intended layout. So generally you will want to navigate to a page instead.

Before we leave the Navigate command, let's allow ourselves a small digression. You may have heard that Microsoft integrated Internet Explorer and Windows Explorer. Try the next command for yourself:
      
WebBrowser1.Navigate('c:\temp\');

Forward and back

IE, like every browser, has buttons for moving forward and back through the list of visited pages. The commands for these actions are so simple, that we hardly need to comment on them:

begin
  WebBrowser1.GoBack;    // one page back in the list of visited pages
  WebBrowser1.GoForward; // and one page forward again
end;

IE keeps track of the pages you have visited, so you don't have to keep track of them yourself.
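
Moving back or forward raises an exception when there is no page in that direction. One way around this, given here as a minimal sketch, is to let the OnCommandStateChange event of the WebBrowser enable and disable your buttons. The button names btnBack and btnForward are assumptions for this example; the CSC_ constants come from the imported ShDocVw unit.

// btnBack and btnForward are hypothetical buttons on the same form
procedure TForm1.WebBrowser1CommandStateChange(Sender: TObject;
  Command: Integer; Enable: WordBool);
begin
  case Command of
    CSC_NAVIGATEBACK: btnBack.Enabled := Enable;       // history has a previous page
    CSC_NAVIGATEFORWARD: btnForward.Enabled := Enable; // history has a next page
  end;
end;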

Printing a page

To print a page once it has loaded, we can pass the command OLECMDID_PRINT to the ExecWB method of the control interface. Add another button, declare two variables of type OleVariant, and type the following code:
      
var
  vaIn, vaOut: OleVariant;
begin
  WebBrowser1.ControlInterface.ExecWB(OLECMDID_PRINT, OLECMDEXECOPT_DONTPROMPTUSER,
    vaIn, vaOut);
end;
    
Note that a document needs to have been loaded; otherwise an access violation will occur.
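
A slightly more defensive variant, as a sketch: it checks for a loaded document first and uses OLECMDEXECOPT_PROMPTUSER, so the user gets the print dialog instead of an immediate print job.

var
  vaIn, vaOut: OleVariant;
begin
  if WebBrowser1.Document = nil then
    Exit; // nothing loaded, so there is nothing to print
  // PROMPTUSER shows the print dialog before printing
  WebBrowser1.ControlInterface.ExecWB(OLECMDID_PRINT,
    OLECMDEXECOPT_PROMPTUSER, vaIn, vaOut);
end;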

Discovering busy

Some actions, like loading a webpage or printing, might take a while. In Internet Explorer itself, the moving graphic in the upper right corner indicates that the browser is still busy. But how do you find out in your program whether it's still busy? The answer is provided by the ReadyState property. Add a label to your form, and add the following code to one of the previous buttons.
      
while WebBrowser1.ReadyState <> READYSTATE_COMPLETE do
begin
  Label1.Caption := 'busy ..';
  Application.ProcessMessages;
end;
Label1.Caption := 'Ready';
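
If you would rather not loop, the WebBrowser component also has an OnDocumentComplete event. The handler below is a sketch of that alternative. It assumes the same Label1 and compares pDisp with the control interface, because the event also fires for every frame in the page.

procedure TForm1.WebBrowser1DocumentComplete(Sender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
begin
  // the event fires once per frame and a last time for the page itself;
  // only then does pDisp belong to the browser control
  if (pDisp as IWebBrowser2) = WebBrowser1.ControlInterface then
    Label1.Caption := 'Ready';
end;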
    
Retrieving and setting the html code

Once we have loaded an html page, we might want to inspect the html code. One purpose might be to save it to a file; another is to build a dedicated html editor. The html resides in the document object, whose interfaces are derived from IDispatch. We have to define a variable of the type IHTMLDocument2. This type is defined in the mshtml type library mentioned in the development environment paragraph above, so you have to include that unit in your uses clause.
  
var
  Doc: IHTMLDocument2;
  Html: string;
begin
  Doc := WebBrowser1.Document as IHTMLDocument2;
  Html := Doc.body.InnerHTML;
  ShowMessage('Innerhtml =' + Html);
  Html := Doc.body.OuterHTML;
  ShowMessage('Outerhtml =' + Html);
end;

The InnerHtml property can also be used to set the contents of the page. Simply assign a new value to Doc.Body.InnerHtml.
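
A minimal sketch of such an assignment; the html string is just an example, and the nil checks guard against the case where no document has been loaded yet:

var
  Doc: IHTMLDocument2;
begin
  Doc := WebBrowser1.Document as IHTMLDocument2;
  // body is nil until a document with a body has been loaded
  if (Doc <> nil) and (Doc.body <> nil) then
    Doc.body.innerHTML := '<p>Replaced from <b>Delphi</b></p>';
end;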

Another action you might be interested in is the retrieval of text selected by the user.
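
A sketch of how this could be done, assuming the mshtml unit is in your uses clause: the selection object of the document creates a range, and for a text selection this range supports IHTMLTxtRange, whose text property holds the selected text.

var
  Doc: IHTMLDocument2;
  Range: IHTMLTxtRange;
begin
  Doc := WebBrowser1.Document as IHTMLDocument2;
  if Doc = nil then
    Exit; // no document loaded
  // createRange returns an IDispatch; a selected control (an image, for
  // instance) gives a range that does not support IHTMLTxtRange
  if Supports(Doc.selection.createRange, IHTMLTxtRange, Range) then
    ShowMessage('Selected text = ' + Range.text);
end;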

Clipboard activation

To use Ctrl-C and Ctrl-V, we need to initialize and uninitialize OLE handling. Windows provides two API functions for this, which we can call in the initialization and finalization sections:

initialization
  OleInitialize(nil);

finalization
  OleUninitialize;

Note that you will need to include the ActiveX unit in your uses clause.

Retrieving Head section

You may have noticed that when we retrieved the InnerHtml property, we did not get everything. All lines from the head section were missing. This also applied to the OuterHtml property, though according to many sources this property contains all the html. One way to obtain them would be to write the document to a file and read the file. But there is a faster and more direct way.

The document has a property 'all' of the type IHTMLElementCollection. This property contains all the html elements, and we can simply loop through the collection.

var
  Doc: IHTMLDocument2;
  EllColl: IHTMLElementCollection;
  i: integer;
  Item: OleVariant;
begin
  Doc := WebBrowser1.Document as IHTMLDocument2;
  EllColl := Doc.all;
  for i := 0 to EllCOll.Length - 1 do
  begin
    Item := EllColl.item(i, varEmpty);
    ShowMessage(Item.tagname + '*contains*' + Item.InnerHtml);
  end;
end;      
  
The elements in this collection can also be manipulated. You could, for instance, loop through the collection, check for a certain type, and then replace the contents.
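
As a sketch of such a replacement, the code below sets the text of every H1 element. The tag name and the new text are only examples. The loop runs backwards because replacing the contents of an element can remove child elements from the collection.

var
  Doc: IHTMLDocument2;
  EllColl: IHTMLElementCollection;
  Element: IHTMLElement;
  i: Integer;
begin
  Doc := WebBrowser1.Document as IHTMLDocument2;
  EllColl := Doc.all;
  // walk backwards: children sit after their parent in the collection
  for i := EllColl.length - 1 downto 0 do
  begin
    Element := EllColl.item(i, 0) as IHTMLElement;
    // tag names are reported in upper case
    if (Element <> nil) and (Element.tagName = 'H1') then
      Element.innerText := 'New heading';
  end;
end;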

Editing

The previous paragraph introduced us to some possibilities to replace part or all of the html code with new content. But you may not always be interested in changing everything by hand. You may be more interested in letting your user do the job. The good news is that your users will be able to change content directly, without your interference. Simply set the designMode property of the document:
      
var
  Doc: IHTMLDocument2;
begin
  Doc := WebBrowser1.Document as IHTMLDocument2;
  Doc.designMode := 'On';
end;      
    
Another way to achieve the same result:
    
var
  Doc: IHTMLDocument2;
begin
  Doc := WebBrowser1.Document as IHTMLDocument2;
  Doc.body.setAttribute('contentEditable', 'true', 0);
end;      

After setting this property, your user will be able to edit the contents of the file directly. The user is even able to apply formats by pressing ctrl-b, ctrl-i and ctrl-u. So in effect, you have much of the functionality of MS Frontpage at your disposal. Of course you will have to write your own interface around it for loading and saving files.
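
For the saving part, one possibility is the IPersistFile interface of the document, sketched below. The file name is hypothetical, IPersistFile is declared in the ActiveX unit, and OleCheck (from the ComObj unit) turns a failed save into an exception.

var
  FileName: WideString;
begin
  FileName := 'c:\webdemo\edited.html'; // hypothetical target file
  if WebBrowser1.Document <> nil then
    // False: the document does not have to remember this file name
    OleCheck((WebBrowser1.Document as IPersistFile).Save(
      PWideChar(FileName), False));
end;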

Let's have a look at some of the stuff you might wish to use when writing your own html-editor.

We already remarked that your user can use Ctrl-B, Ctrl-I and Ctrl-U to make the selected text bold, italic or underlined. A nice feature, but you will probably want to provide your user with a menu option and a speedbutton for the same functionality. The IHTMLDocument2 interface provides an 'execCommand' method, which enables us to do just that:

var
  Doc: IHTMLDocument2;
begin
  Doc := WebBrowser1.Document as IHTMLDocument2;
  Doc.execCommand('Underline', False, 0);
end;    

The second parameter determines whether IE presents the user with a dialog, if one is applicable: True shows the dialog, while False, as in the above example, suppresses it (with the noticeable exception of the SaveAs command, which will always show a dialog!). The third parameter is an optional variant. Its possible values depend on the selected command.
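
Two more calls as a sketch of how the second and third parameters work together. The font name is just an example value, and for CreateLink the third parameter is assumed to be ignored because the dialog asks for the url.

var
  Doc: IHTMLDocument2;
begin
  Doc := WebBrowser1.Document as IHTMLDocument2;
  // no dialog: the font name is passed in the third parameter
  Doc.execCommand('FontName', False, 'Arial');
  // with dialog: IE asks the user for the url of the hyperlink
  Doc.execCommand('CreateLink', True, 0);
end;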

Here is a list of supported commands:

2D-Position: Allows absolutely positioned elements to be moved by dragging.

AbsolutePosition : Sets an element's position property to "absolute."

BackColor : Sets or retrieves the background color of the current selection.

Bold : Toggles the current selection between bold and nonbold.

ClearAuthenticationCache : Clears all authentication credentials from the cache.

Copy : Copies the current selection to the clipboard.

CreateBookmark : Creates a bookmark anchor or retrieves the name of a bookmark anchor for the current selection or insertion point.

CreateLink : Inserts a hyperlink on the current selection, or displays a dialog box enabling the user to specify a URL to insert as a hyperlink on the current selection.

Cut : Copies the current selection to the clipboard and then deletes it.

Delete : Deletes the current selection.

FontName : Sets or retrieves the font for the current selection.

FontSize : Sets or retrieves the font size for the current selection.

ForeColor : Sets or retrieves the foreground (text) color of the current selection.

FormatBlock : Sets the current block format tag.

Indent : Increases the indent of the selected text by one indentation increment.

InsertButton : Overwrites a button control on the text selection.

InsertFieldset : Overwrites a box on the text selection.

InsertHorizontalRule : Overwrites a horizontal line on the text selection.

InsertIFrame : Overwrites an inline frame on the text selection.

InsertImage : Overwrites an image on the text selection.

InsertInputButton : Overwrites a button control on the text selection.

InsertInputCheckbox : Overwrites a check box control on the text selection.

InsertInputFileUpload : Overwrites a file upload control on the text selection.

InsertInputHidden : Inserts a hidden control on the text selection.

InsertInputImage : Overwrites an image control on the text selection.

InsertInputPassword : Overwrites a password control on the text selection.

InsertInputRadio : Overwrites a radio control on the text selection.

InsertInputReset : Overwrites a reset control on the text selection.

InsertInputSubmit : Overwrites a submit control on the text selection.

InsertInputText : Overwrites a text control on the text selection.

InsertMarquee : Overwrites an empty marquee on the text selection.

InsertOrderedList : Toggles the text selection between an ordered list and a normal format block.

InsertParagraph : Overwrites a line break on the text selection.

InsertSelectDropdown : Overwrites a drop-down selection control on the text selection.

InsertSelectListbox : Overwrites a list box selection control on the text selection.

InsertTextArea : Overwrites a multiline text input control on the text selection.

InsertUnorderedList : Toggles the text selection between an unordered list and a normal format block.

Italic : Toggles the current selection between italic and nonitalic.

JustifyCenter : Centers the format block in which the current selection is located.

JustifyLeft : Left-justifies the format block in which the current selection is located.

JustifyRight : Right-justifies the format block in which the current selection is located.

LiveResize : Causes the MSHTML Editor to update an element's appearance continuously during a resizing or moving operation, rather than updating only at the completion of the move or resize.

MultipleSelection : Allows for the selection of more than one element at a time when the user holds down the SHIFT or CTRL keys.

Outdent : Decreases by one increment the indentation of the format block in which the current selection is located.

OverWrite : Toggles the text-entry mode between insert and overwrite.

Paste : Overwrites the contents of the clipboard on the current selection.

Print : Opens the print dialog box so the user can print the current page.

Refresh : Refreshes the current document.

RemoveFormat : Removes the formatting tags from the current selection.

SaveAs : Saves the current Web page to a file.

SelectAll : Selects the entire document.

UnBookmark : Removes any bookmark from the current selection.

Underline : Toggles the current selection between underlined and not underlined.

Unlink : Removes any hyperlink from the current selection.

Unselect : Clears the current selection.

The Document2 interface not only provides us with a method execCommand to change the document, but also with the queryCommandState method which can tell us in what state the document is.
        
if Doc.queryCommandState('JustifyLeft') then
  ShowMessage('left');
  
will tell us if the text is left justified. Note that this function only returns true if the text has been justified left explicitly; if it is left justified by default, the result is false.
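
A typical use is keeping the format buttons of your editor in sync with the selection. In the sketch below, btnBold, btnItalic and btnUnderline are hypothetical TSpeedButtons (each with its own GroupIndex and AllowAllUp set to True), and UpdateFormatButtons is a method you would add to the form yourself and call from a timer, for instance.

// btnBold, btnItalic and btnUnderline are assumed names for this example
procedure TForm1.UpdateFormatButtons;
var
  Doc: IHTMLDocument2;
begin
  Doc := WebBrowser1.Document as IHTMLDocument2;
  if Doc = nil then
    Exit; // no document loaded yet
  // each button is shown pressed when the selection has that format
  btnBold.Down := Doc.queryCommandState('Bold');
  btnItalic.Down := Doc.queryCommandState('Italic');
  btnUnderline.Down := Doc.queryCommandState('Underline');
end;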

Every command has its own peculiarities. Listing them all would make this article too long, and most of them you will easily discover yourself.

Sources

Here are some sources for further study:
  
http://msdn.microsoft.com/library/default.asp?url=/workshop/browser/editing/editdesignerovw.asp#Tutorials

tells a lot about the way Microsoft designed the built in editor of IE. Note that Microsoft has the habit of a-periodically but frequently redesigning their msdn site. So the link may have moved by the time you read this.
  
http://bdn.borland.com/article/0,1410,26574,00.html

Borland introduction to Internet Explorer automation
  
http://groups.yahoo.com/group/delphi-webbrowser/

is a newsgroup with lots of info.
  
Delphi 5 Enterprise edition comes with a small demo program. You can find it in the Demos\Coolstuf directory.
