2011. március 4., péntek

Saving raw HTML source from TWebBrowser to disk


Problem/Question/Abstract:

How to save raw HTML source from TWebBrowser.Document to disk

Answer:

Solve 1:

TWebBrowser.Document implements IPersistStreamInit which exposes Save() method. All you need to know is how to use this method along with given object which implements IStream. We could simply use TStreamAdapter for this purpose.

Note that IPersistStreamInit and IStream interfaces are declared inside ActiveX unit.

Here's how to do it.

uses ActiveX...
  {...}
procedure TForm1.SaveHTMLSourceToFile(const FileName: string;
  WB: TWebBrowser);
var
  PersistStream: IPersistStreamInit;
  FileStream: TFileStream;
  Stream: IStream;
  SaveResult: HRESULT;
begin
  PersistStream := WB.Document as IPersistStreamInit;
  FileStream := TFileStream.Create(FileName, fmCreate);
  try
    Stream := TStreamAdapter.Create(FileStream, soReference) as IStream;
    SaveResult := PersistStream.Save(Stream, True);
    if FAILED(SaveResult) then
      MessageBox(Handle, 'Fail to save HTML source', 'Error', 0);
  finally
    { we are passing soReference in TStreamAdapter constructor,
      it is our responsibility to destroy the TFileStream object. }
    FileStream.Free;
  end;
end;

procedure TForm1.Button1Click(Sender: TObject);
begin
  if SaveDialog1.Execute then
    SaveHTMLSourceToFile(SaveDialog1.FileName, WebBrowser1);
end;

Here's the snippet code to navigate to password protected URL. The authorization type is Basic. You can search the internet for Base64 encode routine. There is plenty of it. I think this issue is beyond the article topic.

Actually, you can embed the Authorization info in the URL string in form http://<user>:<password>@<hostname>/ and IE will automatically put it in the request header for you.

procedure TForm1.Button1Click(Sender: TObject);
var
  URL, Flags, TargetFrameName, PostData,
    Headers: OleVariant;
begin
  // EdURL, EdPassword, and EdUserName is TEdit control
  URL := EdURL.Text;
  Flags := EmptyParam;
  TargetFrameName := EmptyParam;
  PostData := EmptyParam;
  if (EdUserName.Text <> '') and (EdPassword.Text <> '') then
    Headers := 'Authorization: Basic ' +
      Base64Encode(EdUserName.Text + ':' + EdPassword.Text)
  else
    Headers := EmptyParam;
  WebBrowser1.Navigate2(URL, Flags, TargetFrameName, PostData,
    Headers);
end;

Please visit

IE & Delphi site http://www.euromind.com/iedelphi


Component Download: HTMLSrcToFile.zip


Solve 2:

There is an easier way to perform this, see below:

try
  WebBrowser1.ExecWB(4, 0);
except
  on E: Exception do
    msError := true;
end;

This will import not only the raw HTML but also import all file dependencies, such as graphic files.  This method will present the user with a save as dialog box and a GUI representing the movement of the files (Standard AVI File).

Nincsenek megjegyzések:

Megjegyzés küldése