Tuesday, October 5, 2004

How to convert extended characters into their HTML character entities


Problem/Question/Abstract:

I just need a routine that scans an ordinary string and replaces all occurrences of '<', '&' and all other illegal characters with the correct HTML character entity.

Answer:

In Delphi 7 (HTTPApp.pas) you have:

function HTMLEncode(const AStr: string): string;
const
  Convert = ['&', '<', '>', '"'];
var
  Sp, Rp: PChar;
begin
  SetLength(Result, Length(AStr) * 10);
  Sp := PChar(AStr);
  Rp := PChar(Result);
  while Sp^ <> #0 do
  begin
    case Sp^ of
      '&':
        begin
          FormatBuf(Rp^, 5, '&amp;', 5, []);
          Inc(Rp, 4);
        end;
      '<', '>':
        begin
          if Sp^ = '<' then
            FormatBuf(Rp^, 4, '&lt;', 4, [])
          else
            FormatBuf(Rp^, 4, '&gt;', 4, []);
          Inc(Rp, 3);
        end;
      '"':
        begin
          FormatBuf(Rp^, 6, '&quot;', 6, []);
          Inc(Rp, 5);
        end;
    else
      Rp^ := Sp^
    end;
    Inc(Rp);
    Inc(Sp);
  end;
  SetLength(Result, Rp - PChar(Result));
end;

which is pretty good. It will use quite a bit of memory on long input strings, though. For some reason it sets the result buffer to 10 times the length of the input when 6 times would have been enough (the worst case is a string made up entirely of double quotes, each of which expands to the six-character &quot;).
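If even the 6x worst-case buffer is a concern, one possible alternative is to scan the input twice: once to count the exact output length, once to write it. The following is only a sketch, not code from HTTPApp.pas or from the WOS framework; the names HTMLEncodeExact, EntityFor and OutPos are made up for illustration. Note one assumed difference in behaviour: it walks the string by index, so it does not stop at an embedded #0 the way the PChar loops do.

```pascal
{ Hypothetical two-pass variant: allocates exactly the space that is needed. }
function HTMLEncodeExact(const S: string): string;

  { Returns the entity for C, or '' if C needs no conversion. }
  function EntityFor(C: Char): string;
  begin
    case C of
      '&': Result := '&amp;';
      '<': Result := '&lt;';
      '>': Result := '&gt;';
      '"': Result := '&quot;';
    else
      Result := '';
    end;
  end;

var
  I, Len, OutPos: Integer;
  Entity: string;
begin
  { Pass 1: compute the exact length of the encoded string. }
  Len := 0;
  for I := 1 to Length(S) do
  begin
    Entity := EntityFor(S[I]);
    if Entity = '' then
      Inc(Len)
    else
      Inc(Len, Length(Entity));
  end;
  SetLength(Result, Len);

  { Pass 2: copy plain characters and entities into the result. }
  OutPos := 1;
  for I := 1 to Length(S) do
  begin
    Entity := EntityFor(S[I]);
    if Entity = '' then
    begin
      Result[OutPos] := S[I];
      Inc(OutPos);
    end
    else
    begin
      Move(Entity[1], Result[OutPos], Length(Entity) * SizeOf(Char));
      Inc(OutPos, Length(Entity));
    end;
  end;
end;
```

For example, HTMLEncodeExact('a < b & "c"') returns 'a &lt; b &amp; &quot;c&quot;'. The trade-off is a second pass over the input in exchange for a single allocation of the right size and no final SetLength to shrink the buffer.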

I rolled my own before D7 was released (part of my WOS framework - found on the D7 Companion CD):

function HTMLEncode(const S: string): string;
const
  ConversionSet = ['&', '<', '>', '"', '+']; {The '+' is included because of an IE bug}
  ConversionChars: PChar = '&<>"+';
  Entities: array[1..5] of string = ('&amp;', '&lt;', '&gt;', '&quot;', '&#43;');
var
  Sp, Rp: PChar;
  P: integer;
begin
  SetLength(Result, Length(S) * 6); {Ouch... worst case is a string of nothing but '"', 6 chars each}
  Sp := PChar(S);
  Rp := PChar(Result);
  while Sp^ <> #0 do
  begin
    if not (Sp^ in ConversionSet) then
    begin
      Rp^ := Sp^;
      Inc(Rp);
    end
    else
    begin
      P := StrScan(ConversionChars, Sp^) - ConversionChars + 1;
      StrCopy(Rp, PChar(Entities[P]));
      Inc(Rp, Length(Entities[P]));
    end;
    Inc(Sp);
  end;
  SetLength(Result, Rp - PChar(Result));
end;
