Sunday, January 11, 2009
Count the lines of text contained in a text file
Problem/Question/Abstract:
How to count the lines of text contained in a text file
Answer:
Solve 1:
The fastest way is to count the instances of #13#10 yourself. However, you need to be careful, because #13 and #10 can easily be swapped to give #10#13 instead, which makes this kind of counting more difficult. It is far easier to count the instances of just one of them, which has the bonus of being more compatible with non-Windows (i.e. non-CR/LF) files, since not all operating systems use both #13 and #10. The following is a basic implementation:
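function CountLines(const FileName: string): integer;
{ Needs SysUtils in the uses clause (FileOpen, FileRead, FileClose) }
const
  BufferSize = 1024;
  SearchByte = 10; { LF }
var
  FileHandle, BytesRead, Index: integer;
  Buffer: array[1..BufferSize] of byte;
  LastByte: byte;
begin
  FileHandle := FileOpen(FileName, fmOpenRead or fmShareDenyWrite);
  BytesRead := FileRead(FileHandle, Buffer[1], BufferSize);
  { An empty file has 0 lines; any other file has at least 1 }
  if (BytesRead > 0) then
    Result := 1
  else
    Result := 0;
  LastByte := 0;
  while BytesRead > 0 do
  begin
    { FileRead never returns more than BufferSize bytes }
    for Index := 1 to BytesRead do
      if (Buffer[Index] = SearchByte) then
        Inc(Result);
    LastByte := Buffer[BytesRead];
    BytesRead := FileRead(FileHandle, Buffer[1], BufferSize);
  end;
  { A trailing delimiter ends the last line; it does not start a new one }
  if LastByte = SearchByte then
    Dec(Result);
  FileClose(FileHandle);
end;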
This code searches the file for #10 bytes and treats each one as a line delimiter. The initialisation of the Result return value takes care of the edge cases: an empty file has 0 lines, but a file with no #10s has one line. You can easily modify the search byte and/or the buffer size.
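To see why counting only #10 copes with both CR/LF and LF-only text, here is a minimal sketch that works on in-memory strings instead of a file. CountLF is a hypothetical helper written for this illustration; it is not part of the code above.

```pascal
program CountLFDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

{ Hypothetical helper: counts the #10 (LF) characters in S }
function CountLF(const S: AnsiString): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if S[I] = #10 then
      Inc(Result);
end;

begin
  WriteLn(CountLF('a'#13#10'b'#13#10)); { CR/LF text: prints 2 }
  WriteLn(CountLF('a'#10'b'#10));       { LF-only text: prints 2 }
  WriteLn(CountLF('a'#13'b'#13));       { CR-only text: prints 0 }
end.
```

The last call shows the limit of the approach: text that uses #13 alone as its line break contains no #10 at all, so for such files the search byte would have to be 13 instead.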
Solve 2:
If it is a smaller file (< 1 MB), load it into a TStringList and look at the string list's Count property. If it is larger, you need to read it through completely and count the lines. A simple loop would be this:
function CountLines(const filename: string): Integer;
var
  buffer: array[0..4095] of Char;
  f: TextFile;
begin
  Result := 0;
  AssignFile(f, filename);
  Reset(f);
  try
    { Replace the default text buffer with a larger one, right after Reset }
    SetTextBuf(f, buffer, SizeOf(buffer));
    while not Eof(f) do
    begin
      ReadLn(f);
      Inc(Result);
    end;
  finally
    CloseFile(f);
  end;
end;
Using a buffer larger than the default of 128 bytes speeds up the reading somewhat.
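For the small-file case mentioned at the start of this solution, the TStringList approach could be sketched as follows. The function name CountLinesSmallFile and the little command-line wrapper are assumptions made for this example; only TStringList, LoadFromFile and Count come from the standard Classes unit.

```pascal
program CountLinesStringList;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

{ Hypothetical name: loads the whole file into memory and returns the line count }
function CountLinesSmallFile(const FileName: string): Integer;
var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    Lines.LoadFromFile(FileName);
    Result := Lines.Count;
  finally
    Lines.Free;
  end;
end;

begin
  { Usage: pass the file name as the first command-line argument }
  if ParamCount >= 1 then
    WriteLn(CountLinesSmallFile(ParamStr(1)));
end.
```

Because the whole file is held in memory as strings, this is convenient rather than fast, which is why it is only suggested for small files.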
Solve 3:
Buffering can help quite a bit:
function TextLineCount_BufferedStream(const Filename: TFileName): Integer;
{ Needs SysUtils and Classes in the uses clause }
const
  MAX_BUFFER = 1024 * 1024;
var
  oStream: TFileStream;
  sBuffer: AnsiString; {one byte per character, also in Unicode versions of Delphi}
  iBufferSize: Integer;
  iSeek: Integer;
  bCarry: Boolean;
  bOpenLine: Boolean;
begin
  Result := 0;
  bCarry := False;
  bOpenLine := False;
  oStream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(sBuffer, MAX_BUFFER);
    repeat
      iBufferSize := oStream.Read(sBuffer[1], MAX_BUFFER);
      if iBufferSize <= 0 then
        Break;
      {Skip an LF that follows a CR, even if they fall in separate buffers}
      iSeek := 1;
      if bCarry and (sBuffer[1] = #10) then
        Inc(iSeek);
      while iSeek <= iBufferSize do
      begin
        case sBuffer[iSeek] of
          #10:
            Inc(Result);
          #13:
            begin
              Inc(Result);
              {Treat a CR/LF pair inside the buffer as a single line break}
              if (iSeek < iBufferSize) and (sBuffer[iSeek + 1] = #10) then
                Inc(iSeek);
            end;
        end;
        Inc(iSeek);
      end;
      {Set carry flag for next pass}
      bCarry := (sBuffer[iBufferSize] = #13);
      {Remember whether the data read so far ends in an unterminated line}
      bOpenLine := not (sBuffer[iBufferSize] in [#10, #13]);
    until
      iBufferSize < MAX_BUFFER;
    {Count a final line that has no line break after it}
    if bOpenLine then
      Inc(Result);
  finally
    FreeAndNil(oStream);
  end;
end;
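The carry flag is the subtle part of this solution, so here is a small sketch of the same idea on in-memory chunks instead of a 1 MB file buffer. CountInChunk is a hypothetical helper written for this illustration, and unlike the function above it updates the flag after every character rather than once per buffer.

```pascal
program CarryDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

{ Hypothetical helper: counts CR, LF and CR/LF line breaks in one chunk.
  Carry must be True on entry if the previous chunk ended with a CR. }
function CountInChunk(const Chunk: AnsiString; var Carry: Boolean): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(Chunk) do
  begin
    case Chunk[I] of
      #13: Inc(Result);
      { An LF directly after a CR belongs to the same line break }
      #10: if not Carry then Inc(Result);
    end;
    Carry := (Chunk[I] = #13);
  end;
end;

var
  Carry: Boolean;
  Total: Integer;
begin
  Carry := False;
  { The text 'a'#13#10'b'#13#10 split in the middle of the first CR/LF pair }
  Total := CountInChunk('a'#13, Carry);
  Total := Total + CountInChunk(#10'b'#13#10, Carry);
  WriteLn(Total); { prints 2 }
end.
```

Without the flag, the LF at the start of the second chunk would be counted as a line break of its own and the total would come out as 3.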