parsing – delphi – strip out all non standard text characers from string – Stack Overflow

Posted by jpluimers on 2021/12/02

From a while back, a totally non-optimised code example by me (intentionally limited to AnsiString, as it was about filtering ASCII, and Unicode has way too many code points for the Latin script).

// For those who need a disclaimer:
// This code is meant as a sample to show you how the basic check for non-ASCII characters goes.
// It will perform poorly on long strings or when it is called often.
// Use a TStringBuilder, or SetLength & Integer loop index to optimize.
// If you need really optimized code, pass this on to the FastCode people.
function StripNonAsciiExceptCRLF(const Value: AnsiString): AnsiString;
var
  AnsiCh: AnsiChar;
begin
  Result := '';
  for AnsiCh in Value do
    // Keep #32..#127 plus CR and LF; drop everything else.
    if ((AnsiCh >= #32) and (AnsiCh <= #127)) or (AnsiCh in [#10, #13]) then
      Result := Result + AnsiCh;
end;

And an optimised one by [WayBack] David Heffernan:

function StrippedOfNonAscii(const s: string): string;
var
  i, Count: Integer;
begin
  // Allocate for the worst case once, then trim to the number of characters kept.
  SetLength(Result, Length(s));
  Count := 0;
  for i := 1 to Length(s) do begin
    if ((s[i] >= #32) and (s[i] <= #127)) or (s[i] in [#10, #13]) then begin
      Inc(Count);
      Result[Count] := s[i];
    end;
  end;
  SetLength(Result, Count);
end;
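
The disclaimer in the first sample also mentions TStringBuilder as a way to avoid repeated string concatenation. A minimal sketch of that variant follows, assuming a Unicode Delphi (2009 or later) where TStringBuilder lives in SysUtils. The function name StripNonAsciiWithBuilder and the demo program are made up for illustration; they are not part of the Stack Overflow answers.

```delphi
program StripDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical name: keeps #32..#127 plus CR and LF, drops everything else.
function StripNonAsciiWithBuilder(const Value: string): string;
var
  Builder: TStringBuilder;
  Ch: Char;
begin
  // Reserve room for the worst case so the builder does not need to grow.
  Builder := TStringBuilder.Create(Length(Value));
  try
    for Ch in Value do
      if ((Ch >= #32) and (Ch <= #127)) or (Ch = #10) or (Ch = #13) then
        Builder.Append(Ch);
    Result := Builder.ToString;
  finally
    Builder.Free;
  end;
end;

begin
  // U+00E9 and #1 are dropped; the CR/LF pair is kept.
  Writeln(StripNonAsciiWithBuilder('Caf'#$00E9' au lait'#13#10'ok'#1));
end.
```

The explicit comparisons against #10 and #13 sidestep the compiler warning that a set expression on a WideChar would otherwise raise. Readability sits somewhere between the two versions above.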

Even when “trivial”, I usually do not prematurely optimise as optimised code is almost always less readable than non-optimised code.

Source: [Wayback] parsing – delphi – strip out all non standard text characers from string – Stack Overflow


