The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

  • My badges

  • Twitter Updates

    • RT @WheelieNick: Wat me nu het meeste zeer doet is dat mijn gezonde jaren zijn afgepakt. Juist toen had ik de dingen kunnen doen die ik wil… 35 minutes ago
    • RT @MaaikeBallieux: Draadje 🧵 Over rode en groene handjes Zoonlief Bram zit op de bank. Hij wil ons iets vertellen, wat 'heel erg is'. Hi… 36 minutes ago
    • RT @WheelieNick: Wanneer wordt het nou eens bekend wanneer pubers hun booster kunnen krijgen? 39 minutes ago
    • RT @samgerrits: Precies 2 jaar terug sloeg ik alarm over SARS 2.0, (compromis China & W.H.O -> #COVID19). Op basis van dit grafiekje schat… 42 minutes ago
    • RT @locuta: LET OP: Als je 08007070 belt voor het maken van een vaccinatie-afspraak voor een kind, dan krijgen sommige mensen een vraag naa… 44 minutes ago
  • My Flickr Stream

  • Pages

  • All categories

  • Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 2,570 other followers

parsing – delphi – strip out all non standard text characers from string – Stack Overflow

Posted by jpluimers on 2021/12/02

From a while back a totally non-optimised code example by me (intentionally limiting to AnsiStr as it was about filtering ASCII, and UniCode has way many code points for the Latin script).

// For those who need a disclaimer: 
// This code is meant as a sample to show you how the basic check for non-ASCII characters goes
// It will give low performance with long strings that are called often.
// Use a TStringBuilder, or SetLength & Integer loop index to optimize.
// If you need really optimized code, pass this on to the FastCode people.
function StripNonAsciiExceptCRLF(const Value: AnsiString): AnsiString;
var
  AnsiCh: AnsiChar;
begin
  for AnsiCh in Value do
    if (AnsiCh >= #32) and (AnsiCh <= #127) and (AnsiCh <> #13) and (AnsiCh <> #10) then
      Result := Result + AnsiCh;
end;

and an optimised one by [WayBack] David Heffernan

function StrippedOfNonAscii(const s: string): string;
var
  i, Count: Integer;
begin
  SetLength(Result, Length(s));
  Count := 0;
  for i := 1 to Length(s) do begin
    if ((s[i] >= #32) and (s[i] <= #127)) or (s[i] in [#10, #13]) then begin
      inc(Count);
      Result[Count] := s[i];
    end;
  end;
  SetLength(Result, Count);
end;

Even when “trivial”, I usually do not prematurely optimise as optimised code is almost always less readable than non-optimised code.

Source: [Wayback] parsing – delphi – strip out all non standard text characers from string – Stack Overflow

–jeroen

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

 
%d bloggers like this: