Delphi: Use TStrings to parse non-standard separated strings, and validate it with DUnit tests
Posted by jpluimers on 2010/09/08
Recently, I was at a client where a project required strings to be split from:
'FI-150 1U; FI-049-I L=20 MM;LET OP LASVORM'
Into:
'FI-150 1U'
'FI-049-I L=20 MM'
'LET OP LASVORM'
At first sight, this looks simple: Semicolon Separated Values and you are done.
Not so fast Mr Smart Alec: watch the optional spaces!
The best thing for problems like these is to start with an empty implementation and some unit tests covering it.
I use DUnit for Delphi unit testing.
Unit testing should go with code coverage, but there are few Delphi code coverage articles.
I’ll get into code coverage later on, as I’m working with two of the code coverage people to get this to work nicely with Delphi 2010.
Mock objects can be a good addition to unit testing too, so in a future article, I will cover using mock objects with Delphi.
All code will be in a DUnit project (as that is easier to set up for example purposes).
In practice you will probably have your business logic in a library, and just add the business units to your DUnit test projects.
Note:
in my projects, I usually have directories like prj, src, bin and lib (for project, sources, binaries and .dcu files).
In the DUnit project, let's start with the frame for the implementation unit.
This frame defines the interface for calling the logic, and serves as a starting point to generate the unit test from.
unit SCSVSplitterUnit;

interface

uses
  Classes;

type
  TSCSVSplitter = class(TObject)
  public
    procedure Split(const SCSV: string; const Strings: TStrings);
  end;

implementation

uses
  SysUtils;

procedure TSCSVSplitter.Split(const SCSV: string; const Strings: TStrings);
begin
end;

end.
And here is the test:
unit TestSCSVSplitterUnit;

interface

uses
  TestFramework, Classes, SCSVSplitterUnit;

type
  TestTSCSVSplitter = class(TTestCase)
  strict private
    FSCSVSplitter: TSCSVSplitter;
    FStrings: TStrings;
  strict protected
    property Strings: TStrings read FStrings;
  public
    procedure SetUp; override;
    procedure TearDown; override;
  published
    procedure TestSplit;
  end;

implementation

procedure TestTSCSVSplitter.SetUp;
begin
  FSCSVSplitter := TSCSVSplitter.Create;
  FStrings := TStringList.Create();
end;

procedure TestTSCSVSplitter.TearDown;
begin
  FStrings.Free;
  FStrings := nil;
  FSCSVSplitter.Free;
  FSCSVSplitter := nil;
end;

procedure TestTSCSVSplitter.TestSplit;
begin
  FSCSVSplitter.Split('FI-150 1U; FI-049-I L=20 MM;LET OP LASVORM', Strings);
  Self.CheckEquals(3, Strings.Count);
  Self.CheckEquals('FI-150 1U', Strings[0]);
  Self.CheckEquals('FI-049-I L=20 MM', Strings[1]);
  Self.CheckEquals('LET OP LASVORM', Strings[2]);
end;

initialization
  RegisterTest(TestTSCSVSplitter.Suite);

end.
This covers the above examples in one test.
You can split this into multiple test methods if you want (that makes more complex tests easier to manage).
Note that if you do, each test method needs to be in the published section (so the test framework automatically sees it).
In addition to that, the test methods cannot have parameters.
SetUp and TearDown are called for each run of each test method.
In the test results, you see that it fails at the first check.
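As a rough sketch of what splitting the single test into several methods could look like: the unit name, class name and method names below are made up for illustration, and the unit assumes the SCSVSplitterUnit from this article plus a standard DUnit installation. While the implementation is still empty, both methods would fail.

```pascal
unit TestSCSVSplitterCasesUnit; // hypothetical unit name

interface

uses
  TestFramework, Classes, SCSVSplitterUnit;

type
  // Hypothetical test case: one published, parameterless method per scenario.
  TestTSCSVSplitterCases = class(TTestCase)
  strict private
    FSCSVSplitter: TSCSVSplitter;
    FStrings: TStrings;
  public
    procedure SetUp; override;    // called before every test method
    procedure TearDown; override; // called after every test method
  published
    procedure TestSplitCount;
    procedure TestSplitTrimsLeadingSpace;
  end;

implementation

const
  SampleText = 'FI-150 1U; FI-049-I L=20 MM;LET OP LASVORM';

procedure TestTSCSVSplitterCases.SetUp;
begin
  FSCSVSplitter := TSCSVSplitter.Create;
  FStrings := TStringList.Create;
end;

procedure TestTSCSVSplitterCases.TearDown;
begin
  FStrings.Free;
  FStrings := nil;
  FSCSVSplitter.Free;
  FSCSVSplitter := nil;
end;

procedure TestTSCSVSplitterCases.TestSplitCount;
begin
  FSCSVSplitter.Split(SampleText, FStrings);
  CheckEquals(3, FStrings.Count, 'number of parts');
end;

procedure TestTSCSVSplitterCases.TestSplitTrimsLeadingSpace;
begin
  FSCSVSplitter.Split(SampleText, FStrings);
  CheckEquals('FI-049-I L=20 MM', FStrings[1], 'second part');
end;

initialization
  RegisterTest(TestTSCSVSplitterCases.Suite);

end.
```

Because each method gets a fresh splitter and string list from SetUp, the two tests do not influence each other.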
Now you can step by step refine your implementation until all tests succeed.
The first steps to solve this are these:
- Use the semicolon (;) as a Delimiter to start splitting.
- Set the StrictDelimiter property to True (this will stop splitting on spaces and other characters <= #32). If you don’t, then TStrings will use spaces as a delimiter too, and you split across too many lines.
- Use the DelimitedText property to make the TStrings split it automatically into the indexed Strings property.
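To see the effect of StrictDelimiter for yourself, a small console sketch along these lines can help. It assumes a Delphi version (or Free Pascal in Delphi mode) whose TStrings has the StrictDelimiter property; the program name and the SplitOnSemicolon and Dump helpers are invented for this illustration.

```pascal
program StrictDelimiterDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes;

const
  SampleText = 'FI-150 1U; FI-049-I L=20 MM;LET OP LASVORM';

// Hypothetical helper: splits Text on ';' with or without StrictDelimiter.
procedure SplitOnSemicolon(const Text: string; UseStrict: Boolean;
  Strings: TStrings);
begin
  Strings.Delimiter := ';';
  Strings.StrictDelimiter := UseStrict;
  Strings.DelimitedText := Text; // replaces the current contents of Strings
end;

// Hypothetical helper: prints each item between brackets so that
// leading and trailing spaces are visible.
procedure Dump(const Caption: string; Strings: TStrings);
var
  Index: Integer;
begin
  Writeln(Caption, ': ', Strings.Count, ' item(s)');
  for Index := 0 to Strings.Count - 1 do
    Writeln('  [', Strings[Index], ']');
end;

var
  Parts: TStringList;
begin
  Parts := TStringList.Create;
  try
    SplitOnSemicolon(SampleText, False, Parts);
    Dump('StrictDelimiter = False', Parts);

    SplitOnSemicolon(SampleText, True, Parts);
    Dump('StrictDelimiter = True', Parts);
  finally
    Parts.Free;
  end;
end.
```

With StrictDelimiter set to True the list holds three items, and the second one still starts with a space. With it set to False the spaces act as delimiters too, so you get more items than you want.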
So this is the new code (only the method implementation, the rest of the unit stays the same):
...
procedure TSCSVSplitter.Split(const SCSV: string; const Strings: TStrings);
var
  Index: Integer;
begin
  Strings.Delimiter := ';';
  Strings.StrictDelimiter := True;
  Strings.DelimitedText := SCSV;
  // the next two lines are not needed per se, I used it while debugging to see the individual split strings:
  for Index := 0 to Strings.Count - 1 do
    Strings[Index] := Strings[Index];
end;
...
What you see now is that it still fails.
The reason it fails is that one of the split strings contains leading spaces.
This kind of splitting is indeed strange, but that is what you get when writing software: all the exceptions make a programmer’s life interesting :-)
So the final unit has only one line changed: add a Trim.
This looks slow, and it probably is.
But this code isn’t running a million times, so it does not need premature optimization.
unit SCSVSplitterUnit;

interface

uses
  Classes;

type
  TSCSVSplitter = class(TObject)
  public
    procedure Split(const SCSV: string; const Strings: TStrings);
  end;

implementation

uses
  SysUtils;

procedure TSCSVSplitter.Split(const SCSV: string; const Strings: TStrings);
var
  Index: Integer;
begin
  Strings.Delimiter := ';';
  Strings.StrictDelimiter := True;
  Strings.DelimitedText := SCSV;
  for Index := 0 to Strings.Count - 1 do
    Strings[Index] := Trim(Strings[Index]);
end;

end.
Now you can see the unit tests succeed: all the nodes in the tree are getting a green mark.
This unit with test case can now serve as a starting point for more tests:
- Each time a bug- or feature-request comes in, you add a test method for each
- You refine the code until all tests pass
- If your code changes break earlier tests, you see that at once, so you have a form of regression testing
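As an illustration of that workflow, suppose a (hypothetical) bug report says that parts surrounded by extra spaces are not cleaned up. The sketch below checks both the original example and that new case. It is written as a plain console program with Assert so that it runs without DUnit; in the real project the new check would simply become another published test method. SplitSCSV is a made-up standalone copy of the final Split logic.

```pascal
program RegressionCheckDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
{$ASSERTIONS ON}

uses
  SysUtils, Classes;

// Same logic as the final TSCSVSplitter.Split, as a standalone procedure.
procedure SplitSCSV(const SCSV: string; const Strings: TStrings);
var
  Index: Integer;
begin
  Strings.Delimiter := ';';
  Strings.StrictDelimiter := True;
  Strings.DelimitedText := SCSV;
  for Index := 0 to Strings.Count - 1 do
    Strings[Index] := Trim(Strings[Index]);
end;

var
  Parts: TStringList;
begin
  Parts := TStringList.Create;
  try
    // Existing behavior: the example from the article must keep working.
    SplitSCSV('FI-150 1U; FI-049-I L=20 MM;LET OP LASVORM', Parts);
    Assert(Parts.Count = 3, 'expected three parts');
    Assert(Parts[1] = 'FI-049-I L=20 MM', 'second part should be trimmed');

    // Hypothetical new report: spaces on both sides of a part.
    SplitSCSV('  A ;B  ', Parts);
    Assert(Parts.Count = 2, 'expected two parts');
    Assert(Parts[0] = 'A', 'first part should be trimmed');
    Assert(Parts[1] = 'B', 'second part should be trimmed');

    Writeln('All checks passed');
  finally
    Parts.Free;
  end;
end.
```

If a later change to SplitSCSV breaks either case, the matching Assert raises an exception, which is the regression signal the list above describes.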
Hope this sheds some light on unit testing.
–jeroen
Heinz Z. said
Hello,
there should be more DUnit-Test in the World. :-)
For me, the Splitt-Method has some side effect. Try this test (untested :-) )
jpluimers said
Cool test; it needs string and boolean constants though ;->
–jeroen
Heinz Z. said
>> it needs string and boolean constants though ;->
hm…, I don’t get the point. Using constants in unit tests is normal (you do the same in your test). So why do you mention it?
jpluimers said
You missed the smiley.
My point before the smiley was that normally, in code, I use constants for things that I repeat.
Actually, you had a very good point: it is important to make sure that invariants do not vary.
–jeroen
Oliver Giesen said
Looking forward to your article on using mock objects. It’s something I’ve been puzzling over to get right for some time now (also see my StackOverflow postings on that topic, e.g. http://stackoverflow.com/questions/2874669 or http://stackoverflow.com/questions/3448121).
Are you using PascalMock?
Cheers,
Oliver
P.S.: See you soon in Berlin?
jpluimers said
I won’t be in Berlin during the Delphi-Tage.
Since I already submitted for EKON 14, and Delphi Tage was originally planned the weekend before EKON, and my marching band is preparing for Tattoo on Stage in Lucerne the weekend after EKON, I simply could not make time, so did not submit proposals.
Preparation for Tattoo on Stage and the autumn season of the marching band very much limits my time during and around weekends anyway, so maybe next year’s Delphi Tage has an easier timing for me.
But if you are going to EKON 14, we can drink a beer there :-)
–jeroen
Fred said
Hi,
nice post with simple use of DUnit.
Just a question: what’s the purpose of
Strings[Index] := Strings[Index];
before you know you would need to Trim?
jpluimers said
I used that part for debugging purposes: to see the individual split strings.
Thanks for the remark; I have added a comment to the code to make that piece more clear.
–jeroen
PS: Just learned something new too: split is an irregular verb, so splitted does not exist in the English language