Delphi TRegExOption: Where is description of roNotEmpty option? What does this option do? – Jacek Laskowski – Google+
Posted by jpluimers on 2020/12/10
I really dislike using regular expressions, mainly because every time I bump into code using them either:
- I cannot decipher them any more
- They are used for things they are not suited for (like parsing JSON or XML: please don’t!)
For more background on when NOT to use regular expressions, remember that they describe a regular grammar, and can only describe what a finite state machine can recognize (a state machine that is in exactly one state out of a finite set of states at any given time).
As soon as you need to parse something that needs multiple states at once, or the number of states becomes infinite, regular expressions are the wrong tool.
Some background reading:
- [WayBack] Regex for parsing single key: values out of JSON in Javascript – Stack Overflow
- [WayBack] Why is it such a bad idea to parse XML with regex? – Stack Overflow
- [WayBack] html – RegEx match open tags except XHTML self-contained tags – Stack Overflow
- I think the flaw here is that HTML is a Chomsky Type 2 grammar (context free grammar) and RegEx is a Chomsky Type 3 grammar (regular grammar). Since a Type 2 grammar is fundamentally more complex than a Type 3 grammar (see the Chomsky hierarchy), it is mathematically impossible to parse XML with RegEx. But many will try, some will even claim success – but until others find the fault and totally mess you up.
- [WayBack] Exploring the Linguistics Behind Regular Expressions : How a linguistic breakthrough ended up in code
- Regular grammars, which retain no past state knowledge from input string to output string
- [WayBack] Using Finite State Automata to Implement W3C XML Schema Content Model Validation and Restriction Checking (leading to algorithms that require multiple finite state automata, and/or are polynomial in time, and/or are exponential in space)
- [WayBack] Which formal language class are XML and JSON with unique keys? – Theoretical Computer Science Stack Exchange
- [WayBack] Is JSON a Regular Language? – Theoretical Computer Science Stack Exchange
- Regular expression – Wikipedia
- Regular language – Wikipedia
- Regular grammar – Wikipedia
- Finite-state machine – Wikipedia
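Chomsky hierarchy (grammars, languages, abstract machines):
- Type-0: unrestricted grammars, recursively enumerable languages, Turing machine
- Type-1: context-sensitive grammars, context-sensitive languages, linear-bounded automaton
- Type-2: context-free grammars, context-free languages, non-deterministic pushdown automaton
- Type-3: regular grammars, regular languages, finite state automaton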
Each category of languages is a proper subset of the category directly above it. Any language in each category is generated by a grammar and recognized by an automaton in the category on the same line.
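To make the regular versus context-free distinction a bit more concrete, here is a minimal sketch (not taken from any of the linked articles) that checks whether parentheses are balanced. The program name and the IsBalanced helper are hypothetical. The point is the Depth counter: it can grow without bound, which is exactly the kind of memory a finite state machine does not have, so a true regular expression cannot express this check for arbitrary nesting depth.

```delphi
program NestingDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper for illustration only.
// Returns True when every '(' in S has a matching ')'.
function IsBalanced(const S: string): Boolean;
var
  Depth: Integer; // unbounded counter: the "extra memory" a finite state machine lacks
  C: Char;
begin
  Depth := 0;
  for C in S do
  begin
    if C = '(' then
      Inc(Depth)
    else if C = ')' then
    begin
      Dec(Depth);
      // A closing parenthesis without a preceding opening one
      if Depth < 0 then
        Exit(False);
    end;
  end;
  // Balanced only when every opening parenthesis has been closed
  Result := Depth = 0;
end;

begin
  Writeln(IsBalanced('(a(b)c)')); // balanced input
  Writeln(IsBalanced('(a(b)c'));  // one ')' is missing
  Writeln(IsBalanced(')('));      // closes before it opens
end.
```

Real JSON, XML or HTML needs more than a single counter (tag names have to match, for instance), which is why a proper parser is the safer choice there.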
But it helps to know what stuff means in the various environments you use, so [WayBack] Delphi TRegExOption: Where is description of roNotEmpty option? What does this option do? – Jacek Laskowski – Google+:
In TRegEx.Create:
  if (roNotEmpty in Options) then
    FRegEx.State := [preNotEmpty];
preNotEmpty is a member of TPerlRegExState, which is defined as:
  TPerlRegExState = set of (
    preNotBOL,   // Not Beginning Of Line: ^ does not match at the start of Subject
    preNotEOL,   // Not End Of Line: $ does not match at the end of Subject
    preNotEmpty  // Empty matches not allowed
  );
So roNotEmpty seems to indicate that empty matches are not allowed.
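A minimal sketch of how that difference might be observed, assuming a Delphi version that ships System.RegularExpressions. The program name, the Dump helper, the pattern a* and the input baaac are made up for illustration. The pattern a* can match the empty string, so without roNotEmpty empty matches may show up next to the aaa match; with roNotEmpty one would expect only non-empty matches to be reported. No output is listed here, since the exact match list has not been verified against a specific Delphi version.

```delphi
program NotEmptyDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.RegularExpressions;

// Hypothetical helper for illustration only.
// Prints the position, length and text of every match of RegEx in Input.
procedure Dump(const Title: string; const RegEx: TRegEx; const Input: string);
var
  Match: TMatch;
begin
  Writeln(Title);
  for Match in RegEx.Matches(Input) do
    Writeln(Format('  index=%d length=%d value="%s"',
      [Match.Index, Match.Length, Match.Value]));
end;

begin
  // Options are passed explicitly in both cases, so the comparison
  // does not depend on whatever default the constructor uses.
  Dump('without roNotEmpty', TRegEx.Create('a*', []), 'baaac');
  Dump('with roNotEmpty', TRegEx.Create('a*', [roNotEmpty]), 'baaac');
end.
```

Comparing the two listings, in particular any lines with length=0, should show whether the option behaves as the preNotEmpty comment suggests.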
–jeroen