The hilarious answer on Stack Overflow on why not to parse HTML with RegEx
Posted by jpluimers on 2011/02/09
Quite a while ago, user bobince wrote a great answer on why not to parse HTML with RegEx.
Somehow people fail to recognize the brilliance of the answer, and try to simplify it into something like “don’t, use an XML or HTML parser instead”.
bobince even posted some nice counterexamples that are impossible to parse with RegEx (heck, even most regular HTML and XML parsers have difficulties with them).
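To give a feel for the kind of input that trips up a pattern, here is a minimal sketch in Python. The sample markup, the naive pattern, and the names `SAMPLE`, `NAIVE_HREF`, `HrefCollector`, `hrefs_with_regex` and `hrefs_with_parser` are all made up for illustration; this is not one of bobince's own examples. It assumes the goal is to collect the `href` values of real `<a>` tags, and compares a simple regular expression with the standard library's `html.parser`.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample: a commented-out link, followed by a real link
# whose title attribute contains a literal '>' character.
SAMPLE = (
    '<!-- <a href="commented-out.html"> -->'
    '<a title="x > y" href="real.html">link</a>'
)

# A naive pattern of the kind people tend to reach for (assumption:
# hrefs are always double-quoted and sit inside the opening tag).
NAIVE_HREF = re.compile(r'<a[^>]*href="([^"]*)"')


def hrefs_with_regex(html):
    """Collect href values using the naive regular expression."""
    return NAIVE_HREF.findall(html)


class HrefCollector(HTMLParser):
    """Collect href values of <a> start tags seen by the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)


def hrefs_with_parser(html):
    """Collect href values using the standard library HTML parser."""
    collector = HrefCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs


if __name__ == "__main__":
    print("regex: ", hrefs_with_regex(SAMPLE))
    print("parser:", hrefs_with_parser(SAMPLE))
```

On this sample the pattern picks up the link inside the comment and misses the real one, because `[^>]*` stops at the `>` inside the title attribute; the parser skips the comment and reports only `real.html`. You can patch the pattern for this one case, but the next odd input will need another patch.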
So: enjoy the beauty of the answer while it is still locked for editing.
–jeroen
mijnalbum.nl URLs and downloading pictures « The Wiert Corner – irregular stream of Wiert stuff said
[…] I am not a fan of using Regular Expressions for parsing general HTML, the thumbnail frame is generated in a very consistent way, so in this case I don’t mind […]