I should have had the below answer when writing about StUF – receiving data from a provider where UTF-8 is in fact ISO-8859.
A while ago, a co-worker did not believe when I told that default XML encoding really is UTF-8
(and tried to force it to utf-8
), and that if the content had byte sequences different from the (either specified or default) encoding, it was a problem.
I though I blogged about the default, and where to find it, but apparently, I did not.
My blog had (and has <g>) a truckload of articles mentioning UTF-8, less articles containing UTF-8, encoding and xml, but the ones having UTF-8, default, encoding and xml did not actually tell about a standard that really defines XML uses UTF-8 as default encoding when there is no other encoding information – like BOM (byte order mark), HTTP, or MIME encoding) available.
W3C indeed specifies it. [WayBack] utf 8 – How default is the default encoding (UTF-8) in the XML Declaration? – Stack Overflow has a summary (thanks James Holderness!):
The Short Answer
Under the very specific circumstances of a UTF-8 encoded document with no external encoding information (which I understand from the comments is what you’re interested in), there is no difference between the two declarations.
The long answer is far more interesting though.
and an elaboration: