UTF-8 web adoption is huge, closing in on 100%, but only soared since around 2006.
Posted by jpluimers on 2022/02/08
As a precursor to tomorrow's post showing that serving UTF-8 does not mean organisations are free of Unicode problems, first some statistics.
The first Unicode ideas were drafted some 35 years ago, in 1987. In 1991, more than 30 years ago, the Unicode Consortium saw the light of day. Nowadays more than 95% of web pages (close to 100% when you include plain ASCII) are served using the UTF-8 encoding.
It means that nowadays there is a very small chance you will see mangled characters (what the Japanese call mojibake) when you’re surfing the web.
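For anyone who has never seen mojibake up close, here is a minimal Python sketch of how it typically comes about: text encoded as UTF-8 gets decoded as if it were ISO-8859-1. The sample word and variable names are just my own illustration, not taken from any of the statistics below.

```python
# A word with a non-ASCII character, encoded the way most web pages are today.
original = "café"
utf8_bytes = original.encode("utf-8")

# Decoding those bytes with the wrong encoding (here ISO-8859-1) gives mojibake:
# the two UTF-8 bytes of "é" show up as two separate characters.
mangled = utf8_bytes.decode("iso-8859-1")
print(mangled)  # cafÃ©

# Decoding with the encoding that was actually used round-trips cleanly.
restored = utf8_bytes.decode("utf-8")
print(restored)  # café
```

The same mismatch between the encoding a page is stored in and the encoding a browser assumes is what used to mangle pages far more often when many encodings were in common use.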
Some nice graphs of Unicode growth are at these locations:
- Popularity of text encodings – Wikipedia
- [Wayback] W3C: Who uses Unicode?
- [Archive.is] Web Technologies Statistics and Trends: W3Techs shows statistics and trends in the usage statistics of web technologies
- 2008: [Wayback] utf-8 Growth On The Web | W3C Blog
- 2012: [Wayback] Official Google Blog: Unicode over 60 percent of the web
- 2012: [Archive.is] Usage Statistics of Character Encodings for Websites, May 2012
- 2015: [Wayback] UTF-8 Unicode vs. other encodings over time | Pinyin News
- 2020: [Archive.is] Usage Statistics and Market Share of Character Encodings for Websites, August 2020
- 2010-2021: [Archive.is] Historical yearly trends in the usage statistics of character encodings for websites, June 2021: from 50% UTF-8 in 2010 to almost 97% mid-2021 (with second-place ISO-8859-1 at just 1.3%, leaving less than 1.5% for all other encodings; see [Archive.is] Usage Statistics and Market Share of Character Encodings for Websites, June 2021)
The moments I find especially important are 2008 (when UTF-8 outgrew every other individual encoding) and slightly after 2010, when UTF-8 alone covered more than 50% of the pages served. These figures exclude ASCII-only pages; since ASCII is a subset of UTF-8, adding those would make the figures even larger.
–jeroen