Archive for the ‘MySQL’ Category
Posted by jpluimers on 2022/01/06
Contrary to what many believe is that MySQL utf8 is not always full blown UTF-8 support, but actually utf8mb3, which has been deprecated for a while now.
Only utf8mb4 will give you full blown UTF-8 support.
This when someone reminded me of this in a Delphi application:
When I insert :joy: emoji into mysql varchar filed I got an error :
#22007 Incorrect string value: '\xF0\x9F\x98\x82' for column 'remarks' at row 1
database charset is utf8
Note that the :joy: emoji is 😂 and has Unicode code point U+1F602 which is outside the basic multilingual plane.
See:
- [Wayback] Unicode Character ‘FACE WITH TEARS OF JOY’ (U+1F602)
- Plane (Unicode): Overview, Basic Multilingual Plane – Wikipedia
- [Archive.is] Kristian Köhntopp on Twitter: “MySQL also, for quite some time now, no longer updates its own charsets and collations internally, for the same reason. So utf8 in MySQL is utf8mb3, the three byte variant of Unicode UTF-8 implementation that covers only the BMP (unicode up to U+FFFF).”
- Kristian Köhntopp
»Where does PostgreSQL’s collation logic come from?
PostgreSQL relies on external libraries to order strings.
– libc, meaning the operating system locale facility (POSIX or Windows)
– icu, meaning the ICU project (if PostgreSQL was built with ICU support)«
- MySQL does things differently:
MySQL binary data files are independent of the host operating system in byte order, number representation (as long as the host fulfils MySQLs basic requirements), collation and even time zone handling.
- So MySQL implements collations internally, also to guarantee stability across OS updates.
If it didn’t, a libc update changing collations would mean you have to recreate a lot of indexes. Also, you would not be able to safely move data files from host to host.
- MySQL also, for quite some time now, no longer updates its own charsets and collations internally, for the same reason.
So utf8 in MySQL is utf8mb3, the three byte variant of Unicode UTF-8 implementation that covers only the BMP (unicode up to U+FFFF).
- When moving to fuller (multiplane) UTF-8 support, a new name was needed, and utf8mb4 was chosen.
So when you actually want modern utf8 in MySQL, you have to use utf8mb4, and now you know why.
- utf8 is deprecated and will be upgraded to utf8mb4 in some future MySQL release. This will be a breaking upgrade, and I wonder if it will require dropping and recreating all indexes affected by the change.
That will be painful.
- https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8mb3.html …
utf8mb3 page in the MySQL 8.0 manual, with deprecation notice.
What will change is the meaning of the alias utf8 (currently an alias for utf8mb3).
- [Wayback] MySQL: Some Character Set Basics | Die wunderbare Welt von Isotopp
- [Wayback] MySQL :: MySQL 8.0 Reference Manual :: 10.9.2 The
utf8mb3 Character Set (3-Byte UTF-8 Unicode Encoding)
utf8 is an alias for utf8mb3; the character limit is implicit, rather than explicit in the name.
Note
The utf8mb3 character set is deprecated and you should expect it to be removed in a future MySQL release. Please use utf8mb4 instead. Although utf8 is currently an alias for utf8mb3, at some point utf8 is expected to become a reference to utf8mb4. To avoid ambiguity about the meaning of utf8, consider specifying utf8mb4 explicitly for character set references instead of utf8.
- [Wayback] MySQL :: MySQL 8.0 Reference Manual :: 10.9.1 The utf8mb4 Character Set (4-Byte UTF-8 Unicode Encoding)
utf8mb4 contrasts with the utf8mb3 character set, which supports only BMP characters and uses a maximum of three bytes per character:
- For a BMP character,
utf8mb4 and utf8mb3 have identical storage characteristics: same code values, same encoding, same length.
- For a supplementary character,
utf8mb4 requires four bytes to store it, whereas utf8mb3 cannot store the character at all. When converting utf8mb3 columns to utf8mb4, you need not worry about converting supplementary characters because there are none.
–jeroen
Posted in Conference Topics, Conferences, Database Development, Delphi, Development, Encoding, Event, MySQL, Software Development, UTF-8, UTF8 | Leave a Comment »
Posted by jpluimers on 2021/07/08
For my link archive: [WayBack] Case sensitivity for SQL identifiers · ontop/ontop Wiki · GitHub:
- Oracle and H2 changes unquoted identifiers to uppercase.
- Although technically possible, Oracle explicitly recommends to not use lowercase identifers. We do not support H2 with the setting DATABASE_TO_UPPER=FALSE, if this setting is enabled all queries with names and tables in lowercase must be quoted.
- DB2 Names are not case sensitive.
- For example, the table names CUSTOMER and Customer are the same, but object names are converted to uppercase when they are entered. If a name is enclosed in quotation marks, the name becomes case sensitive. The schema name is case-sensitive, and must be specified in uppercase characters.
- Postgres changes unquoted identifiers (both columns and alias names) to lowercase.
- Mysql does not change the case of unquoted tables and schemas.
- It changes in lowercase the unquoted columns. Mysql tables are stored as files in the operating system the server runs on. This means that database and table names are not case sensitive in Windows, and case sensitive in most varieties of Unix or Linux. The backtick ` is used for enclosing identifiers such as table and column names.
- Mssqlserver All connection string property names are case-insensitive.
- For example, Password is the same as password. Identifiers of objects in a database, such as tables, views, and column names, are assigned the default collation of the database. For example, two tables with names that differ only in case can be created in a database that has case-sensitive collation, but cannot be created in a database that has case-insensitive collation. Default SQL Server is not case sensitive. SELECT * FROM SomeTable is the same as SeLeCT * frOM soMetaBLe. Delimited identifiers are enclosed in double quotation marks (“) or brackets ([]). Identifiers that comply with the rules for the format of identifiers may or may not be delimited.
–jeroen
Posted in Database Development, DB2, Development, MySQL, OracleDB, PostgreSQL, SQL Server | Leave a Comment »
Posted by jpluimers on 2021/02/18
[WayBack] showthedocs
is a documentation browser that finds the relevant docs for your code. It works by parsing the code and connecting parts of it to their explanation in the docs
, and supports these languages:
You can enter any language text, then click the language, followed by clicking the “SHOW ME THE DOCS!” button, for which an example is further below.
The site has an open architecture, allowing to plug in more languages and documentation:
gitconfig example
So for instance the below ./git/config file leads to this result [WayBack] where you can click on all the coloured areas for easy navigation through the documentation:
Read the rest of this entry »
Posted in *nix, *nix-tools, Database Development, Development, DVCS - Distributed Version Control, git, MySQL, nginx, PostgreSQL, Power User, Software Development | Leave a Comment »
Posted by jpluimers on 2019/10/30
I could not find any vendors/architectures have public database servers.
So there is no good way to go beyond SQLFiddle (of which I wrote before in SQL Fiddle | A tool for easy online testing and sharing of database problems and their solutions and David Rodriguez: a few nice posts on SQL (via: Google+)), that does not provide database access, but allows you to fire SQL statements onto these architectures:
- MySQL 5.6
- Oracle 11g R2
- PostgreSQL 9.6
- PostgreSQL 9.3
- SQLite (WebSQL)
- SQLite (SQL.js)
- MS SQL Server 2017
I get the thing (it is very hard to secure an “over the internet” connection to a database server; do NOT do this: [WayBack] connectivity – Connect to SQL Server over Internet – Database Administrators Stack Exchange), so the alternative is to run locally.
If you run locally, there are plenty of example/demo database, like:
–jeroen
Posted in Database Development, Development, MySQL, OracleDB, PostgreSQL, SQL, SQL Server, SQLite | Leave a Comment »
Posted by jpluimers on 2019/08/29
Via [WayBack] SQL select only rows with max value on a column, I bumped into http://sqlfiddle.com/#!9/a6c585/1:
Application for testing and sharing SQL queries.
Source: [WayBack] SQL Fiddle | A tool for easy online testing and sharing of database problems and their solutions.
It is a cool site, currently supporting these SQL back-ends:
- MySQL 5.6
- Oracle 11g R2
- PostgreSQL 9.6
- PostgreSQL 9.3
- SQLite (WebSQL)
- SQLite (SQL.js)
- MS SQL Server 2014
You can host it yourself using [WayBack] GitHub – zzzprojects/sqlfiddle2: New version of SQL Fiddle based on OpenIDM (in the past it was [WayBack] GitHub – zzzprojects/sqlfiddle)
Other resources for learning and playing around with SQL:
–jeroen
Posted in Database Development, Development, MySQL, OracleDB, PostgreSQL, Software Development, SQL, SQL Server | Leave a Comment »
Posted by jpluimers on 2018/06/27
Reminder to self, as yet another client insisted this was possible, but in the past it wasn’t reliably possible, some links from Kristian Köhntopp:
–jeroen
Posted in Database Development, Development, MySQL | 2 Comments »
Posted by jpluimers on 2018/06/26
I need to write up some notes, but there are some links that will help me:
It’s a question of readability. There is no difference in performance.
Old versions of SQL Server were silly enough to look up meta data, but not any more.
SELECT foo FROM bar WHERE EXISTS (SELECT * FROM baz WHERE baz.id = bar.id);
SELECT foo FROM bar WHERE EXISTS (SELECT 1 FROM baz WHERE baz.id = bar.id);
I am not considering NULL or “fun variants” which don’t seem intuitive to me.
SELECT foo FROM bar WHERE EXISTS (SELECT NULL FROM baz WHERE baz.id = bar.id);
SELECT foo FROM bar WHERE EXISTS (SELECT 1/0 FROM baz WHERE baz.id = bar.id);
The question popped up in comments just now. I researched the manuals of the most popular RDBMS:
A search on SO for code:"EXISTS (SELECT 1" yields 5,048 results.
A search on SO for code:"EXISTS (SELECT *" yields 5,154 results.
Updated links and counts 07.2015.
So SELECT * has the popular vote and the big commercial RDBMS on its side.
I find SELECT 1 more intuitive. It’s like saying “if at least one exists”.
Is SELECT * more intuitive?
–jeroen
Posted in Database Development, Development, Firebird, InterBase, MySQL, PostgreSQL, SQL, SQL Server | Leave a Comment »
Posted by jpluimers on 2017/08/02
I thought I posted a reference to this a long time ago, but didn’t.
It’s one of the things I show when explaining joins to people. Sometimes I need it myself too (:
The article explains these in greater detail:
- INNER JOIN
- LEFT JOIN
- RIGHT JOIN
- OUTER JOIN
- LEFT JOIN EXCLUDING INNER JOIN
- RIGHT JOIN EXCLUDING INNER JOIN
- OUTER JOIN EXCLUDING INNER JOIN
Note:
- the opposite of INNER JOIN is not OUTER JOIN. It’s OUTERJOIN EXCLUDING INNER JOIN
- the opposite of OUTER JOIN is empty set.
But the diagram is usually speaks for itself.
–jeroen
Source: Visual Representation of SQL Joins – CodeProject
Read the rest of this entry »
Posted in Access, Database Development, DB2, Development, Firebird, InterBase, MySQL, OracleDB, PostgreSQL, SQL, SQL Server | Leave a Comment »