Archive for January 6th, 2022

When MySQL character set ‘utf8’ does not allow you to enter some Unicode code points

Posted by jpluimers on 2022/01/06

Contrary to what many believe, MySQL utf8 is not full-blown UTF-8 support: it is actually utf8mb3, which has been deprecated for a while now.

Only utf8mb4 will give you full-blown UTF-8 support.

Someone reminded me of this with a question about a Delphi application:

When I insert :joy: emoji into mysql varchar field I got an error :
#22007 Incorrect string value: '\xF0\x9F\x98\x82' for column 'remarks' at row 1

database charset is utf8

Note that the :joy: emoji is 😂 and has Unicode code point U+1F602, which is outside the Basic Multilingual Plane (BMP).
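The four bytes in the error message are exactly the UTF-8 encoding of that code point. A minimal Python sketch (Python is used here only for illustration; the original application is Delphi) shows this, and checks the code point against the U+FFFF limit of the BMP:

```python
# Illustration only: inspect the code point and UTF-8 bytes of the :joy: emoji.
joy = "\U0001F602"  # FACE WITH TEARS OF JOY

code_point = ord(joy)
utf8_bytes = joy.encode("utf-8")

print(f"U+{code_point:04X}")                        # U+1F602
print(" ".join(f"\\x{b:02X}" for b in utf8_bytes))  # \xF0 \x9F \x98 \x82
print(code_point > 0xFFFF)                          # True: outside the BMP
print(len(utf8_bytes))                              # 4: one byte more than utf8mb3 can store
```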

See:

[Wayback] Unicode Character ‘FACE WITH TEARS OF JOY’ (U+1F602)
Plane (Unicode): Overview, Basic Multilingual Plane – Wikipedia
[Archive.is] Kristian Köhntopp on Twitter: “MySQL also, for quite some time now, no longer updates its own charsets and collations internally, for the same reason. So utf8 in MySQL is utf8mb3, the three byte variant of Unicode UTF-8 implementation that covers only the BMP (unicode up to U+FFFF).”
- Kristian Köhntopp:
  »Where does PostgreSQL’s collation logic come from?
  PostgreSQL relies on external libraries to order strings.
  – libc, meaning the operating system locale facility (POSIX or Windows)
  – icu, meaning the ICU project (if PostgreSQL was built with ICU support)«
- MySQL does things differently:
  MySQL binary data files are independent of the host operating system in byte order, number representation (as long as the host fulfils MySQL's basic requirements), collation and even time zone handling.
- So MySQL implements collations internally, also to guarantee stability across OS updates.
  If it didn’t, a libc update changing collations would mean you have to recreate a lot of indexes. Also, you would not be able to safely move data files from host to host.
- MySQL also, for quite some time now, no longer updates its own charsets and collations internally, for the same reason.
  So utf8 in MySQL is utf8mb3, the three byte variant of Unicode UTF-8 implementation that covers only the BMP (unicode up to U+FFFF).
- When moving to fuller (multiplane) UTF-8 support, a new name was needed, and utf8mb4 was chosen.
  So when you actually want modern utf8 in MySQL, you have to use utf8mb4, and now you know why.
- utf8 is deprecated and will be upgraded to utf8mb4 in some future MySQL release. This will be a breaking upgrade, and I wonder if it will require dropping and recreating all indexes affected by the change.
  That will be painful.
- https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8mb3.html …
  utf8mb3 page in the MySQL 8.0 manual, with deprecation notice.
  What will change is the meaning of the alias utf8 (currently an alias for utf8mb3).
[Wayback] MySQL: Some Character Set Basics | Die wunderbare Welt von Isotopp
[Wayback] MySQL :: MySQL 8.0 Reference Manual :: 10.9.2 The utf8mb3 Character Set (3-Byte UTF-8 Unicode Encoding)

utf8 is an alias for utf8mb3; the character limit is implicit, rather than explicit in the name.

Note

The utf8mb3 character set is deprecated and you should expect it to be removed in a future MySQL release. Please use utf8mb4 instead. Although utf8 is currently an alias for utf8mb3, at some point utf8 is expected to become a reference to utf8mb4. To avoid ambiguity about the meaning of utf8, consider specifying utf8mb4 explicitly for character set references instead of utf8.
[Wayback] MySQL :: MySQL 8.0 Reference Manual :: 10.9.1 The utf8mb4 Character Set (4-Byte UTF-8 Unicode Encoding)
utf8mb4 contrasts with the utf8mb3 character set, which supports only BMP characters and uses a maximum of three bytes per character:
- For a BMP character, utf8mb4 and utf8mb3 have identical storage characteristics: same code values, same encoding, same length.
- For a supplementary character, utf8mb4 requires four bytes to store it, whereas utf8mb3 cannot store the character at all. When converting utf8mb3 columns to utf8mb4, you need not worry about converting supplementary characters because there are none.
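The BMP-only rule quoted above can be sketched in a few lines of Python. The helper names `fits_utf8mb3` and `supplementary_chars` are hypothetical; they only mimic the rule from the manual and do not talk to MySQL:

```python
def fits_utf8mb3(text: str) -> bool:
    """True when every code point is in the BMP (at most U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


def supplementary_chars(text: str) -> list:
    """Code points above U+FFFF, which need utf8mb4 (four bytes in UTF-8)."""
    return [f"U+{ord(ch):04X}" for ch in text if ord(ch) > 0xFFFF]


bmp_only = "Köhntopp €"              # BMP characters only
with_emoji = "remarks \U0001F602"    # contains one supplementary character

print(fits_utf8mb3(bmp_only))           # True
print(fits_utf8mb3(with_emoji))         # False
print(supplementary_chars(with_emoji))  # ['U+1F602']
print(len("€".encode("utf-8")))         # 3: a BMP character needs at most three bytes
```

A check like this, run before writing to a utf8 (utf8mb3) column, would flag the values that can trigger the "Incorrect string value" error shown earlier.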

–jeroen

Posted in Conference Topics, Conferences, Database Development, Delphi, Development, Encoding, Event, MySQL, Software Development, UTF-8, UTF8

Some links and graphs on ESXi capping/throttling disk speeds

Posted by jpluimers on 2022/01/06

As promised in “Solution” on ESXi 6.7 smartinfo throwing error Cannot open device, here are a few links on ESXi capping or throttling disk speeds, followed by a few graphs of my own:

GUI operations might be capped, try dd: [Wayback] datastore – How can I speed up file tranfers on local storage in vSphere 5? – Server Fault

I never got any official confirmation for this, but I believe the I/O is capped (or at least de-prioritized) for datastore copy/move operations from the GUI as I have seen rather similar behaviour in different ESXi environments on from version 3.5.
[Wayback] Solved: ESXi disk performance, possibly capped? – VMware Technology Network VMTN

Question:

When I copy a 3GB ISO file from the raid set to the sata disk, the average disk transfer is 50MB/sec. This puzzles me, because the SATA disk can do 100MB/sec easy. When I start a second copy, the average disk transfer doubles to 100MB/sec.

…

Answer:

File transfer speeds are capped in the console so you will not be able to get maximum speeds
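The numbers in the question fit a simple per-copy cap. The Python sketch below is only a model under that assumption: the function name `aggregate_speed` is hypothetical, the 50 MB/sec cap and 100 MB/sec disk limit come from the question above, and nothing here is confirmed VMware behaviour.

```python
def aggregate_speed(copies: int, per_copy_cap: float, device_max: float) -> float:
    """Total MB/sec when each copy is capped separately and the disk sets the ceiling."""
    return min(copies * per_copy_cap, device_max)


# Values from the question: 50 MB/sec per copy, a SATA disk good for about 100 MB/sec.
for copies in (1, 2, 3):
    print(copies, aggregate_speed(copies, per_copy_cap=50.0, device_max=100.0))
# 1 50.0
# 2 100.0
# 3 100.0
```

Under this model a second parallel copy doubles the total, as observed in the question, while a third would gain nothing because the disk itself becomes the limit.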

My own observations on ESXi 6.7 update 3:

One rsync operation:

[Graph: 1 rsync from 860 EVO SSD to 960 PRO NVMe]

Two rsync operations:

[Graph: 2 rsync from 860 EVO SSD to 960 PRO NVMe]

Resume actions were about 10 times faster than the single rsync read speeds:

[Graph: Resume action]

Suspend actions were between 4 and 6 times faster than rsync write speeds:

[Graph: Start of suspend action]

[Graph: Finish of suspend action]

Each rsync operation ran in its own SSH session; with two running in parallel, the total speed doubled.

Resuming all virtual machines gave an almost flat speed curve.

Suspending all virtual machines started fast (when all machines were suspending) and finished slower (when only the largest virtual machines were still suspending).

–jeroen

Posted in ESXi5, ESXi5.1, ESXi5.5, ESXi6, ESXi6.5, ESXi6.7, ESXi7, Power User, Virtualization, VMware, VMware ESXi

LIDL Radio Controlled Wall Clock IAN 100489 English manual

Posted by jpluimers on 2022/01/06

Model 100489-14-01 wall clock

Just in case I need it again.

The signal quality fluctuates during the day (it is a lot better at night, when there is less ionisation in the atmosphere) and is worsened by concrete walls (like the ones in our home).

The best way to get prolonged reception is at night, on the top floor behind a window, or outside.

The clock usually needs between 3 and 10 minutes to pick up the DCF77 signal from the transmitter.

Wall clock manual: [Wayback] 100489_EN.pdf, of which this is an excerpt:

DCF77 HD-1688 clock mechanism

Numbers:

M.SET button

Press and keep pressed the M.SET button 1 at least 3 seconds. The wall clock switches into manual mode.

Press and keep pressed the M.SET button again until the hands reach the correct position for you to set the time.

Briefly pressing the M.SET button moves the hands forward in one minute steps to enable you to set the current time manually.
Note: After 8 seconds without pressing the M.SET button, the wall clock switches out of manual mode and keeps the time as normal. The manually set value is overwritten as soon as reception of the DCF radio time signal is successful.

RESET button

Press the RESET button 2 to reset the radio clock settings. Alternatively, remove the batteries from the device and insert them again.

The product now automatically starts to search for the DCF radio time signal.

REC button

Press and keep pressed the REC button 3 at least 5 seconds. The wall clock attempts to receive the DCF radio time signal. This process takes a few minutes to complete.

Battery compartment

Battery type: 1 x 1.5 V ⎓ AA, LR6

More on the signal, transmitter and encoding: DCF77 – Wikipedia, where the below images are from:

DCF77 reception area from Mainflingen

DCF77 signal strength over a 24-hour period measured in Nerja, on the south coast of Spain 1,801 km (1,119 mi) from the transmitter. Around 1 AM it peaks at ≈ 100 µV/m signal strength. During the day, the signal is weakened by ionization of the ionosphere due to solar activity.

Another DCF77 clock I have: CSL Bearware 302658 DCF clock manual

–jeroen

Posted in Development, Encoding, Hardware Development, LifeHacker, Power User, Software Development


The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests
