Coping with UTF-16 / UCS-2 little endian in Batch files: numbers from WMIC
Posted by jpluimers on 2016/11/22
A while ago, I needed to get the various date, time and week values from WMIC into zero-padded environment variables. I thought: easy job, just write a batch file.
Tough luck: I couldn’t get the values to expand properly. In the end, the cause was that WMIC emits UTF-16, which the command interpreter does not expect, and that broke my original batch file.
|What I wanted||What I got|
As Windows uses little-endian encoding by default, each ASCII character in UTF-16 is stored with its low byte first, followed by a high byte that is zero. Those extra bytes mess up the command interpreter.
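To see the symptom for yourself, a minimal sketch like the one below prints a single WMIC value between brackets. It assumes `wmic` is still available on your Windows version and reads only the `Day` property. If the value carries stray characters, the closing bracket will not appear directly after the number.

```batch
@echo off
rem Sketch: show what FOR /F actually receives from WMIC.
rem Assumption: wmic.exe is present on this system.
for /f "tokens=1,* delims==" %%A in ('wmic path win32_localtime get Day /format:list ^| findstr "="') do (
    echo Name=[%%A]
    echo Value=[%%B]
)
```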
The solution by rojo is centered around set /A, which:
- handles integer numbers and calls them “numeric” (hinting at floating point, but those are truncated to integer; one of the tricks rojo uses)
- and (be careful with this, as 08 and 09 are not valid octal numbers) uses these prefixes:
  - 0 for octal
  - 0x for hexadecimal
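The prefixes in the list above can be tried with a small sketch like this one. The variable names `hex`, `oct` and `bad` are only illustrative. The last part relies on `set /a` leaving the variable undefined when it rejects the number.

```batch
@echo off
setlocal

rem 0x prefix: hexadecimal, so 0x12 is 18.
set /a hex=0x12
rem Leading 0: octal, so 022 is also 18.
set /a oct=022
echo hex=%hex% oct=%oct%

rem 08 is not a valid octal number: set /a reports an error and
rem leaves the variable undefined. The error text is sent to NUL.
set "bad="
2>nul set /a bad=08
if not defined bad echo 08 was rejected

endlocal
```

This is why zero-padded values such as 08 and 09 should be treated as text once they have been padded, and not fed back into set /A.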
Enjoy and shiver with the online help extract:
    SET /A expression
    SET /P variable=[promptString]

    The /A switch specifies that the string to the right of the equal sign
    is a numerical expression that is evaluated. The expression evaluator
    is pretty simple and supports the following operations, in decreasing
    order of precedence:

    ...

    If you use any of the logical or modulus operators, you will need to
    enclose the expression string in quotes. Any non-numeric strings in the
    expression are treated as environment variable names whose values are
    converted to numbers before using them. If an environment variable name
    is specified but is not defined in the current environment, then a value
    of zero is used. This allows you to do arithmetic with environment
    variable values without having to type all those % signs to get their
    values.

    ...

    Numeric values are decimal numbers, unless prefixed by 0x for hexadecimal
    numbers, and 0 for octal numbers. So 0x12 is the same as 18 is the same
    as 022. Please note that the octal notation can be confusing: 08 and 09
    are not valid numbers because 8 and 9 are not valid octal digits.
Anyway: here is the answer with the batch file, where you can remove “@echo off” to see the spurious new-lines that UCS-2 adds:
Also, I think part of the problem is that the values returned from WMI queries are encoded in UCS-2 Little Endian, which does weird things to an ANSI runtime. I found a way to get around that using set /a, appending .0 to each value (which is immediately dropped, since set /a only computes integers), and black holing error messages.
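    @echo off
    setlocal enabledelayedexpansion

    rem Read every NAME=VALUE line from WMIC. Appending .0 lets set /a keep
    rem only the integer part of the value; the error message goes to NUL.
    for /f "delims=" %%I in ('wmic path win32_localtime get * /format:list ^| findstr "="') do (
        2>NUL set /a "wmic_%%I.0"
    )

    rem Prefix single-digit values with a zero.
    for /f "tokens=1,2 delims==" %%I in ('set wmic_') do (
        if %%J leq 9 set "%%I=0%%J"
    )

    rem Show the results, then discard the local variables.
    set wmic_
    endlocal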
It’s ugly, but it seems to produce the output you want. The end justifies the means, I guess. :)
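The padding step can also be tried in isolation, without WMIC. The sketch below uses two hypothetical sample values in place of real WMIC output and assumes no other variables starting with wmic_ exist in the environment.

```batch
@echo off
setlocal

rem Hypothetical sample values standing in for WMIC output.
set "wmic_Day=7"
set "wmic_Month=11"

rem Same padding loop as in the answer: prefix values up to 9 with a zero.
for /f "tokens=1,2 delims==" %%I in ('set wmic_') do (
    if %%J leq 9 set "%%I=0%%J"
)

rem Lists wmic_Day=07 and wmic_Month=11.
set wmic_

endlocal
```

Note that the padded variables only live until endlocal, so anything that needs them (for instance building a file name) has to happen before that line.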