The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

  • My badges

  • Twitter Updates

  • My Flickr Stream

  • Pages

  • All categories

  • Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 2,352 other followers

Archive for the ‘RegEx’ Category

VMware ESXi console: viewing all VMs, suspending and waking them up: part 1

Posted by jpluimers on 2021/04/22

I think the easiest way to list all VMs is the vim-cmd vmsvc/getallvms command, but it has a big downside: the downside is a mess.

The reason is that the output has a lot of columns (Vmid, Name, Datastore, File, Guest OS, Version, Annotation), more than 500 characters per line (eat that 1080p monitor!), and potentially more than one line per VM as the Annotation is a free-text field that can have newlines.

Example output on one of my machines:

Vmid Name File Guest OS Version Annotation
10 X9SRI-3F-W10P-EN-MEDIA [EVO860_500GB] VM/X9SRI-3F-W10P-EN-MEDIA/X9SRI-3F-W10P-EN-MEDIA.vmx windows9_64Guest vmx-14
5 PPB Local_Virtual Machine_v4.0 [EVO860_500GB] VM/PPB-Local_Virtual-Machine_v4.0/PPB Local_Virtual Machine_v4.0.vmx centos64Guest vmx-11 PowerPanel Business software(Local) provides the service which communicates
with the UPS through USB or Serial cable and relays the UPS state to each Remote on other computers
via a network.
It also monitors and logs the UPS status. The computer which has been installed the Local provides
graceful,
unattended shutdown in the event of the power outage to protect the hosted computer.

As an alternative, you could use esxcli vm process list, but that gives IDs that are way harder to remember:

PPB Local_Virtual Machine_v4.0
World ID: 2099719
Process ID: 0
VMX Cartel ID: 2099713
UUID: 56 4d 74 f8 c8 22 41 27-a3 88 49 df 8b dc d6 63
Display Name: PPB Local_Virtual Machine_v4.0
Config File: /vmfs/volumes/5d35e7d8-e8df636f-46b9-0025907d9d5c/VM/PPB-Local_Virtual-Machine_v4.0/PPB Local_Virtual Machine_v4.0.vmx
X9SRI-3F-W10P-EN-MEDIA
World ID: 2099728
Process ID: 0
VMX Cartel ID: 2099717
UUID: 56 4d 51 ac f6 cf e4 0b-b6 86 2f 53 a2 8a 4b ea
Display Name: X9SRI-3F-W10P-EN-MEDIA
Config File: /vmfs/volumes/5d35e7d8-e8df636f-46b9-0025907d9d5c/VM/X9SRI-3F-W10P-EN-MEDIA/X9SRI-3F-W10P-EN-MEDIA.vmx

I got both of the above commands from [Wayback] VMware Knowledge Base: Performing common virtual machine-related tasks with command-line utilities (2012964).

Back to the columns that vim-cmd vmsvc/getallvms returns:

  • Vmid is an unsigned integer
  • Name can have spaces
  • Datastore has square brackets [ and ] around it
  • File can contain spaces
  • Guest OS is an identifier without spaces (it is a value from [Wayback] the vSphere API VcVirtualMachineGuestOsIdentifier
  • Version looks like vmx-# where # is an unsigned integer
  • Annotation is multi-line free-form so potentially can have lines starting like being Vmid, but the chance that a line looks exactly like a non-annotated one is very low

So let’s find a grep or  sed filter to get just the lines without annotation continuations. Though in general I try to avoid regular expressions as they are hard to both write and read, but with Busybox there is no much choice.

I choose sed, just in case I wanted to do some manipulation in addition to matching.

Busybox sed

Though the source code [Wayback] sed.c\editors – busybox – BusyBox: The Swiss Army Knife of Embedded Linux indicates sed.c - very minimalist version of sed, the implementation actually is reasonably feature rich, just not feature complete. That’s OK given the aim of Busybox to be small.

Luckily, deep in the busybox sed code, it indicates that extended regular expressions are supported (support is in [Wayback] /uClibc/plain/libc/misc/regex/regcomp.c (look for regcomp, do not get confused by xregcomp on call sites as that is [Wayback] just a tiny wrapper to call regcomp).

The support has become better over time, like [Wayback] gnu – sed Command on BusyBox expects different syntax? – Super User shows.

This means far less escaping than basic regular expressions, capture groups are supported as well as character classes (so [[:digit:]] is more readable than [0-9]), and the + is supported to match once or more (so [0-9]+ means one or more digits, as does [[:digit:]]+, but [d]+ or \d+ don’t ). Unfortunately named capture groups are not supported (so documenting parts of the regular expression like (?<Vmid>^[[:digit:]]+) is not possible, it will give you an error [Wayback] Invalid preceding regular expression).

But first a few of the sed commandline options and their order:

vim-cmd vmsvc/getallvms | sed -n -E -e '/(^[[:digit:]]+)/p'
  1. -n outputs only matching lines that have a p print command.
  2. -E allows extended regular expressions (you can also use -r for that)
  3. -e adds a (in this case extended) regular expression
  4. '/(^[[:digit:]]+)/p' is the extended regular expression embedded in quotes
    1. / at the start indicates that sed should match the regular expression on each line it parses
    2. /p at the end indicates the matching line should be printed
    3. Parentheses ( and ) surround a capture group
    4. ^[[:digit:]]+ matches 1 or more digits at the start of the line

The grep command is indeed much shorter, but does not allow post-editing:

vim-cmd vmsvc/getallvms | grep -E '(^[[:digit:]]+)'

Building a sed filter

I came up with the below sed regular expression to filter out lines:

  1. starting with a Vmid unsigned integer
  2. having a [Datastore] before the File
  3. have a Guest OS identifier after File
  4. have a Version matching vmx-# after File where # is an unsigned integer
  5. optionally has an Annotation after Version
vim-cmd vmsvc/getallvms | sed -n -E -e  "/^([[:digit:]]+)(\s+)((\S.+\S)?)(\s+)(\[\S+\])(\s+)(.+\.vmx)(\s+)(\S+)(\s+)(vmx-[[:digit:]]
+)(\s*?)((\S.+)?)$/p"

A longer expression that I used to fiddle around with is at regex101.com/r/A7MfKu and contains named capture groups. I had to nest a few groups and use the ? non-greedy (or lazy) operator a few times to ensure the fields would not include the spaces between the columns.

Others use different expressions as for instance explained in [Wayback] Get all VMs with “vmware-vim-cmd vmsvc/getallvms” – VMware Technology Network VMTN:

Output from “vim-cmd vmsvc/getallvms” is really challenging to process. Our normal approaches such as awk column indexes, character index, and regular expression are all error prone here. The character index of each column varies depending on maximum field length of, for example, VM name. And the presence of spaces in VM names throws off processing as awk columns. And VM name could contain almost any character, foiling regex’s.

Printing capture groups

The cool thing is that it is straightforward to modify the expression to print any of the capture groups in the order you wish: you convert the match expression (/match/p) into a replacement expression (s/match/replace/p) and print the required capture groups in the replace part. A short example is at [Wayback] regex – How to output only captured groups with sed? – Stack Overflow.

There is one gotcha though: Busybox sed only allows single-digit capture group numbers, and we have far more than 9 capture groups. This fails and prints 0 after the output of capture group 1 instead of printing capture group 10, similar for 2 after group 1 instead of printing group 12:

vim-cmd vmsvc/getallvms | sed -n -E -e  "s/^([[:digit:]]+)(\s+)((\S.+\S)?)(\s+)(\[\S+\])(\s+)(.+\.vmx)(\s+)(\S+)(\s+)(vmx-[[:digit:]]+)(\s*?)((\S.+)?)$/Vmid:\1 Guest:\10 Version:\12 Name:\3 Datastore:\7 File:\8/p"

So we need to cut down on capture groups first by removing all capture groups around the \s white-space matching:

vim-cmd vmsvc/getallvms | sed -n -E -e  "/^([[:digit:]]+)\s+((\S.+\S)?)\s+(\[\S+\])\s+(.+\.vmx)\s+(\S+)\s+(vmx-[[:digit:]]+)\s*?((\S.+)?)$/p"

Then we get this to print some of the capture groups:

vim-cmd vmsvc/getallvms | sed -n -E -e "s/^([[:digit:]]+)\s+((\S.+\S)?)\s+(\[\S+\])\s+(.+\.vmx)\s+(\S+)\s+(vmx-[[:digit:]]+)\s*?((\S.+)?)$/Vmid:\1 Guest:\6 Version:\7 Name:\3 Datastore:\4 File:\5 Annotation:\8/p"

With this output:

Vmid:10 Guest:windows9_64Guest Version:vmx-14 Name:X9SRI-3F-W10P-EN-MEDIA Datastore:[EVO860_500GB] File:VM/X9SRI-3F-W10P-EN-MEDIA/X9SRI-3F-W10P-EN-MEDIA.vmx Annotation:
Vmid:5 Guest:centos64Guest Version:vmx-11 Name:PPB Local_Virtual Machine_v4.0 Datastore:[EVO860_500GB] File:VM/PPB-Local_Virtual-Machine_v4.0/PPB Local_Virtual Machine_v4.0.vmx Annotation:PowerPanel Business software(Local) provides the service which communicates

Figuring out power state for each VM

This will be in the next installment, as by now this already has become a big blog-post (:

–jeroen

Posted in *nix, *nix-tools, ash/dash, ash/dash development, Development, ESXi6, ESXi6.5, ESXi6.7, ESXi7, Power User, RegEx, Scripting, Software Development, Virtualization, VMware, VMware ESXi | Leave a Comment »

Delphi TRegExOption: Where is description of roNotEmpty option? What does this option do? – Jacek Laskowski – Google+

Posted by jpluimers on 2020/12/10

I really dislike using regular expressions, mainly because every time I bump into code using them either:

  • I cannot decipher them any more
  • It is used for things not suites for (like parsing JSON or XML: please don’t!)

For more background on when NOT to use regular expressions, remember they describe a regular grammar, and can only me implemented by a finite state machine (a state machine that can be exactly one state out of a set of finite states).

As soon as you need to parse something that needs multiple states at once, or the number of states becomes infinite,

Some background reading:

Read the rest of this entry »

Posted in Delphi, Development, RegEx, Software Development | Leave a Comment »

shell – How do I grep for multiple patterns with pattern having a pipe character? – Unix & Linux Stack Exchange

Posted by jpluimers on 2020/10/27

Since I keep forgetting this – especially because I cannot remember the “why”: [WayBack] shell – How do I grep for multiple patterns with pattern having a pipe character? – Unix & Linux Stack Exchange by “user unknown“.

The -E means using Regular expression: POSIX extended – Wikipedia.

egrep "foo|bar" *.txt

or

grep "foo\|bar" *.txt
grep -E "foo|bar" *.txt

selectively citing the man page of gnu-grep:

   -E, --extended-regexp
          Interpret PATTERN as an extended regular expression (ERE, see below).  (-E is specified by POSIX.)

Matching Control
   -e PATTERN, --regexp=PATTERN
          Use PATTERN as the pattern.  This can be used to specify multiple search patterns, or to protect  a  pattern
          beginning with a hyphen (-).  (-e is specified by POSIX.)

(…)

   grep understands two different versions of regular expression syntax: basic and extended.”  In  GNU grep,  there
   is  no  difference  in  available  functionality  using  either  syntax.   In  other implementations, basic regular
   expressions are less powerful.  The following description applies to extended regular expressions; differences  for
   basic regular expressions are summarized afterwards.

In the beginning I didn’t read further, so I didn’t recognize the subtle differences:

Basic vs Extended Regular Expressions
   In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead  use  the
   backslashed versions \?, \+, \{, \|, \(, and \).

I always used egrep and needlessly parens, because I learned from examples. Now I learned something new. :)

–jeroen

Posted in Development, RegEx, Software Development | Leave a Comment »

sed double expression: match, replace in one line, overwrite file

Posted by jpluimers on 2020/04/15

A while ago, I needed to conditionally replace in files, so I used sed and a regular expression, though usually I dislike those.

However, since the system had a very basic install, there was not much choice.

Luckily back then, my Google foo returned these:

This allowed me to do a double expression (the first matches a pattern, the second performs the actual replacement within the matching lines).

In case my Google foo in the future fails:

## https://robots.thoughtbot.com/sed-102-replace-in-place
## -i causes no backup to be saved, but does in-place replacement
## since we run under git, we can always restore
## combined with a double expression (the first matches, the second executes) this is very powerful
sed -i -e '/#.*AVOID_DAILY_AUTOCOMMITS=.*$/s/^.//' /etc/etckeeper/etckeeper.conf && git diff | more

–jeroen

Posted in Development, RegEx, Software Development | Leave a Comment »

GExperts: searching for case-insensitive “T*List.Create” but not “TStringList.Create”

Posted by jpluimers on 2019/05/02

Just learned that partial exclusion can be done with the case-insensitive GExperts Grep Search like this:

T[^s][^t][^r][^i][^n][^g].*List.*\.Create

This will skip TStringList.Create, but matches TMyList.Create.

I’d rather have done something like this, but the Delphi RegEx does not support negative lookbehind:

^ *[a-zA-Z0-9_]* *: *T(<!string)[a-zA-Z0-9_]*ListO? *;$

So the alternative is to search for this:

^ *[a-zA-Z0-9_]* *: *T[a-zA-Z0-9_]*ListO? *;$

then exclude all the case insensitive TStringList entries from it, however GExperts did not support that at the time of writing.

This is an intermediate that works for some of the times:

^ *[a-zA-Z0-9_]* *: *T[^s][^t][^r][^i][^n][^g][a-zA-Z0-9_]*ListO? *;$

–jeroen

^ *[a-zA-Z0-9_]* *: *T[^s][^t][^r][^i][^n][^g][a-zA-Z0-9_]*ListO? *$;
^ *[a-zA-Z0-9_]*: .T[a-zA-Z0-9_]*ListO? *;$;
^ *[a-zA-Z0-9_]* *: *T(?!string)[a-zA-Z0-9_]*ListO? *;$

Posted in Delphi, Development, GExperts, RegEx, Software Development | Leave a Comment »

 
%d bloggers like this: