The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

  • My badges

  • Twitter Updates

  • My Flickr Stream

  • Pages

  • All categories

  • Enter your email address to subscribe to this blog and receive notifications of new posts by email.

    Join 1,839 other subscribers

Archive for the ‘XML/XSD’ Category

Markdown has been the Internet’s lingua franca for documentation. Microsoft finally the documentation format with markitdown: Python tool for converting files and office documents to Markdown.

Posted by jpluimers on 2024/12/17

Finally an easier way to convert Office documents (and other formats) to markdown: [Wayback/Archive] GitHub – microsoft/markitdown: Python tool for converting files and office documents to Markdown. (after Google added a Markdown export feature to Google Docs about half a year ago, and basic Markdown formatting about 2 years ago – see below):

There are quite a few dependencies in [Wayback/Archive] markitdown/pyproject.toml at main · microsoft/markitdown · GitHub, so be prepared for that.

Supported formats (added links for clarity):

The MarkItDown library is a utility tool for converting various files to Markdown (e.g., for indexing, text analysis, etc.)
It presently supports:
  • PDF (.pdf)
  • PowerPoint (.pptx)
  • Word (.docx)
  • Excel (.xlsx)
  • Images (EXIF metadata, and OCR)
  • Audio (EXIF metadata, and speech transcription)
  • HTML (special handling of Wikipedia, etc.)
  • Various other text-based formats (csv, json, xml, etc.)

Google was first though:

  1. [Wayback/Archive] Google Workspace Updates: Compose with Markdown in Google Docs on web
  2. [Wayback/Archive] Google Workspace Updates: Import and export Markdown in Google Docs

There is speculation on why Microsoft introduced it just now ranging from “they need it for AI training” to “just late to the game”. I’m with the latter. Apple is even later, so if you want to convert Apple Notes to markdown, then you can use [Wayback/Archive] Import from Apple Notes – Obsidian Help.

Via various sources, including:

Read the rest of this entry »

Posted in CSV, Development, Excel, HTML, HTML5, JSON, Lightweight markup language, MarkDown, Office, PDF, Power Point, Power User, Software Development, Word, XML/XSD | Tagged: , | Leave a Comment »

ValueError: invalid literal for int() with base 10: ” by tzwickl · Pull Request #768 · sivel/speedtest-cli

Posted by jpluimers on 2024/11/22

Somehow this post missed the schedule and for a long time I forgot to properly checked for “missed schedule” posts.

Back in 2021, suddenly systems with speedtest-cli threw a [Wayback/Archive] ValueError: invalid literal for int() with base 10: ” by tzwickl · Pull Request #768 · sivel/speedtest-cli after accessing the speedtest.net servers.

Around 7-8 April, 2021 the speedtest.net/speedtest-config.php XML configuration suddenly had changed the value for the XPath expression /settings/server-config/@ignoreids from being a list of integers into empty, see the archived files below.

Read the rest of this entry »

Posted in Development, JavaScript/ECMAScript, Python, Scripting, Software Development, XML/XSD, XPath | Tagged: , | Leave a Comment »

XPath based bookmarklets for Archive.is: more JavaScript fiddling!

Posted by jpluimers on 2023/09/20

As I promised a few months back in Bookmarklets for Archive.is and the WayBack Machine to go to the original page, moar JavaScript fiddling, this time with XPath based bookmarklets to navigate from Archive.is pages to Saved From, Redirected from, Via and Original pages.

An alternative would be using XPath as the additional fields are always structured in a table like the html below (taking complex pages like https://archive.ph/5iVVH and https://archive.ph/2015.11.14-044109/http://www.example.org/ as an example).

I got triggered to using XPath from this answer from [Wayback/Archive] gdyrrahitis at [Wayback/Archive] Javascript .querySelector find by innerTEXT – Stack Overflow (thanks [Wayback/Archive] passwd for asking):

Read the rest of this entry »

Posted in Agile, Bookmarklet, Code Quality, Code Review, Development, HTML, JavaScript/ECMAScript, Power User, Scripting, Software Development, Web Browsers, Web Development, XML/XSD, XPath | Leave a Comment »

HTML / XML / RSS link checker – Visual Studio Marketplace

Posted by jpluimers on 2022/10/04

On my list of Visual Studio Code extensions to try (after I change the shortcuts, as direct Alt shortcuts are not a good idea, luckily those are configurable)

[Wayback/Archive.is] HTML / XML / RSS link checker – Visual Studio Marketplace (partly paraphrased):

VSCode extension that checks for broken links in an HTML, XML, RSS, PHP, or Markdown file.

Checks currently open file:

  • for broken links in anchor-href, link-href, img-src, and script-src tags in currently-open HTML or PHP file
  • both clearnet and onion (Tor) links
  • for badly-formatted mailto links, and duplicate local anchors (anchor-name, anchor-id)
  • for working HTTPS equivalents of HTTP links

Optionally checks for invalid characters and common mistakes (missing tag content, empty attribute value, more).

Also checks for errors in a small subset of semantic HTML tags (in HTML and PHP files): checks that each page has header, main, footer; checks that each heading is inside a section, article, or aside; checks that each section/article/aside has exactly one heading in it; checks that heading values are nested properly.

To see/change settings for this extension, open Settings (Ctrl+,) / Extensions / “HTML / XML / RSS link checker”.

To change the key-combinations for this extension, open File / Preferences / Keyboard Shortcuts and search for Alt+H or Alt+T or Alt+M or Alt+L.

–jeroen

Posted in .NET, Development, HTML, Lightweight markup language, MarkDown, Power User, RSS, Software Development, vscode Visual Studio Code, Web Development, XML, XML/XSD | Leave a Comment »

.NET: XML escaping a string

Posted by jpluimers on 2022/02/15

[Wayback] WILT: XML encode a string in .net « Benoit MARTIN’s Weblog:

Always wondered why I couldn’t find a method that would XML encode a string, effectively escaping the 5 illegal characters for XML. There is such a method but its location in the API is not intuitive at all. It’s in the System.Security namespace: [Wayback] SecurityElement.Escape(String) Method (System.Security) | Microsoft Docs

public static string? Escape (string? str);

Its usage is:

   tagText = System.Security.SecurityElement.Escape(tagText);

This will escape the 5 characters <, >, &, " and '

–jeroen

Posted in .NET, Development, Encoding, Software Development, XML, XML escapes, XML/XSD | Leave a Comment »

Digging Through Event Log Hell (finding user logon & logoff) – Ars Technica OpenForum

Posted by jpluimers on 2021/08/31

This helped me big time finding failed logon attempts: [WayBack] Event Log Hell (finding user logon & logoff) – Ars Technica OpenForum

Alternatively, you can use the XPath query mechanism included in the Windows 7 event viewer. In the event viewer, select “Filter Current Log…”, choose the XML tab, tick “Edit query manually”, then copy the following to the textbox:

Code:
<QueryList>
  <Query Id="0" Path="Security">
    <Select Path="Security">*[System[EventID=4624] and EventData[Data[@Name='TargetUserName'] = 'USERNAME']]</Select>
  </Query>
</QueryList>

This selects all events from the Security log with EventID 4624 where the EventData contains a Data node with a Name value of TargetUserName that is equal to USERNAME. Remember to replace USERNAME with the name of the user you’re looking for.

If you need to be even more specific, you can use additional XPath querying – have a look at the detail view of an event and select the XML view to see the data that you are querying into.

Thanks user Hamstro!

Notes:

Related:

–jeroen

Posted in Development, Microsoft Surface on Windows 7, Power User, Software Development, Windows, Windows 10, Windows 7, Windows 8, Windows 8.1, Windows 9, Windows Vista, Windows XP, XML/XSD | Leave a Comment »

Getting the path of an XML node in your code editor

Posted by jpluimers on 2021/05/27

A few links for my link archive, as I often edit XML files (usually with different extensions than .xml, because historic choices that software development vendors make, which makes it way harder to tell editors “yes, this too is XML).

–jeroen

Read the rest of this entry »

Posted in .NET, Development, Notepad++, Power User, Software Development, Text Editors, Visual Studio and tools, vscode Visual Studio Code, XML, XML/XSD | Leave a Comment »

Chocolatey: installing Oracle SQL Developer and updating the chocolatey package

Posted by jpluimers on 2021/05/13

Sometimes an install is not just as simple as C:\>choco install --yes oracle-sql-developer.

Edit 20210514:

Note that most of the below pain will be moot in the future as per [Archive.is] Jeff Smith 🍻 on Twitter: “we’re working on removing the SSO requirement, it’s already done for @oraclesqlcl – see here … “ referring to [Wayback] SQLcl now under the Oracle Free Use Terms and Conditions license | Oracle Database Insider Blog

SQLcl, the modern command-line interface for the Oracle Database, can now be downloaded directly from the web without any click-through license agreement.

It means the Oracle acount restriction will be lifted, and downloads will be a lot simpler.

I started with the below failing command, tried a lot of things, then finally almost gave up: Oracle stuff does not want to be automated, which means I should try to less of their stuff.

First of all you need an Oracle account (I dislike companies doing that for free product installs; I’m looking at Embarcadero too) by going to profile.oracle.com:

[WayBack] Chocolatey Gallery | Oracle SQL Developer 18.4.0 (also: gist.github.com/search?l=XML&q=oracle-sql-developer)

Notes

  • This version supports both 32bit and 64bit and subsequently does not have a JDK bundled with it. It has a
    dependency on the jdk8 package to meet the application’s JDK requirement.
  • An Oracle account is required to download this package. See the “Package Parameters” section below for
    details on how to provide your Oracle credentials to the installer. If you don’t have an existing account, you can
    create one for free here: https://profile.oracle.com/myprofile/account/create-account.jspx

Package Parameters

The following package parameters are required:

/Username: – Oracle username
/Password: – Oracle password

(e.g. choco install oracle-sql-developer --params "'/Username:MyUsername /Password:MyPassword'")

To have choco remember parameters on upgrade, be sure to set choco feature enable -n=useRememberedArgumentsForUpgrades.

Then the installation failed fail again: ERROR: The response content cannot be parsed because the Internet Explorer engine is not available, or Internet Explorer's first-launch configuration is not complete. Specify the UseBasicParsing parameter and try again.

The trick is to RUN IEXPLORE.EXE AS ADMINISTRATOR ONCE BEFORE INSTALLING FROM CHOCOLATEY. Who would believe that.

The reason is that the package uses Invoke-WebRequest which requires Internet Explorer and PowerShell 3. Chocolatey packages however need to be able to run on just PowerShell 2 without Invoke-WebRequest.

Maybe using cURL can remedy that; adding a dependency to is is possible, as cURL can be installed via chocolatey: [WayBack] How to Install cURL on Windows – I Don’t Know, Read The Manual. Another alternative might be [WayBack] Replace Invoke-RestMethod in PowerShell 2.0 to use [WayBack] WebRequest Class (System.Net) | Microsoft Docs.

Read the rest of this entry »

Posted in CertUtil, Chocolatey, CommandLine, Database Development, Development, DVCS - Distributed Version Control, git, Hashing, OracleDB, Power User, PowerShell, Security, SHA, SHA-1, Software Development, Source Code Management, Windows, XML, XML/XSD | Leave a Comment »

Default XML encoding is UTF-8 (or better: utf-8). If it contains other byte sequences, this is an error.

Posted by jpluimers on 2021/01/21

I should have had the below answer when writing about StUF – receiving data from a provider where UTF-8 is in fact ISO-8859.

A while ago, a co-worker did not believe when I told that default XML encoding really is UTF-8 (and tried to force it to utf-8), and that if the content had byte sequences different from the (either specified or default) encoding, it was a problem.

I though I blogged about the default, and where to find it, but apparently, I did not.

My blog had (and has <g>) a truckload of articles mentioning UTF-8, less articles containing UTF-8, encoding and xml, but the ones having UTF-8, default, encoding and xml did not actually tell about a standard that really defines XML uses UTF-8 as default encoding when there is no other encoding information – like BOM (byte order mark), HTTP, or MIME encoding) available.

W3C indeed specifies it. [WayBack] utf 8 – How default is the default encoding (UTF-8) in the XML Declaration? – Stack Overflow has a summary (thanks James Holderness!):

The Short Answer

Under the very specific circumstances of a UTF-8 encoded document with no external encoding information (which I understand from the comments is what you’re interested in), there is no difference between the two declarations.

The long answer is far more interesting though.

and an elaboration:

Read the rest of this entry »

Posted in Development, Encoding, Software Development, UTF-8, UTF8, XML, XML/XSD | Leave a Comment »

WSA: Web-Service Addressing

Posted by jpluimers on 2020/12/08

I don’t do SOAP that often any more, so here some links on it and some notes on how one site used some of the fields:

A few observations from real life:

  • Inside the WS-Addressing realm:
    • Action has a URI indicating what to execute inside the service
    • From is basically abused because it
      1. is not used as a source endpoint but
      2. has an Address element that contains both Action and authentication
    • MessageID uses a uuid: based URI
  • Outside the WS-Addressing realm, in the main SOAP body:
    • Since is implemented using a non ISO-8601 compliant timestamp: it barfs on the second fraction and on the time zone (neither Z nor an offset based time-zone are accepted).

I did not know you could have uuid based URIs, as they are not mentioned here:

But apparently they have been in use for quite a while:

–jeroen

Posted in Development, SOAP/WebServices, Software Development, XML, XML/XSD | Leave a Comment »