The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Markdown has been the Internet’s lingua franca for documentation. Microsoft finally supports the documentation format with markitdown: Python tool for converting files and office documents to Markdown.

Posted by jpluimers on 2024/12/17

Finally, an easier way to convert Office documents (and other formats) to Markdown: [Wayback/Archive] GitHub – microsoft/markitdown: Python tool for converting files and office documents to Markdown. Google got there earlier: it added a Markdown export feature to Google Docs about half a year ago, and basic Markdown formatting about 2 years ago (see below).

There are quite a few dependencies in [Wayback/Archive] markitdown/pyproject.toml at main · microsoft/markitdown · GitHub, so be prepared for that.

Supported formats (added links for clarity):

The MarkItDown library is a utility tool for converting various files to Markdown (e.g., for indexing, text analysis, etc.)
It presently supports:
  • PDF (.pdf)
  • PowerPoint (.pptx)
  • Word (.docx)
  • Excel (.xlsx)
  • Images (EXIF metadata, and OCR)
  • Audio (EXIF metadata, and speech transcription)
  • HTML (special handling of Wikipedia, etc.)
  • Various other text-based formats (csv, json, xml, etc.)
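A minimal sketch of what using the library from Python could look like, assuming it is installed with `pip install markitdown` and that `MarkItDown().convert(...)` returns a result with a `text_content` attribute, as the project README showed at the time of writing. The helper names `is_listed_format` and `to_markdown` are my own, and the extension set only covers the formats the list above names with an explicit extension, so treat it as an illustration, not as the tool's real format detection.

```python
from pathlib import Path

# Extensions spelled out in the list above. Images, audio, HTML and the
# "etc." formats are left out because the list gives no extensions for
# them; this set is illustrative, not markitdown's own detection logic.
LISTED_EXTENSIONS = {".pdf", ".pptx", ".docx", ".xlsx", ".csv", ".json", ".xml"}


def is_listed_format(path: str) -> bool:
    """Return True when the file extension appears in the list above."""
    return Path(path).suffix.lower() in LISTED_EXTENSIONS


def to_markdown(path: str) -> str:
    """Convert one file to Markdown text using markitdown."""
    # Imported lazily so the helper above works without the package
    # (and its many dependencies) being installed.
    from markitdown import MarkItDown

    result = MarkItDown().convert(path)
    return result.text_content
```

With the package installed, something like `print(to_markdown("report.docx"))` should print the Markdown rendering of the document; `report.docx` is a placeholder file name.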

Google was first though:

  1. [Wayback/Archive] Google Workspace Updates: Compose with Markdown in Google Docs on web
  2. [Wayback/Archive] Google Workspace Updates: Import and export Markdown in Google Docs

There is speculation on why Microsoft introduced it just now, ranging from “they need it for AI training” to “just late to the game”. I’m with the latter. Apple is even later, so if you want to convert Apple Notes to Markdown, you can use [Wayback/Archive] Import from Apple Notes – Obsidian Help.

--jeroen
