Archive for the ‘CSV’ Category
Posted by jpluimers on 2025/11/04
While blogging, online tools often beat offline or command-line tools, so here they are:
They use JavaScript and do client-side conversion.
There are way more conversion targets (Delimited, Flat File, GeoJSON, HTML Table, JSON, KML, Markdown, Multi-line Data, PDF, SQL, Word, XML, YAML) and operations (Pivot, Transpose, Query with SQL), but the above are what I use most.
–jeroen
Posted in Blogging, CSV, Development, Power User, SocialMedia, Software Development
Posted by jpluimers on 2025/04/24
[Wayback/Archive] One-liner for running queries against CSV files with SQLite | Simon Willison’s TILs
I figured out how to run a SQL query directly against a CSV file using the sqlite3 command-line utility:
sqlite3 :memory: -cmd '.mode csv' -cmd '.import taxi.csv taxi' \
'SELECT passenger_count, COUNT(*), AVG(total_amount) FROM taxi GROUP BY passenger_count'
This uses the special :memory: filename to open an in-memory database. Then it uses two -cmd options to turn on CSV mode and import the taxi.csv file into a table called taxi. Then it runs the SQL query.
A shorter variant passes -csv directly to .import, so only one -cmd option is needed:
sqlite3 :memory: -cmd '.import -csv taxi.csv taxi' \
'SELECT passenger_count, COUNT(*), AVG(total_amount) FROM taxi GROUP BY passenger_count'
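For use from a script rather than the shell, the same idea can be sketched with Python's standard sqlite3 and csv modules. This is a minimal sketch and not part of the original TIL: the helper name query_csv is hypothetical, the sample rows are made up, and every column is stored as TEXT with names taken from the first CSV row.

```python
import csv
import io
import sqlite3


def query_csv(csv_file, table, sql):
    """Load CSV rows into an in-memory SQLite table and run one query.

    Hypothetical helper: the first CSV row supplies the column names,
    and all values are stored as TEXT.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    # Quote identifiers so names with spaces work; double embedded quotes.
    columns = ", ".join(
        '"{}" TEXT'.format(name.replace('"', '""')) for name in header
    )
    placeholders = ", ".join("?" for _ in header)
    quoted_table = '"{}"'.format(table.replace('"', '""'))

    connection = sqlite3.connect(":memory:")
    try:
        connection.execute("CREATE TABLE {} ({})".format(quoted_table, columns))
        connection.executemany(
            "INSERT INTO {} VALUES ({})".format(quoted_table, placeholders),
            reader,
        )
        return connection.execute(sql).fetchall()
    finally:
        connection.close()


# Made-up sample data standing in for taxi.csv.
sample = io.StringIO(
    "passenger_count,total_amount\n"
    "1,10.0\n"
    "1,20.0\n"
    "2,7.5\n"
)
rows = query_csv(
    sample,
    "taxi",
    "SELECT passenger_count, COUNT(*), AVG(total_amount) "
    "FROM taxi GROUP BY passenger_count ORDER BY passenger_count",
)
print(rows)
```

Passing an open file object (for example open("taxi.csv", newline="")) instead of the StringIO sample should work the same way, as long as every row has as many fields as the header.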
Via [Wayback/Archive] Simon Willison on Twitter: “TIL you can run SQL queries directly against CSV files as a one-liner using the default sqlite3 command line utility”
Posted in CSV, Database Development, Development, Software Development, SQLite
Posted by jpluimers on 2024/12/17
Finally, an easier way to convert Office documents (and other formats) to Markdown: [Wayback/Archive] GitHub – microsoft/markitdown: Python tool for converting files and office documents to Markdown. It arrives after Google added a Markdown export feature to Google Docs about half a year ago, and basic Markdown formatting about 2 years ago (see below).
There are quite a few dependencies in [Wayback/Archive] markitdown/pyproject.toml at main · microsoft/markitdown · GitHub, so be prepared for that.
Supported formats (added links for clarity):
The MarkItDown library is a utility tool for converting various files to Markdown (e.g., for indexing, text analysis, etc.)
It presently supports:
- PDF (.pdf)
- PowerPoint (.pptx)
- Word (.docx)
- Excel (.xlsx)
- Images (EXIF metadata, and OCR)
- Audio (EXIF metadata, and speech transcription)
- HTML (special handling of Wikipedia, etc.)
- Various other text-based formats (csv, json, xml, etc.)
Google was first though:
- [Wayback/Archive] Google Workspace Updates: Compose with Markdown in Google Docs on web
- [Wayback/Archive] Google Workspace Updates: Import and export Markdown in Google Docs
There is speculation on why Microsoft introduced it just now, ranging from “they need it for AI training” to “just late to the game”. I’m with the latter. Apple is even later, so if you want to convert Apple Notes to Markdown, you can use [Wayback/Archive] Import from Apple Notes – Obsidian Help.
Via various sources, including:
Posted in CSV, Development, Excel, HTML, HTML5, JSON, Lightweight markup language, MarkDown, Office, PDF, Power Point, Power User, Software Development, Word, XML/XSD | Tagged: MarkDown, Python
Posted by jpluimers on 2023/10/23
Remember [Wayback/Archive] Guidelines for human gene nomenclature | Nature Genetics?
You might not, but this was what pointed me to it back in 2020: [Wayback/Archive] Scientists rename human genes to stop Microsoft Excel from misreading them as dates – The Verge.
The article was a result of Excel mangling imported data for decades. Somehow it finally got Microsoft’s attention, and more than 3 years later they added options (with mangling still being the default) to help work around the problems.
The 2004 article [Wayback/Archive] Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics | BMC Bioinformatics | Full Text demonstrated this import problem, which had been present for quite a while already (it even has a csh script to scan for SymbolMutation errors).
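A rough Python take on such a scan might look like the sketch below. It is not the csh script from the article: the function name, the regular expression, and the sample rows are assumptions for illustration. It flags cells that look like Excel's day-month rendering of a date, which is what a gene symbol such as SEPT2 typically turns into (2-Sep).

```python
import csv
import io
import re

# Matches values such as "2-Sep" or "1-Mar": the form Excel typically shows
# after it has converted a gene symbol like SEPT2 or MARCH1 into a date.
DATE_LIKE = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)


def find_date_like_cells(csv_file):
    """Return (row_number, column_number, value) for each suspicious cell.

    Row and column numbers are 1-based, counting the header as row 1.
    """
    hits = []
    for row_number, row in enumerate(csv.reader(csv_file), start=1):
        for column_number, value in enumerate(row, start=1):
            if DATE_LIKE.match(value.strip()):
                hits.append((row_number, column_number, value))
    return hits


# Made-up sample: two gene symbols already mangled, one intact.
sample = io.StringIO(
    "gene,score\n"
    "2-Sep,0.91\n"
    "TP53,0.42\n"
    "1-Mar,0.10\n"
)
suspicious = find_date_like_cells(sample)
print(suspicious)
```

A scan like this only finds damage after the fact; it cannot tell which original symbol a mangled value came from.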
The gene nomenclature people by now have moved to a different naming scheme, but maybe other people can benefit from the Excel updates of which you can find more through these links:
Posted in CSV, Development, Excel, Office, Power User, Software Development
Posted by jpluimers on 2019/01/03
TL;DR: use Markdown Tables generator – TablesGenerator.com as it has the most features.
A few tools that help convert CSV (with separators like comma, semicolon and tab) to Markdown online:
- [Archive.cs] Markdown Table Maker
  - Supports:
    - Use first line as headers
    - Auto detection of separator
    - Tab separated
    - Comma separated
    - Semicolon separated
  - Does not support:
- [WayBack] CSV to Markdown Table Generator — Donat Studios
  - Supports:
    - Use first line as headers
    - Tab separated
    - Comma separated
    - Semicolon separated
  - Does not support:
- [WayBack] Markdown Tables generator – TablesGenerator.com
  - Supports auto detection of:
    - Use first line as headers
    - Tab separated
    - Comma separated
    - Semicolon separated
    - Quote characters
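For an offline fallback, the core of what these tools do can be sketched in Python. This is an illustration only and not how any of the listed sites work: the function names are hypothetical, and the separator detection is deliberately naive (it picks whichever of comma, semicolon or tab occurs most often in the first line, defaulting to comma on a tie).

```python
import csv
import io


def detect_separator(first_line, candidates=(",", ";", "\t")):
    """Naive auto-detection: the candidate occurring most in the header line."""
    return max(candidates, key=first_line.count)


def csv_to_markdown(text):
    """Convert CSV text to a Markdown table, using the first row as headers."""
    lines = text.splitlines()
    if not lines:
        return ""
    separator = detect_separator(lines[0])
    # csv.reader handles double-quoted fields that contain the separator.
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))

    def format_row(cells):
        # Escape pipes so cell content cannot break the table layout.
        escaped = (cell.replace("|", "\\|") for cell in cells)
        return "| " + " | ".join(escaped) + " |"

    header, body = rows[0], rows[1:]
    output = [format_row(header), format_row(["---"] * len(header))]
    output.extend(format_row(row) for row in body)
    return "\n".join(output)


table = csv_to_markdown('name;note\nAda;"likes; semicolons"\nBob;plain\n')
print(table)
```

The quoted field keeps its embedded semicolon because parsing is left to the csv module rather than a plain split on the separator.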
–jeroen
Posted in CSV, Development, Lightweight markup language, MarkDown, Power User, Software Development
Posted by jpluimers on 2015/01/27
A few libraries for writing and/or reading CSV files in .NET:
Most of the above links come from these SO questions:
Together with the links from my previous CSV post, “If you think CSV is easy; think again”, that should get everyone going.
–jeroen
Posted in .NET, .NET 2.0, .NET 3.0, .NET 3.5, .NET 4.0, .NET 4.5, C#, C# 2.0, C# 3.0, C# 4.0, C# 5.0, CSV, Development, Software Development
Posted by jpluimers on 2014/10/16
A big part of the cloud is not about storage; it is about on-line tools that run in your web browser so you do not have to install them locally.
Quite a bit of my XML work can be done with on-line tools like these:
–jeroen
Posted in " quot, & amp, > gt, < lt, ' apos, CSV, Development, nbsp, Software Development, XML, XML escapes, XML/XSD, XPath, XSD, XSLT
Posted by jpluimers on 2013/12/04
From my link archive:
Note that for importing decimal/numeric columns, you have two options:
- Cast through FLOAT using a FORMAT file
- Use OpenRowSet with VARCHAR, then CAST afterwards
Weird rounding for decimal while doing a bulk insert from a CSV.
Some more links on this:
–jeroen
Posted in Algorithms, CSV, Database Development, Development, Floating point handling, Software Development, SQL Server, SQL Server 2005, SQL Server 2008, SQL Server 2008 R2, SQL Server 2012
Posted by jpluimers on 2013/11/26
For some remote monitoring, I needed to get information on UNC paths.
Though suggested, you cannot do this using the System.IO.DriveInfo class (not through the constructor, nor through the VB.NET FileSystem way) as that is about drives, not UNC paths. The System.IO.DriveInfo constructor clearly indicates it doesn’t work with UNC paths. And if you still try, this is the error you will get:
System.ArgumentException was unhandled
HResult=-2147024809
Message=Object must be a root directory ("C:\") or a drive letter ("C").
Source=mscorlib
StackTrace:
at System.IO.DriveInfo..ctor(String driveName)
Same for WMI: that only works when the UNC path has already been mapped to a drive letter.
You could make do with a temporary drive letter, but since there is nothing as permanent as a temporary…
P/Invoke
The actual solution is based on calling Windows API functions using P/Invoke.
Posted in .NET, .NET 2.0, .NET 3.0, .NET 3.5, .NET 4.0, .NET 4.5, C#, C# 2.0, C# 3.0, C# 4.0, C# 5.0, CSV, Development, Missed Schedule, SocialMedia, Software Development, WordPress
Posted by jpluimers on 2013/09/10
Just came across this nice answer by harpo containing a small class that can Escape/Unescape double-quotes in strings.
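The idea behind such a class can be sketched in a few lines of Python. This is not harpo's C# code: the function names and the set of characters that trigger quoting are assumptions, following the common CSV convention (also described in RFC 4180) of doubling embedded double-quotes and wrapping the field in double-quotes.

```python
def escape_csv_field(value, separator=","):
    """Quote a field when needed, doubling any embedded double-quotes."""
    needs_quoting = any(ch in value for ch in (separator, '"', "\n", "\r"))
    if not needs_quoting:
        return value
    return '"' + value.replace('"', '""') + '"'


def unescape_csv_field(field):
    """Reverse escape_csv_field for a single, already isolated field."""
    if len(field) >= 2 and field.startswith('"') and field.endswith('"'):
        return field[1:-1].replace('""', '"')
    return field


original = 'He said "hi", then left'
escaped = escape_csv_field(original)
print(escaped)
print(unescape_csv_field(escaped) == original)
```

Note that unescape_csv_field expects one field at a time; splitting a whole line into fields still needs a parser that understands quoting.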
–jeroen
via: Good CSV Writer for C#? – Stack Overflow.
Posted in .NET, .NET 1.x, .NET 2.0, .NET 3.0, .NET 3.5, .NET 4.0, .NET 4.5, C#, C# 1.0, C# 2.0, C# 3.0, C# 4.0, C# 5.0, CSV, Development, Software Development