The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Archive for the ‘PDF’ Category

binaryfiles – How to convert PDF binary parts into ASCII/ANSI so I can look at it in a text editor? – Stack Overflow

Posted by jpluimers on 2020/06/30

The first hit of a Google search for "pdf binary to text" was [WayBack] binaryfiles – How to convert PDF binary parts into ASCII/ANSI so I can look at it in a text editor? – Stack Overflow, which has many options.

Since I have qpdf installed on most systems, this is the one I use:

Another useful tool to transform a PDF into an internal format that enables text editor access is qpdf. It is a “command-line program that does structural, content-preserving transformations on PDF files”.

Example usage:

 qpdf                                  \
   --qdf                               \
   --object-streams=disable            \
     input-with-compressed-objects.pdf \
     output-with-expanded-objects.pdf
  1. The output of the QDF-mode enforced by the --qdf switch organizes and re-orders the objects neatly. It adds comments to track the original object IDs and page content streams. All object dictionaries are written into a “normalized” standard format for easier parsing.
  2. The --object-streams=disable causes the extraction of (otherwise not recognizable) individual objects that are compressed into another object’s stream data.
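
To get a feel for what those "binary parts" are: most of them are Flate (zlib) compressed streams. The sketch below is not how qpdf works internally; it is a rough illustration that assumes plain FlateDecode streams without predictors or encryption, and it ignores object streams and the /Length entry. The names inflate_streams and make_sample_pdf_fragment are made up for this example.

```python
import re
import zlib
from typing import List

# Raw bytes between the "stream" and "endstream" keywords (naive scan).
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(pdf_bytes: bytes) -> List[bytes]:
    """Return the inflated payload of every stream zlib can decode."""
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes that trail
            # the compressed data before "endstream".
            results.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not Flate-compressed (or damaged): skip it in this sketch.
            continue
    return results


def make_sample_pdf_fragment(text: bytes) -> bytes:
    """Build a hypothetical one-object fragment holding one Flate stream."""
    payload = zlib.compress(text)
    header = b"1 0 obj\n<< /Length %d /Filter /FlateDecode >>\nstream\n" % len(payload)
    return header + payload + b"\nendstream\nendobj\n"


if __name__ == "__main__":
    fragment = make_sample_pdf_fragment(b"BT /F1 12 Tf (Hello) Tj ET")
    for content in inflate_streams(fragment):
        print(content.decode("latin-1"))
```

For real files qpdf remains the better choice: it also handles cross-reference streams, object streams and the other filters that this scan skips.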

Recompressing is easy as per the [WayBack] QPDF Manual:

qpdf /tmp/uncompressed.pdf /tmp/compressed.pdf

The answer is by [WayBack] User Kurt Pfeifle – Stack Overflow, who has many other interesting PDF-related answers on Stackoverflow.com, Superuser.com and Serverfault.com.

–jeroen

Posted in Development, EPS/PostScript, PDF, Power User

Some tools useful for analysing PDF documents

Posted by jpluimers on 2020/03/05

A while ago, I wanted to analyse the differences between some PDF documents to find out why they had suddenly grown to twice their size.

[WayBack] Jeroen Pluimers on Twitter: “incidentally, you can see that they are generated when you do the same downloads, but a considerable time apart.…”

There are quite a few tools listed at [WayBack] Browse Internal PDF Structure – Super User and [WayBack] Best tool for inspecting PDF files? – Stack Overflow.

They also made me discover [WayBack] GitHub – pipwerks/PDFObject: A lightweight JavaScript utility for dynamically embedding PDFs in HTML documents, which is documented at [WayBack] PDFObject: A JavaScript utility for embedding PDFs.

This particular case

The quickest way for me to analyse these was [WayBack] PDF Object Browser, based on [WayBack] GitHub – brendandahl/pdf.js.utils: PDF.js Utility Files, which is also the foundation of [WayBack] Test PDF Creator.

It runs in your web browser as local JavaScript, so it is pretty OK to load a PDF file into it: it does not “phone home”.

In this case, when generating PDF files with the same content, ABN AMRO had added five Type 3 fonts, of which one was not used at all and two others used to be Type 1 fonts.
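
I found this with the PDF Object Browser, not with code. As a rough alternative, assuming the PDF has first been expanded (for instance with qpdf --qdf) so the dictionaries are readable, a sketch like the one below can tally font subtypes in two versions of a document. The function name count_font_subtypes is made up for this example, and the scan is naive: it counts every font dictionary it sees, used or not.

```python
import re
from collections import Counter

# Font subtype names defined by the PDF specification.
FONT_SUBTYPES = {
    "Type0", "Type1", "MMType1", "Type3",
    "TrueType", "CIDFontType0", "CIDFontType2",
}

# Matches "/Subtype /Name" with optional whitespace between the two names.
SUBTYPE_RE = re.compile(rb"/Subtype\s*/([A-Za-z0-9]+)")


def count_font_subtypes(pdf_bytes: bytes) -> Counter:
    """Count font-related /Subtype entries in uncompressed PDF bytes."""
    names = (m.group(1).decode("ascii") for m in SUBTYPE_RE.finditer(pdf_bytes))
    # Drop non-font subtypes such as /Image or /Link.
    return Counter(name for name in names if name in FONT_SUBTYPES)


if __name__ == "__main__":
    import sys

    # Usage: python count_fonts.py old-expanded.pdf new-expanded.pdf
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, dict(count_font_subtypes(handle.read())))
```

Comparing the tallies of an older and a newer download shows quickly whether the font mix changed; finding out which fonts are actually referenced from page content still needs a proper object browser.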

Type 1 fonts (wikipedia)

Type 1 (also known as PostScript, PostScript Type 1, PS1, T1 or Adobe Type 1) is the font format for single-byte digital fonts for use with Adobe Type Manager software and with PostScript printers. It can support font hinting.

It was originally a proprietary specification, but Adobe released the specification to third-party font manufacturers provided that all Type 1 fonts adhere to it.

Type 1 fonts are natively supported in Mac OS X, and in Windows 2000 and later via the GDI API.[2] (They are not supported in the Windows GDI+, WPF or DirectWrite APIs.)

Type 3 fonts (wikipedia)

Type 3 font (also known as PostScript Type 3, PS3, T3 or Adobe Type 3) consists of glyphs defined using the full PostScript language, rather than just a subset. Because of this, a Type 3 font can do some things that Type 1 fonts cannot do, such as specify shading, color, and fill patterns. However, it does not support hinting. Adobe Type Manager did not support Type 3 fonts, and they are not supported as native WYSIWYG fonts on any version of Mac OS or Windows.

So much for optimised PDF rendering…

Having been in software development for this long, I am constantly reminded that The inmates are running the asylum – Wikipedia. I can definitely recommend reading “The Inmates Are Running the Asylum: Why High Tech Products Drive Us Crazy and How to Restore the Sanity” by Alan Cooper.

–jeroen

Posted in Development, EPS/PostScript, PDF, Power User, Software Development

ÜberPDF printing using a Delphi like canvas

Posted by jpluimers on 2019/11/15

For my link archive, as it has a lot of goodies in the comments, especially on how to avoid bitmaps in PDF emission: [WayBack] We took PDF to a whole new level today Load create, or editing a PDF in 2 lines of code using a simple Delphi (like) Canvas! We added a PDFPrinter and… – Joe C. Hecht – Google+

–jeroen

Posted in Delphi, Development, EPS/PostScript, PDF, Software Development

Next up: TPDFPrinter and TPDFCanvas Expect a high paced (and easy) update cy…

Posted by jpluimers on 2019/11/05

For my link archive: [WayBack] Next up: TPDFPrinter and TPDFCanvas Expect a high paced (and easy) update cycle for Ultra, with a constant stream of new goodies! Zero hassles – it … – Joe C. Hecht – Google+

–jeroen

Posted in Delphi, Development, PDF, Power User, Software Development

pandoc oneliner from reStructuredText to html

Posted by jpluimers on 2016/05/12

[WayBack] Pandoc is so versatile that you sometimes forget a conversion can be as simple as a one-liner:

pandoc -s README.rst -o readme.html

This converts the reStructuredText in README.rst to HTML; the -s switch makes pandoc write a standalone document instead of a fragment.

Pandoc infers the input and output formats from the file extensions, so you do not have to specify them explicitly with -f (input format) and -t (output format).

If you do need to explicitly specify the format, it is useful to query which formats are supported as per [WayBack] Pandoc – Pandoc User’s Guide: specifying formats:

  • pandoc --list-input-formats
  • pandoc --list-output-formats
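
When scripting conversions, spelling out the formats avoids surprises with unusual file extensions. The sketch below only builds the argument list; pandoc_command is a made-up helper name, and rst and html are the pandoc format names for reStructuredText and HTML.

```python
import subprocess
from typing import List, Optional


def pandoc_command(
    source: str,
    target: str,
    input_format: Optional[str] = None,
    output_format: Optional[str] = None,
) -> List[str]:
    """Build a pandoc argument list, adding -f/-t only when given."""
    command = ["pandoc", "-s"]  # -s: write a standalone document
    if input_format:
        command += ["-f", input_format]
    if output_format:
        command += ["-t", output_format]
    command += [source, "-o", target]
    return command


if __name__ == "__main__":
    # Explicit formats; equivalent to the one-liner when the extensions
    # are the usual .rst and .html.
    argv = pandoc_command("README.rst", "readme.html", "rst", "html")
    print(" ".join(argv))
    # Uncomment to actually run it (requires pandoc on the PATH):
    # subprocess.run(argv, check=True)
```

Passing a list to subprocess.run instead of a shell string sidesteps quoting problems with file names that contain spaces.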


Posted in Development, PDF, Power User, Scripting, Software Development

 