The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


bash alias to decode email Quoted-Printable stdin data

Posted by jpluimers on 2025/09/25

Perl isn’t my strength, so I was glad to find the links below, which inspired me to add this bash function to my profile. It decodes Quoted-Printable email data read from stdin (the encoding used, for instance, in the SMTP message files that sendmail and postfix store):

# https://superuser.com/questions/1452249/fix-revert-wrong-encoding-of-file
function sendmail-decode-quoted-printable-from-stdin() {
  perl -0777 -ne 'use MIME::QuotedPrint; print decode_qp($_)'
}

From:

  • [Wayback/Archive] linux – Fix/revert wrong encoding of file – Super User (thanks [Wayback/Archive] wullxz and [Wayback/Archive] user1686)

    Q

    I have files of ISO-8859 encoded Text sent to me regularly by customers that contain debug information.
    Recently, they have started to turn up in my inbox as ASCII encoded with a few extra characters in the file as a result, breaking the parser I wrote for these files.

    A

    The shown encoding is Quoted-Printable, and is completely normal to see if you are looking at “raw” email source text – most non-ASCII messages and text attachments, and even some purely-ASCII ones, are encoded using QP (or even Base64).

    perl -0777 -ne 'use MIME::QuotedPrint; print decode_qp($_)' < wrong.txt > fixed.txt
  • [Wayback/Archive] How to decode quoted-printable file using command line on macOS? – Super User (thanks [Wayback/Archive] sunknudsen, [Wayback/Archive] glenn jackman, [Wayback/Archive] Jon Bailey and [Wayback/Archive] Archemar)

    Q

    I know it can be done using JavaScript or PHP, but is there a command line utility for that?

    A

    A perl one-liner:
    echo "$input" | perl -MMIME::QuotedPrint -0777 -nle 'print decode_qp($_)'
    
    MIME::QuotedPrint is a core perl module, so that should work without any additional installation.

    C

    a quick look at the python docs reveals [Wayback/Archive] docs.python.org/3/library/quopri.html

    A

    For Python 3.x: (Tested on Cygwin)

    python -c 'import sys,quopri;quopri.decode(sys.stdin,sys.stdout.buffer)' < infile > outfile

    C

    this seems to work also with Python 2.6.6 on RedHat 6.3. I use with python -c 'import quopri ; print(quopri.encodestring("'"${QUOI}"'",header=1)) ' (and output redirection in a file)

    A

    This is what I was looking for (strange example found on Wikipedia).
    $ brew install qprint
    $ echo "J'interdis aux marchands de vanter trop leurs marchandises. Car ils se font =
    vite p=C3=A9dagogues et t'enseignent comme but ce qui n'est par essence qu'=
    un moyen, et te trompant ainsi sur la route =C3=A0 suivre les voil=C3=A0 bi=
    ent=C3=B4t qui te d=C3=A9gradent, car si leur musique est vulgaire ils te f=
    abriquent pour te la vendre une =C3=A2me vulgaire." | qprint -d
    J'interdis aux marchands de vanter trop leurs marchandises. Car ils se font vite pédagogues et t'enseignent comme but ce qui n'est par essence qu'un moyen, et te trompant ainsi sur la route à suivre les voilà bientôt qui te dégradent, car si leur musique est vulgaire ils te fabriquent pour te la vendre une âme vulgaire.
    
  • [Wayback/Archive] email – How do I convert UTF-8 special characters in Bash? – Super User (thanks [Wayback/Archive] Markus, [Wayback/Archive] user1686 and [Wayback/Archive] blami)

    Q

    I am writing a script that extracts and saves JPEG attachments from emails and passes them to imagemagick. However, I am living in Germany and special characters in email text/subject such as “ö”, “ä”, “ü” and “ß” are pretty common.
    I am extracting the subject with formail:
        SUBJECT=$(formail -zxSubject: <"$file")
    
    and that results in:
    • =?UTF-8?Q?Meine_G=c3=bcte?=
    (“Meine Güte“) or even worse
    • =?UTF-8?B?U2Now7ZuZSBHcsO8w59lIQ==?=
    (“Schöne Grüße!“).
    I try to use part of the subject as a filename and as imagemagick text annotation, which obviously doesn’t work.
    How do I convert this UTF-8 text to text with special characters in bash?

    A

    Your input, instead, is MIME (RFC 2047) encoded UTF-8. The “Q” marks Quoted-Printable mode, and “B” marks Base64 mode. Among others, Perl’s Encode::MIME::Header can be used to decode both:
    #!/usr/bin/env perl
    use open qw(:std :utf8);
    use Encode qw(decode);
    
    while (my $line = <STDIN>) {
            print decode("MIME-Header", $line);
    }
    
    Oneliner (see perldoc perlrun for explanation):
    perl -CS -MEncode -ne 'print decode("MIME-Header", $_)'
    
    This can take any format as input:
    $ echo "Subject: =?UTF-8?Q?Meine_G=c3=bcte?=, \
                     =?UTF-8?B?U2Now7ZuZSBHcsO8w59lIQ==?=" | perl ./decode.pl
    Subject: Meine Güte, Schöne Grüße!
    
    A version in Python 3:
    #!/usr/bin/env python3
    import email.header, sys
    
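    words = email.header.decode_header(sys.stdin.read())
    # decode_header() returns str rather than bytes when the input contains no encoded-words
    words = [s.decode(c or "utf-8") if isinstance(s, bytes) else s for (s, c) in words]
    print("".join(words))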

    A

    The e-mail subject is itself a header, and headers must contain only ASCII characters. This is why a UTF-8 (or any other non-ASCII charset) subject must be encoded.
    This way of encoding non-ASCII characters into ASCII is described in RFC 1342 (since superseded by RFC 2047).
    Basically, an encoded subject has (as you’ve already listed in your examples) the following format:
    =?charset?encoding?encoded-text?=
    
    Depending on the encoding value, encoded-text is decoded either as quoted-printable (Q) or as base64 (B).
    To get a human-readable form you need to pass the encoded-text portion of the subject header value to a program that decodes it. I believe there are some standalone commands to do that (uudecode), but I prefer to use Perl one-liners:
    For quoted-printable:
    perl -pe 'use MIME::QuotedPrint; $_=MIME::QuotedPrint::decode($_);'
    
    and for base64:
    perl -pe 'use MIME::Base64; $_=MIME::Base64::decode($_);'
    
    Be sure you pass only encoded-text portion and not whole subject header value.
  • [Wayback/Archive] Quoted Printable encode/decode bash aliases – suitable for pipelining (thanks [Wayback/Archive] jjarmoc, [Wayback/Archive] klepsydra, [Wayback/Archive] Hubro and [Wayback/Archive] eiro’s gists)

    S

    # To decode:
    #   qp -d string 
    # To encode:
    #   qp string
    
    alias qpd='perl -MMIME::QuotedPrint -pe '\''$_=MIME::QuotedPrint::decode($_);'\'''
    alias qpe='perl -MMIME::QuotedPrint -pe '\''$_=MIME::QuotedPrint::encode($_);'\'''
    function qp {
        # Quote the arguments so the string is not word-split or glob-expanded
        if [[ "$1" = "-d" ]]
        then
            echo "${@:2}" | qpd
        else
            echo "$@" | qpe
        fi
    }

    C

    Don’t forget to “shopt -s expand_aliases” if you plan to use these aliases in bash scripts

    C

    Alternatively, use Python’s quopri module:

    $ echo "Jeg liker å sykle" | python -m quopri
    Jeg liker =C3=A5 sykle
    
    $ echo "Jeg liker =C3=A5 sykle" | python -m quopri -d
    Jeg liker å sykle
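
The answers above deal with two different encodings: Quoted-Printable message bodies (RFC 2045) and encoded-word headers (RFC 2047). As a minimal sketch, using only the Python standard library modules `quopri` and `email.header` that the answers mention, the two cases could be handled as below. The function names `decode_qp_body` and `decode_mime_header` are made up for this illustration, and the body decoder assumes a UTF-8 payload, which a real message would declare in its Content-Type header.

```python
import quopri
from email.header import decode_header, make_header


def decode_qp_body(raw: bytes, charset: str = "utf-8") -> str:
    """Decode a Quoted-Printable body; soft line breaks ("=" at end of line) vanish."""
    # The UTF-8 default is an assumption; real mail names the charset in Content-Type.
    return quopri.decodestring(raw).decode(charset)


def decode_mime_header(value: str) -> str:
    """Decode RFC 2047 encoded-words, both the Q and the B form."""
    # make_header() copes with the mixed str/bytes parts that decode_header() returns.
    return str(make_header(decode_header(value)))


if __name__ == "__main__":
    print(decode_qp_body(b"ce qui n'est par essence qu'=\nun moyen, la route =C3=A0 suivre"))
    print(decode_mime_header("=?UTF-8?Q?Meine_G=c3=bcte?="))
    print(decode_mime_header("=?UTF-8?B?U2Now7ZuZSBHcsO8w59lIQ==?="))
```

Feeding a whole header value to `decode_qp_body`, or a message body to `decode_mime_header`, will not give useful results; which one applies depends on whether the data is a body part or a header field.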

Related:

  • blog post email file decoding: Encode/Decode Quoted Printable – Webatic
  • definition [Wayback/Archive] RFC 2045: Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies, section 6.7: Quoted-Printable Content-Transfer-Encoding
  • [Wayback/Archive] linux decode “quoted printable” “stdin” – Google Search
  • [Wayback/Archive] linux decode “quoted printable” “stdin” site:superuser.com – Google Search
  • [Wayback/Archive] manual page site:www.perl.com – Google Search
  • Quoted-printable – Wikipedia
  • [Wayback/Archive] MIME::QuotedPrint – Encoding and decoding of quoted-printable strings – metacpan.org
  • [Wayback/Archive] QPRINT: Encode and Decode Quoted-Printable Files
  • [Wayback/Archive] Perl Command-Line Options
    • [Wayback/Archive] 5.30.0 Documentation – Perl Language
    • [Wayback/Archive] perl – The Perl 5 language interpreter – Perldoc Browser

      SYNOPSIS

      perl [ -sTtuUWX ] [ -hv ] [ -V[:configvar] ] [ -cw ] [ -d[t][:debugger] ] [ -D[number/list] ] [ -pna ] [ -Fpattern ] [ -l[octal] ] [ -0[octal/hexadecimal] ] [ -Idir ] [ -m[-]module ] [ -M[-]'module...' ] [ -f ] [ -C [number/list] ] [ -S ] [ -x[dir] ] [ -i[extension] ] [ [-e|-E] 'command' ] [ -- ] [ programfile ] [ argument ]...
      For more information on these options, you can run perldoc perlrun.
    • [Wayback/Archive] perlrun – how to execute the Perl interpreter – Perldoc Browser
      • -0[octal/hexadecimal]
        specifies the input record separator ($/) as an octal or hexadecimal number. If there are no digits, the null character is the separator. Other switches may precede or follow the digits. For example, if you have a version of find which can print filenames terminated by the null character, you can say this:
        find . -name '*.orig' -print0 | perl -n0e unlink
        The special value 00 will cause Perl to slurp files in paragraph mode.
        Any value 0400 or above will cause Perl to slurp files whole, but by convention the value 0777 is the one normally used for this purpose. The “-g” flag is a simpler alias for it.
        You can also specify the separator character using hexadecimal notation: -0xHHH…, where the H are valid hexadecimal digits. Unlike the octal form, this one may be used to specify any Unicode character, even those beyond 0xFF. So if you really want a record separator of 0777, specify it as -0x1FF. (This means that you cannot use the “-x” option with a directory name that consists of hexadecimal digits, or else Perl will think you have specified a hex number to -0.)
      • -e commandline
        may be used to enter one line of program. If -e is given, Perl will not look for a filename in the argument list. Multiple -e commands may be given to build up a multi-line script. Make sure to use semicolons where you would in a normal program.
      • -g
        undefines the input record separator ($/) and thus enables the slurp mode. In other words, it causes Perl to read whole files at once, instead of line by line.
        This flag is a simpler alias for -0777.
        Mnemonics: gobble, grab, gulp.
      • -l[octnum]
        enables automatic line-ending processing. It has two separate effects. First, it automatically chomps $/ (the input record separator) when used with “-n” or “-p”. Second, it assigns $\ (the output record separator) to have the value of octnum so that any print statements will have that separator added back on. If octnum is omitted, sets $\ to the current value of $/. For instance, to trim lines to 80 columns:
        perl -lpe 'substr($_, 80) = ""'
        Note that the assignment $\ = $/ is done when the switch is processed, so the input record separator can be different than the output record separator if the -l switch is followed by a -0 switch:
        gnufind / -print0 | perl -ln0e 'print "found $_" if -p'
        This sets $\ to newline and then sets $/ to the null character.
      • -m[-]module
        -M[-]module
        -M[-]'module ...'
        -[mM][-]module=arg[,arg]...

        -mmodule
        executes use module (); before executing your program. This loads the module, but does not call its import method, so does not import subroutines and does not give effect to a pragma.
        -Mmodule executes use module ; before executing your program. This loads the module and calls its import method, causing the module to have its default effect, typically importing subroutines or giving effect to a pragma. You can use quotes to add extra code after the module name, e.g., '-MMODULE qw(foo bar)'.
        If the first character after the -M or -m is a dash (-) then the ‘use’ is replaced with ‘no’. This makes no difference for -m.
        A little builtin syntactic sugar means you can also say -mMODULE=foo,bar or -MMODULE=foo,bar as a shortcut for ‘-MMODULE qw(foo bar)’. This avoids the need to use quotes when importing symbols. The actual code generated by -MMODULE=foo,bar is use module split(/,/,q{foo,bar}). Note that the = form removes the distinction between -m and -M; that is, -mMODULE=foo,bar is the same as -MMODULE=foo,bar.
        A consequence of the split formulation is that -MMODULE=number never does a version check, unless MODULE::import() itself is set up to do a version check, which could happen for example if MODULE inherits from Exporter.
      • -n
        causes Perl to assume the following loop around your program, which makes it iterate over filename arguments somewhat like sed -n or awk:
          LINE:
            while (<>) {
            ...     # your program goes here
            }
        Note that the lines are not printed by default. See “-p” to have lines printed. If a file named by an argument cannot be opened for some reason, Perl warns you about it and moves on to the next file.
        Also note that <> passes command line arguments to “open” in perlfunc, which doesn’t necessarily interpret them as file names. See perlop for possible security implications.
        Here is an efficient way to delete all files that haven’t been modified for at least a week:
        find . -mtime +7 -print | perl -nle unlink
        This is faster than using the -exec switch of find because you don’t have to start a process on every filename found (but it’s not faster than using the -delete switch available in newer versions of find). It does suffer from the bug of mishandling newlines in pathnames, which you can fix if you follow the example under -0.
        BEGIN and END blocks may be used to capture control before or after the implicit program loop, just as in awk.
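
To make the record-separator switches quoted above a bit more concrete, here is a rough Python model of how the input gets cut into records before each one reaches the implicit `-n` loop. It is a simplification under stated assumptions: the function name `split_records` is invented for the illustration, `None` stands in for an undefined `$/` (what `-0777` or `-g` gives you), and paragraph mode (`-00`) is left out.

```python
def split_records(data: str, separator):
    """Split data roughly the way the -n loop sees it for a given $/ value.

    separator=None models slurp mode: the whole input is a single record.
    Otherwise every record keeps its trailing separator, as Perl does without -l.
    """
    if separator is None:
        return [data]
    records = []
    start = 0
    while start < len(data):
        end = data.find(separator, start)
        if end == -1:
            # The last record may lack a trailing separator.
            records.append(data[start:])
            break
        end += len(separator)
        records.append(data[start:end])
        start = end
    return records


if __name__ == "__main__":
    sample = "bi=\nent=C3=B4t\n"
    print(split_records(sample, "\n"))   # default: one record per line
    print(split_records(sample, None))   # -0777: one record for everything
    print(split_records("a\0b\0", "\0")) # -0: NUL-terminated records
```

With `-0777` the decoder in the bash function at the top of this post is called once on the complete message, so the program body runs a single time instead of once per line.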

--jeroen
