The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

Archive for the ‘Python’ Category

GitHub – pastpages/savepagenow: A simple Python wrapper for archive.org’s “Save Page Now” capturing service

Posted by jpluimers on 2021/03/11

This makes it way easier to save WayBack content:

[WayBack] GitHub – pastpages/savepagenow: A simple Python wrapper for archive.org’s “Save Page Now” capturing service

A poor man's alternative is the bash script below, from [WayBack] Saving of public Google+ content at the Internet Archive’s Wayback Machine by the Archive Team has begun : plexodus:

For Linux, MacOS / OSX, BSD, and other Unix-like operating systems (including Android with Termux, or Windows, with a Unix/Linux environment), the following script (I’ve saved this as archive-url) will archive the requested URL:

#!/bin/bash
# archive-url
# Archive selected URL at the Internet Archive

curl -s -I -H "Accept: application/json" "https://web.archive.org/save/${1}" |
grep '^x-cache-key:' | sed "s,https,&://,; s,\(${1}\).*$,\1,"

Save that to your execution path (I’ve chosen ~/bin; you might use /usr/local/bin or another location on your $PATH), and invoke it as, say (again referring to the G+MM homepage):

$ archive-url https://plus.google.com/communities/112164273001338979772

If you have a list of URLs in a file (or pipelined from command output), you can request all of them to be archived with a single pipeline. I’m using xargs here to run ten simultaneous requests from the file gplus-urllist:

cat gplus-urllist | xargs -I{} -P 10 archive-url {}

I’ve run this on over 10,000 URLs over a modest residential broadband connection in a hair over two hours.
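
The same idea can be sketched in Python with only the standard library. This is an illustration, not how savepagenow is implemented: the names save_url, archive_url and archive_many are made up for this example, and treating the final URL after redirects as the snapshot location is an assumption about how the Save Page Now endpoint responds (its behaviour has changed over time).

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor

SAVE_PREFIX = "https://web.archive.org/save/"  # same endpoint as the bash script


def save_url(url):
    """Build the Save Page Now request URL for a page."""
    return SAVE_PREFIX + url.strip()


def archive_url(url, timeout=120):
    """Request a capture; return the URL we ended up at (assumed to be the snapshot)."""
    request = urllib.request.Request(
        save_url(url), headers={"User-Agent": "archive-url-sketch"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()


def archive_many(urls, worker=archive_url, max_workers=10):
    """Run up to max_workers requests in parallel, much like xargs -P 10."""
    def attempt(url):
        try:
            return url, worker(url)
        except Exception as error:  # keep going when a single URL fails
            return url, error

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(attempt, urls))
```

Calling archive_many(open("gplus-urllist")) would then return a list of (url, result) pairs, where result is either the final URL or the exception raised for that URL.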

Note that such requests trigger an archive by the Internet Archive from one of its archiving nodes; you’re not sending the page to the Archive yourself. In particular, archival from regions defaulting to another language may result in the Google+ site content (but not posts or comments) being in a different language. I’ve frequently seen my pages turning up in Japanese, for instance.

–jeroen

Posted in bash, Development, Python, Scripting, Software Development | Leave a Comment »

Python: saving a web page to a jpeg image file by using the Google base64url encoded screenshot of it

Posted by jpluimers on 2021/02/19

As a follow-up on Still looking for base64url decoding tools, both on-line and for MacOS homebrew: this is in Python, works on MacOS, Linux and Windows, and can be integrated in a web page.

It is based on the ideas in [WayBack] Python-Twitter-Hacks/websiteScreenshot.py at master · edent/Python-Twitter-Hacks · GitHub, which was more like a code snippet with hard coded literals.

It downloads a jpeg web-site screenshot using the Google PageSpeed API V1, which generates the screenshot as a base64url encoded blob inside a JSON structure.

Python has no dedicated base64url type, but the concept is fairly straightforward (and the standard base64 module’s urlsafe_b64decode handles the alphabet once any stripped padding is restored): [WayBack] RFC 4648 – The Base16, Base32, and Base64 Data Encodings: Base 64 Encoding with URL and Filename Safe Alphabet, which allows data to be passed inside URLs without reverting to [WayBack] Percent-encoding – Wikipedia.
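
A minimal sketch of the decoding step, assuming the input is plain base64url text whose trailing "=" padding may have been stripped. The function names base64url_decode and save_screenshot are hypothetical and are not taken from the forked script.

```python
import base64


def base64url_decode(text):
    """Decode RFC 4648 base64url text, tolerating stripped '=' padding."""
    padding = "=" * (-len(text) % 4)  # pad the length back to a multiple of 4
    return base64.urlsafe_b64decode(text + padding)


def save_screenshot(encoded, filename):
    """Write a base64url encoded image (such as a JPEG screenshot) to a file."""
    with open(filename, "wb") as image_file:
        image_file.write(base64url_decode(encoded))
```

The only differences from regular Base64 are that "-" and "_" replace "+" and "/", and that padding is often omitted, which is why the sketch restores it before decoding.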

My changes work, but are by no means in canonical form or idiomatic Python. I have a long way to go to reach that level of Python.

So I forked the repository, and fixed the script, basing it on Python 3.

I might make it V2 compatible in the future. More information on V2 in [WayBack] Google APIs Explorer: Services > PageSpeed Insights API v2 > pagespeedonline.pagespeedapi.runpagespeed

Content is in the below gist.

–jeroen

Posted in base64, base64url, Development, Encoding, Python, Scripting, Software Development | Leave a Comment »

Making it dead simple to implement @haveibeenpwnd in your applications, including strength warning if found in @troyhunt’s password collection.

Posted by jpluimers on 2020/12/02

I wasn’t aware that Troy Hunt created an API [WayBack] for [WayBack] Have I Been Pwned: Check if your email has been compromised in a data breach.

He did, as I noticed through [WayBack] Michelangelo van Dam on Twitter: “Making it dead simple to implement @haveibeenpwnd in my applications, including strength warning if found in @troyhunt’s password collection. Check out to try it out yourself. #ImproveSecurity #haveibeenpwnd”.

There are in fact plenty of other packages, web-sites and apps using the API as seen on [WayBack] Have I Been Pwned: API consumers.

Many people ask “if it is safe” (often assuming passwords are sent in clear, or hashes are sent in full; my fear is that those people implement security somewhere).

It is safe:
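
A minimal sketch of why, assuming the Pwned Passwords range endpoint at api.pwnedpasswords.com and its k-anonymity model: only the first five hex characters of the SHA-1 hash are sent, and the comparison against the returned suffixes happens locally. The function names are made up for this illustration, and this is not the code of the PHP package mentioned in this post.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password):
    """Return (prefix, suffix) of the upper-case SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]  # only the 5-character prefix is ever sent


def count_in_range(suffix, range_body):
    """Look up a suffix in a response body made of SUFFIX:COUNT lines."""
    for line in range_body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def pwned_count(password):
    """How often the password appears in the collection (0 when not found)."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL + prefix, headers={"User-Agent": "hibp-sketch"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return count_in_range(suffix, response.read().decode("utf-8"))
```

Neither the password nor its full hash leaves the machine; the service only learns a prefix shared by many other hashes.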

PHP source is at [WayBack] GitHub – DragonBe/hibp: A composer package to verify if a password was previously used in a breach using Have I Been Pwned API.

There is also a [WayBack] composer package at [WayBack] dragonbe/hibp – Packagist.

A really cool thing about it is this:

This project was also the subject of my talk [WayBack] Mutation Testing with Infection where the code base was not only covered by unit tests, but also was subjected to Mutation Testing using [WayBack] Infection to ensure no coding mistakes could slip into the codebase.

Apart from the tests, the most important source is at [WayBack] hibp/Hibp.php at master · DragonBe/hibp · GitHub

Related:

–jeroen

Posted in Development, Mobile Development, PHP, Python, Scripting, Software Development, Web Development | Leave a Comment »

Brew reminder to self

Posted by jpluimers on 2020/08/05

From the update process:

==> Caveats
==> hub
Bash completion has been installed to:
  /usr/local/etc/bash_completion.d

zsh completions have been installed to:
  /usr/local/share/zsh/site-functions
==> python
Python has been installed as
  /usr/local/bin/python3

Unversioned symlinks `python`, `python-config`, `pip` etc. pointing to
`python3`, `python3-config`, `pip3` etc., respectively, have been installed into
  /usr/local/opt/python/libexec/bin

If you need Homebrew's Python 2.7 run
  brew install python@2

You can install Python packages with
  pip3 install <package>
They will install into the site-package directory
  /usr/local/lib/python3.7/site-packages

See: https://docs.brew.sh/Homebrew-and-Python
==> youtube-dl
Bash completion has been installed to:
  /usr/local/etc/bash_completion.d

zsh completions have been installed to:
  /usr/local/share/zsh/site-functions
==> mpv
zsh completions have been installed to:
  /usr/local/share/zsh/site-functions
==> node
Bash completion has been installed to:
  /usr/local/etc/bash_completion.d

–jeroen

Posted in Apple, Development, Home brew / homebrew, Power User, Python, Scripting, Software Development | Leave a Comment »

pip install --user and your path

Posted by jpluimers on 2020/06/09

I’ve added this to my ~/.bashrc so stuff installed by pip install --user is accessible from interactive shells:

# set PATH so it includes user's private python "pip --user" bin if it exists
if [ -d "$HOME/.local/bin" ] ; then
    PATH="$PATH:$HOME/.local/bin"
fi

The addition is at the end of the path. It is a choice: it means machine installs take precedence over user installs. That’s usually what I want. For more considerations (including non-interactive shells), see [WayBack] bash – How to correctly add a path to PATH? – Unix & Linux Stack Exchange.

The --user installs do not affect the full system, nor other users.
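
To check from Python whether that directory is actually on the path, something like the sketch below could be used. It assumes the POSIX layout where scripts land in a bin directory under the user base (~/.local on most Linux systems; macOS framework builds and Windows use other locations), and the helper names are made up for this example.

```python
import os
import site


def user_bin_dir():
    """Directory where pip install --user puts scripts (POSIX layout assumed)."""
    return os.path.join(site.getuserbase(), "bin")


def is_on_path(directory, path_value=None):
    """True when directory is one of the entries of a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    wanted = os.path.normpath(directory)
    return any(
        os.path.normpath(entry) == wanted
        for entry in path_value.split(os.pathsep)
        if entry
    )
```

Calling is_on_path(user_bin_dir()) in a fresh interactive shell then tells you whether the ~/.bashrc change took effect.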

Further reading:

–jeroen

Posted in Development, Python, Scripting, Software Development | Leave a Comment »

MultiBootUSB

Posted by jpluimers on 2020/05/07

Cool tool:

MultiBootUSB is a cross platform software written in python which allows you to install multiple live linux on a USB disk non destructively and option to uninstall distros. Try out the world’s first true cross platform multi boot live usb creator for free. Download Now!

Information and downloads on [WayBack] MultiBootUSB.

There are actually a few repositories within [WayBack] mbusb (multibootusb) · GitHub of which one has a ruby implementation as well.

A more elaborate article is on [WayBack] How to Install Multiple Linux Distributions on One USB, but the site should get you going just fine.

Via: [WayBack] Multiple Linux distributions on one UBS stick. I just tried it with: * CloneZilla * Lubuntu * LiteLinux The tool they describe – MultiBootUSB – comes w… – Thomas Mueller (dummzeuch) – Google+

–jeroen

Posted in *nix, *nix-tools, Development, Hardware, Linux, Power User, Python, Software Development, USB | Leave a Comment »

GitHub – ofek/hatch: A modern project, package, and virtual env manager for Python

Posted by jpluimers on 2020/05/04

Cool: [WayBack] GitHub – ofek/hatch: A modern project, package, and virtual env manager for Python

Via: [WayBack] Hatch: A modern project, package, and virtual env manager for Python – ThisIsWhyICode – Google+

–jeroen

Posted in Development, Python, Scripting, Software Development | Leave a Comment »

Migrating to Python 3: adding parentheses to print calls and getting rid of printf style formatting

Posted by jpluimers on 2020/04/28

There is still so much Python 2.x stuff on the web, and I’m slowly moving what I have to Python 3.

These links are good starts for print calls and string formatting:
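
As a minimal sketch of both changes (my own illustration with made-up values, not taken from the linked pages):

```python
name, count = "archive", 3

# Python 2 statement:  print "saved %s (%d pages)" % (name, count)
# In Python 3, print is a function, so the argument needs parentheses.

old_style = "saved %s (%d pages)" % (name, count)        # printf style, still valid
new_style = "saved {} ({:d} pages)".format(name, count)  # str.format
f_string = f"saved {name} ({count:d} pages)"             # f-string, Python 3.6+

print(f_string)
```

All three expressions build the same string; only the print call itself has to change to run under Python 3, while moving away from the % operator is a matter of style.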

–jeroen

Posted in Development, Python, Scripting, Software Development | Leave a Comment »

GitHub – pyscripter/pyscripter: Pyscripter is a feature-rich but lightweight Python IDE

Posted by jpluimers on 2020/04/28

Just in case I ever need to develop Python scripts on Windows (nowadays it’s mostly on Linux/BSD based systems): [WayBack] GitHub – pyscripter/pyscripter: Pyscripter is a feature-rich but lightweight Python IDE.

If you like that, you can (also) help with this project: [WayBack] PyScripter localization Translate PyScripter into your own language.

Via: [WayBack] The PyScripter IDE, which is written in Delphi is looking for translators. We have set up a translation project on transifex.com and would be happy if s… – Lübbe Onken – Google+

–jeroen

Posted in Delphi, Development, Python, Scripting, Software Development | Leave a Comment »

Common SMTP message size limits

Posted by jpluimers on 2020/04/08

This follows a 2018 discussion with a “zorgkantoor” (Dutch for an office that arranges special long-term health care needs, successor of AWBZ) about their very low (10 megabyte) SMTP message size limit, even though they expect scanned PDF documents.

Their web-care team presented this limit as normal, so I made a list of limits in their peer group, at common world-wide providers and at well-ranked Dutch internet providers.

My plan is to check the progression of these limits over time.

Note these are the gross message sizes including encoded attachments. Since encoding in [WayBack] MIME Base64 – Wikipedia has an overhead of about 37% (encoded size is about 1.37 times the original size), the unencoded maximum size is roughly 73% of what is listed below.
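
The arithmetic can be sketched as below, assuming 76-character encoded lines that each end in CRLF. Message headers and body text are ignored, so real-world attachments have to be somewhat smaller still. The function names are made up for this example.

```python
import math

LINE_CHARS = 76  # MIME limits Base64 lines to 76 characters
CRLF = 2         # each encoded line ends with CR LF


def mime_base64_size(raw_bytes):
    """Size in bytes of raw_bytes of data after MIME Base64 encoding."""
    chars = 4 * math.ceil(raw_bytes / 3)   # every 3 bytes become 4 characters
    lines = math.ceil(chars / LINE_CHARS)  # the last line may be shorter
    return chars + CRLF * lines


def max_raw_bytes(limit):
    """Largest attachment size whose encoded form still fits within limit."""
    full_lines, rest = divmod(limit, LINE_CHARS + CRLF)
    chars = full_lines * LINE_CHARS + max(0, rest - CRLF)
    return chars // 4 * 3                  # whole 4-character groups only
```

For example, max_raw_bytes(10 * 1000 * 1000) gives the largest single attachment that fits a 10 megabyte limit under these assumptions.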

References:

2018

Posted in base64, Communications Development, Development, eMail, Encoding, Internet protocol suite, MIME, Power User, Python, Scripting, SMTP, SocialMedia, Software Development, TCP | Leave a Comment »