The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests

Some notes on the Internet Archive command-line interface (CLI) and Python API tools

Posted by jpluimers on 2026/05/20

Few people realise that, in addition to uploading through the Internet Archive web user interface at [Wayback/Archive] Upload to Internet Archive, you can also upload from the command line.

Even fewer people know that using the CLI is more reliable, as the web UI often has trouble recovering from upload interruptions (though it is still better than the Wayback Machine archiver, which seems to have no recovery options at all).

Jason Scott responded to a really rude comment from an uploader, but the Internet Archive does not really advocate the CLI uploader much. I added a comment, but doubt that it has changed anything: regrettably, it is an organisation with a track record of being quite reluctant to publicly show improvement.

The whole thread is at [Wayback/Archive] Post by @textfiles.com — Bluesky: Regarding large-size uploads, especially over, let’s say, 750mb to the Internet Archive, I highly suggest the ia command-line client, which has a separate pathway not dealing with weird browser oddities and behavior.… and archived for posterity.

TL;DR

  • you can batch upload to the base Internet Archive
  • you cannot batch upload to the Wayback Machine of the Internet Archive

First, the steps to find the CLI tooling the normal way:

  1. Go to [Wayback/Archive] Upload to Internet Archive, which shows that you need to log on

    [Wayback/Archive] 536230068-dc8b74ea-cdaf-44ca-a2e8-5a14e168de35.png (1024×768)

  2. Log on (if you do not have an account yet, create one first)
  3. Go back to [Wayback/Archive] Upload to Internet Archive which now shows you an upload page, and two links in addition to the menu already visible on the non-logged-on version of the page:

    [Wayback/Archive] 536229699-4e86b2b1-0038-44e8-a0a4-e376c2e61da4.png (1024×648)

    1. Click here to use the LMA uploader instead. -> [Wayback/Archive] Internet Archive: please login or join us (or after login Internet Archive: Create item – note there is no mention of “LMA” or “Live music archive uploader” in the page title)
    2. Instructions on how to preset metadata -> [Wayback/Archive] Presetting metadata with the new Beta Uploader | Internet Archive Blogs (which does not mention the current uploader is what the beta uploader used to be)
  4. Click “Help” to get to [Wayback/Archive] Internet Archive Help Center – How can we help you?
  5. On the left side, browse down and click on “Uploading” to get to [Wayback/Archive] Uploading – Internet Archive Help Center which has 5 sub-pages:
    1. “Uploading – A Basic Guide” -> [Wayback/Archive] Uploading – A Basic Guide – Internet Archive Help Center
    2. “Example of good metadata for items” -> [Wayback/Archive] Example of good metadata for items – Internet Archive Help Center, which references [Archive] The Great Gatsby : F. Scott Fitzgerald (1896 – 1940) : Free Download, Borrow, and Streaming : Internet Archive as an example of good metadata.
    3. “Uploading – Troubleshooting” -> [Wayback/Archive] Uploading – Troubleshooting – Internet Archive Help Center which has this useful section:

      Uploading limit

      A single file in an item should not be larger than ~500 to ~700 GB (capacity depends on activity levels and other circumstances, so we recommend using the lower limit of 500GB).

      We recommend creating items that have less than 10,000 files, and where the files do not exceed 1TB. While 10,000 files is not a hard limit (the API will allow up to 250,000 files), exceeding that number is likely to disrupt your experience of the item. You may have trouble viewing the files, trouble editing the item, etc.

    4. “Uploading – Tips” -> [Wayback/Archive] Uploading – Tips – Internet Archive Help Center also has a bulk uploader section:

      Any tips that will make bulk uploads quicker?

      Absolutely, we recommend using our Internet Archive Command-Line Tool. It does, however, require that you are quite comfortable or familiar with Unix.

      You can download the tool from GitHub located by clicking on this link: Internet Archive Command-Line Tool.

    5. “Uploading – What is ok or not ok to upload?” -> [Wayback/Archive] Uploading – What is not ok or not ok to upload? – Internet Archive Help Center which has this useful section:

      Bulk uploading – I want to upload a lot of materials

      We would not discourage bulk upload so long as the materials meet the above criteria as well as these:
      • Items should be thematically cohesive

      • Items are organized efficiently:

      • Serial (magazines), volume (multi-volume works), or chapter/track (albums, cds, etc.) are uploaded to a single item rather than many items
      • Uploaded files to items do not exceed either 500 files or 500GB of data
      • Large numbers of files that deserve to be together are zipped and you upload the zip file(s)
      • Uploads do not exceed 5,000 files per day (regardless of the number of items that are created.) In this context, a zip file is considered one file
      • Uploads do not exceed TK of data per day.
  6. Click “Uploading – A Basic Guide” to get to [Wayback/Archive] Uploading – A Basic Guide – Internet Archive Help Center
  7. Browse down to [Wayback/Archive] Uploading – A Basic Guide – Internet Archive Help Center: Can I batch or bulk upload files?
  8. Click on “Internet Archive Command-line Tool.” to get to the almost empty page [Wayback/Archive] The Internet Archive Python Library — internetarchive 1.8.1 documentation
  9. Click on “https://archive.org/services/docs/api/internetarchive” or wait 5 seconds to get to the actual information page [Wayback/Archive] The Internet Archive Python Library — Internet Archive Developer Portal.
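The per-item limits quoted in step 5 (no more than 500 files or 500 GB per item) can be checked mechanically before uploading. Below is a minimal sketch, assuming decimal gigabytes and a simple "fill items in order" strategy; `ItemPlan` and `plan_items` are hypothetical names and not part of any Internet Archive tooling. Whether the resulting groups are thematically cohesive remains a human judgement.

```python
from dataclasses import dataclass, field

# Limits quoted above from the "What is ok or not ok to upload?" help page.
MAX_FILES_PER_ITEM = 500
MAX_BYTES_PER_ITEM = 500 * 1000**3  # 500 GB, assuming decimal gigabytes


@dataclass
class ItemPlan:
    """Hypothetical container for the files planned for one item."""
    files: list = field(default_factory=list)
    total_bytes: int = 0


def plan_items(file_sizes, max_files=MAX_FILES_PER_ITEM, max_bytes=MAX_BYTES_PER_ITEM):
    """Group (name, size_in_bytes) pairs into items that stay within both limits.

    Files are taken in the given order; a new item starts as soon as adding
    the next file would exceed either limit.
    """
    plans = [ItemPlan()]
    for name, size in file_sizes:
        if size > max_bytes:
            raise ValueError(f"{name} alone exceeds the per-item size limit")
        current = plans[-1]
        if len(current.files) >= max_files or current.total_bytes + size > max_bytes:
            current = ItemPlan()
            plans.append(current)
        current.files.append(name)
        current.total_bytes += size
    # Drop the initial empty plan when no files were given.
    return [plan for plan in plans if plan.files]
```

Feeding it the output of a directory walk (file name plus `os.path.getsize`) gives a rough idea of how many items a bulk upload would need.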

I never found where the Internet Archive open source repository is, or how to submit issues, which is why I wrote [Wayback/Archive] Post by @wiert.bsky.social — Bluesky

Thanks. Found it: help.archive.org/help/uploading-a-basic-guide/#can-i-batch-or-bulk-upload-files It would be nice if there was a link to that from archive.org/upload

That already has a link to “Instructions on how to preset metadata”. Also mentioning that large uploads work more reliably with the bach uploader would be a great addition.

The CLI uploader

Note that the above is the hard way of finding it. The easy way was [Wayback/Archive] Post by @skybrina.org — Bluesky

It’s in the upload guide here under “Can I batch or bulk upload files?”. help.archive.org/help/uploading-a-basic-guide

The actual section “Can I batch or bulk upload files?” at [Wayback/Archive] Uploading – A Basic Guide – Internet Archive Help Center: Can I batch or bulk upload files? links to a now almost empty page:

Yes, you can. But, due to the time it takes to bulk upload files, we recommend using the Internet Archive Command-Line Tool to script your upload.
If you would like to go and find more information about the tool, click this link: Internet Archive Command-line Tool. It does, however, require you to be quite comfortable with Unix.

The actual documentation of the Internet Archive CLI tool has moved to [Wayback/Archive] The Internet Archive Python Library — Internet Archive Developer Portal

Welcome to the documentation for the internetarchive Python library. This tool provides both a command-line interface (CLI) and a Python API for interacting with archive.org, allowing you to search, download, upload and interact with archive.org services from your terminal or in Python.
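To give an idea of what the Python API side looks like once the library is installed, here is a minimal sketch of an upload to a single item. It assumes credentials have already been stored with `ia configure`; the helper names `build_metadata` and `upload_files` are hypothetical, and the retry values are arbitrary choices of mine, not recommendations from the documentation.

```python
from pathlib import Path


def build_metadata(title, mediatype, collection=None):
    """Build a metadata dict using common archive.org field names."""
    metadata = {"title": title, "mediatype": mediatype}
    if collection:
        metadata["collection"] = collection
    return metadata


def upload_files(identifier, paths, metadata):
    """Upload local files to one item and return the per-file responses."""
    # Fail early on missing files, before contacting archive.org.
    missing = [str(p) for p in paths if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError(f"not found: {', '.join(missing)}")

    # Imported here so the module still loads where the library is not installed.
    from internetarchive import upload

    return upload(
        identifier,
        files=[str(p) for p in paths],
        metadata=metadata,
        retries=5,         # arbitrary: retry when the server asks to slow down
        retries_sleep=30,  # arbitrary: seconds to wait between retries
        checksum=True,     # skip files already in the item with the same checksum
        verbose=True,      # print progress while uploading
    )
```

A call would then look like `upload_files("some-item-identifier", ["some-file.bin"], build_metadata("Some title", "data"))`, where the identifier and file name are placeholders. The `checksum=True` option is what makes re-running an interrupted upload cheap, which is the main advantage over the web UI.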

Installation instructions are at [Wayback/Archive] Installation — Internet Archive Developer Portal.

Be sure to run it in a venv (recent Python versions) or a virtualenv (older Python versions). The documentation only mentions virtualenv and should mention venv as well.
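For recent Python versions, the venv route needs nothing beyond the standard library. The sketch below is the programmatic equivalent of running `python3 -m venv` followed by a `pip install` inside the new environment; `create_env` and `install_command` are hypothetical helper names, and the directory name is up to you.

```python
import sys
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its Python."""
    venv.create(env_dir, with_pip=with_pip)
    # Windows uses Scripts\python.exe, other platforms use bin/python.
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    exe = "python.exe" if sys.platform == "win32" else "python"
    return Path(env_dir) / bin_dir / exe


def install_command(env_python):
    """Return the pip command that installs the library inside the environment."""
    return [str(env_python), "-m", "pip", "install", "internetarchive"]
```

Passing the result of `install_command(create_env("ia-venv"))` to `subprocess.run(..., check=True)` performs the actual installation; afterwards the `ia` command should be available in the environment's `bin` (or `Scripts`) directory.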

Summary of Internet Archive web site issues

The above is just one example of how tough it is to find out how to use the Internet Archive. The problem is far more widespread, but in essence it comes down to:

  • page titles do not match page meaning
  • pages lack context sensitive help
  • links point to outdated information
  • links to relevant information are missing
  • inconsistent formatting across pages

Related blog posts

How To Scale and Crop Images with CSS object-fit | DigitalOcean

--jeroen
