The Wiert Corner – irregular stream of stuff

Jeroen W. Pluimers on .NET, C#, Delphi, databases, and personal interests


Running ArchiveTeam Warrior version 3.2 on ESXi

Posted by jpluimers on 2021/05/05

A while ago I wrote about Helping the WayBack ArchiveTeam team: running their Warrior virtual appliance on ESXi.

Since that post was scheduled before my cancer treatment started and was published while I was still recovering from it, I missed that version 3.2 of the [Wayback] ArchiveTeam Warrior appliance had appeared under [Wayback] Releases · ArchiveTeam/Ubuntu-Warrior as [Wayback] Release v3.2 · ArchiveTeam/Ubuntu-Warrior. You can download it from these places:

These two sites have not yet been updated, so they contain the older versions:

The source code has now lived in three repositories:

  1. [Wayback] ArchiveTeam/warrior-code
  2. [Wayback] ArchiveTeam/warrior-code2 · GitHub
  3. [Wayback] ArchiveTeam/Ubuntu-Warrior at master (this is version 3 and up)

The docker container

The new version of Archive Team Warrior now is basically a shell around [Wayback] Watchtower and the [Wayback] ArchiveTeam/warrior-dockerfile: A Dockerfile for the ArchiveTeam Warrior docker container. This makes updating the core way easier.

More on the docker container (in case you want to run it yourself) is at [Wayback] ArchiveTeam Warrior – Archiveteam – Installing and running with Docker:

You’ll need Docker (open source) and the Warrior Docker image.

  1. Download Docker from the link above and install it.
  2. Open your terminal. On Windows, you can use either Command Prompt (CMD) or PowerShell. On macOS and Linux you can use Terminal (Bash).
  3. Use the following command to start the Warrior as well as Watchtower, which will automatically keep your Warrior updated:
    docker run --detach --name watchtower --restart=on-failure --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --label-enable --cleanup --interval 3600 && docker run --detach --name archiveteam-warrior --label=com.centurylinklabs.watchtower.enable=true --restart=on-failure --publish 8001:8001 atdr.meo.ws/archiveteam/warrior-dockerfile

    (For a full explanation of this command, see items 3 and 4 here.)

  4. Using your regular web browser, visit http://localhost:8001/.
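After running the command from step 3, a quick check that both containers are up can save some guessing. This is a minimal sketch, assuming a POSIX shell with docker and curl on the PATH; the container names watchtower and archiveteam-warrior and port 8001 come from the command above, while the function names are made up for this illustration:

```shell
# Succeeds only if both containers started by the command above are running.
warrior_containers_running() {
    # One running container name per line
    running=$(docker ps --format '{{.Names}}') || return 1
    for name in watchtower archiveteam-warrior; do
        printf '%s\n' "$running" | grep -Fqx "$name" || return 1
    done
}

# Succeeds if the Warrior web UI answers on the published port.
warrior_ui_reachable() {
    curl -fsS -o /dev/null --max-time 5 http://localhost:8001/
}

# Example use:
#   warrior_containers_running && warrior_ui_reachable && echo "Warrior is up"
```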

Update 20220326: The actual code running

The running code is spread over a few repositories, which are explained in [Wayback/Archive] Dev/Source Code – Archiveteam. The underlying project repositories (for the projects you can select in the web UI) are at [Wayback/Archive] Archive Team: GitHub repositories. Search there for repositories ending in -grab: for instance, the project titled YouTube in the web UI has the project name youtube in the page HTML, and its repository on GitHub is youtube-grab.

At the time of this update, sometimes only the urlteam2 project seems to work on the VM; I am not sure why.

Other projects generate an error like this:

File "/usr/local/lib/python3.9/site-packages/seesaw/warrior.py", line 736, in start_selected_project

Searching for that message does not give really helpful results: [Wayback/Archive] File “/usr/local/lib/python3.9/site-packages/seesaw/warrior.py”, line 736, in start_selected_project – Google Search.

The workaround on my systems is to first select urlteam2, then select any of the other projects and wait about an hour. The problem might have to do with incompatibilities between the VM and some projects, as suggested in [Wayback/Archive] Warrior ‘Current Project’ page blank · Issue #102 · ArchiveTeam/seesaw-kit.

The virtual appliance

The Warrior is released as a virtual appliance aimed by default at VirtualBox, with steps for running it on VMware: [Wayback] ArchiveTeam Warrior – Archiveteam.

Totally agreeing with Kristian Köhntopp, I do not understand why people use VirtualBox at all: I just run into too many issues like the ones in [Archive.is] Kristian Köhntopp on Twitter: “Hint: If installing a Linux distro in VirtualBox fails with varying, unknown errors, it helps to try VMware Workstation or kvm instead. In my case, it then worked every single time with the same ISO.” (translated from German).

The .ova file is basically a tar archive containing an OVF directory, as per Open Virtualization Format: Design – Wikipedia:

The entire directory can be distributed as an Open Virtual Appliance (OVA) package, which is a tar archive file with the OVF directory inside.
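Because an .ova is a plain tar archive, a standard tar can inspect and unpack it. A minimal sketch, assuming a POSIX shell with tar on the PATH; the function names ova_list and ova_extract are made up for this illustration:

```shell
# List the members of an .ova package (a plain tar archive).
ova_list() {
    tar -tf "$1"
}

# Extract an .ova package into a target directory, creating it if needed.
ova_extract() {
    mkdir -p "$2" && tar -xf "$1" -C "$2"
}

# Example use (the target directory name is an arbitrary choice):
#   ova_list archiveteam-warrior-v3.2-20210306.ova
#   ova_extract archiveteam-warrior-v3.2-20210306.ova warrior-v3.2
```

The listing should normally show at least an .ovf descriptor and a .vmdk disk image.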

Inspecting the disk image inside the directory taught me that pure one-file binary VMDK disk images start with a signature that reads VMDK as a big-endian value and KDMV as stored on disk in little-endian order (the first four bytes are 4b 44 4d 56). More on the VMDK file format can be found in these links (all via [Wayback] vmdk file format specification – Google Search):

Here are the steps to get the .ova image to run on ESXi. I think they should work for ESXi 5.1 and up, but I have tested them only on ESXi 6.7:
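The signature check on the first four bytes can be sketched in shell. This assumes head, od and tr are available; the function names vmdk_magic_hex and is_binary_vmdk are made up for this illustration:

```shell
# Print the first four bytes of a file as lowercase hex without separators.
vmdk_magic_hex() {
    head -c 4 "$1" | od -An -tx1 | tr -d ' \n'
}

# Succeeds if the file starts with the bytes 4b 44 4d 56 ("KDMV").
is_binary_vmdk() {
    [ "$(vmdk_magic_hex "$1")" = "4b444d56" ]
}

# Example use (the file name is a placeholder):
#   is_binary_vmdk some-disk.vmdk && echo "binary VMDK signature found"
```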

In short, I extracted the .ova file into a directory, copied the .ovf file, modified the copy, tested the modification, then generated a patch using diff (diff -u archiveteam-warrior-v3.2-20210306.ovf archiveteam-warrior-v3.2-20210306.ESXi.ovf > archiveteam-warrior-v3.2-20210306.ovf.patch) and stored it as archiveteam-warrior-v3.2-20210306.ovf.patch in the patch gist (so that the [Wayback] fork connection with the [Wayback] original by Kipari was maintained).
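A patch generated this way can be checked by applying it to the untouched original and comparing the result with the modified file. This is a minimal sketch of such a round trip, not necessarily how the patch above was verified; it assumes the patch and cmp utilities are installed, and the function name patch_reproduces is made up for this illustration:

```shell
# Apply a unified diff to the original file, writing the result to a temporary
# file (the original stays untouched), and succeed only if that result is
# byte-for-byte identical to the expected patched file.
patch_reproduces() {
    orig=$1 patch_file=$2 expected=$3
    out=$(mktemp) || return 2
    if ! patch -o "$out" "$orig" "$patch_file" >/dev/null; then
        rm -f "$out"
        return 2
    fi
    if cmp -s "$out" "$expected"; then status=0; else status=1; fi
    rm -f "$out"
    return "$status"
}

# Example use with the file names from the diff command:
#   patch_reproduces archiveteam-warrior-v3.2-20210306.ovf \
#       archiveteam-warrior-v3.2-20210306.ovf.patch \
#       archiveteam-warrior-v3.2-20210306.ESXi.ovf && echo "patch is consistent"
```

Note that diff itself exits with status 1 when the files differ, so a non-zero exit from the diff -u command is expected and not an error.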

Then I notified the team in a [Wayback] comment on issue 5 that there is a new patch file.

Finally I uploaded the steps and patched .ova, .ovf, .vmdk, sha256 and documentation files to [Wayback] wiert.me / public / ova / archiveteam-warrior-v3.2-20210306.ESXi · GitLab
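Checksums like the sha256 files mentioned above can be produced and verified with sha256sum. A minimal sketch, assuming sha256sum from GNU coreutils (macOS users would need shasum -a 256 instead); the function names and the checksum list name are made up for this illustration and are not the names used in the repository:

```shell
# Record the SHA-256 checksum of each given file in a checksum list.
# Usage: sha256_record LIST FILE...
sha256_record() {
    list=$1
    shift
    sha256sum "$@" > "$list"
}

# Verify files against a checksum list; fails if any file is missing or differs.
# File paths are resolved as written in the list, relative to the current directory.
sha256_verify() {
    sha256sum -c "$1" >/dev/null 2>&1
}

# Example use (hypothetical list name):
#   sha256_record warrior.sha256 *.ova *.ovf *.vmdk
#   sha256_verify warrior.sha256 && echo "all files match"
```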

Patching https://warriorhq.archiveteam.org/downloads/warrior3/archiveteam-warrior-v3.2-20210306.ova for ESXi 5.1 and up

–jeroen

