Running ArchiveTeam Warrior version 3.2 on ESXi
Posted by jpluimers on 2021/05/05
A while ago I wrote about Helping the WayBack ArchiveTeam team: running their Warrior virtual appliance on ESXi.
Since it was scheduled before my cancer treatment started and got posted while I was still recovering from it, I missed that version 3.2 of the [Wayback] ArchiveTeam Warrior appliance had appeared in the [Wayback] Releases · ArchiveTeam/Ubuntu-Warrior at [Wayback] Release v3.2 · ArchiveTeam/Ubuntu-Warrior. You can download it from these places:
These two sites have not yet been updated, so they contain the older versions:
- [Wayback] Index of /downloads/warrior3/
- [Archive.is] Internet Archive Search: title:(archiveteam-warrior)
The source code now has been moved three times:
- [Wayback] ArchiveTeam/warrior-code
- [Wayback] ArchiveTeam/warrior-code2 · GitHub
- [Wayback] ArchiveTeam/Ubuntu-Warrior at master (this is version 3 and up)
The docker container
The new version of Archive Team Warrior is now basically a shell around [Wayback] Watchtower and the [Wayback] ArchiveTeam/warrior-dockerfile: A Dockerfile for the ArchiveTeam Warrior docker container. This makes updating the core much easier.
More on the docker container (in case you want to run it yourself) is at [Wayback] ArchiveTeam Warrior – Archiveteam – Installing and running with Docker:
You’ll need Docker (open source) and the Warrior Docker image.
- Download Docker from the link above and install it.
- Open your terminal. On Windows, you can use either Command Prompt (CMD) or PowerShell. On macOS and Linux you can use Terminal (Bash).
- Use the following command to start the Warrior as well as Watchtower, which will automatically keep your Warrior updated:
docker run --detach --name watchtower --restart=on-failure --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --label-enable --cleanup --interval 3600 && docker run --detach --name archiveteam-warrior --label=com.centurylinklabs.watchtower.enable=true --restart=on-failure --publish 8001:8001 atdr.meo.ws/archiveteam/warrior-dockerfile
(For a full explanation of this command, see items 3 and 4 here.)
- Using your regular web browser, visit http://localhost:8001/.
Update 20220326: The actual code running
The running code is spread over a few repositories, which are explained in [Wayback/Archive] Dev/Source Code – Archiveteam. The underlying project repositories (for the projects you can select in the web UI) are at [Wayback/Archive] Archive Team: GitHub repositories. Search there for repositories ending in -grab: for instance, when you look in the web UI HTML for the project titled YouTube, you find the project named youtube, which is youtube-grab on GitHub.
At the time of this update, sometimes only the urlteam2 project seems to work on the VM; I am not sure why.
Other projects generate an error like this:
File "/usr/local/lib/python3.9/site-packages/seesaw/warrior.py", line 736, in start_selected_project
Searching for it does not give really helpful results: [Wayback/Archive] File “/usr/local/lib/python3.9/site-packages/seesaw/warrior.py”, line 736, in start_selected_project – Google Search.
The workaround on my systems is to first select urlteam2, then any of the other projects, and then wait for about an hour. It might have to do with incompatibilities between the VM and some projects, as suggested in [Wayback/Archive] Warrior ‘Current Project’ page blank · Issue #102 · ArchiveTeam/seesaw-kit.
The virtual appliance
The virtual appliance is released as an image aimed by default at VirtualBox, with additional steps to run it with VMware: [Wayback] ArchiveTeam Warrior – Archiveteam.
Totally agreeing with Kristian Köhntopp, I do not understand why people use VirtualBox at all: I just run into too many issues like [Archive.is] Kristian Köhntopp on Twitter: “Hint: If installing a Linux distro in VirtualBox fails with varying, unknown errors, it helps to try VMware Workstation or kvm instead. In my case it then worked every single time with the same ISO.” (translated from German).
Inspecting the .ova file shows that it is basically a tar archive consisting of an OVF directory, as per Open Virtualization Format: Design – Wikipedia:
The entire directory can be distributed as an Open Virtual Appliance (OVA) package, which is a tar archive file with the OVF directory inside.
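Since an OVA is just a tar archive, plain tar is enough to look inside one. A minimal sketch; the helper names list_ova and unpack_ova are hypothetical and only there for illustration:

```shell
# Hypothetical helpers; the names are illustrative, not part of any tool.

# List the members of an OVA without unpacking it.
# $1 = path to the .ova file
list_ova() {
    tar -tf "$1"
}

# Unpack an OVA into a directory, creating the directory when needed.
# $1 = path to the .ova file, $2 = target directory
unpack_ova() {
    mkdir -p "$2" && tar -xf "$1" -C "$2"
}
```

For example: list_ova archiveteam-warrior-v3.2-20210306.ova, followed by unpack_ova archiveteam-warrior-v3.2-20210306.ova warrior-3.2 (the directory name warrior-3.2 is an assumption). An OVA usually holds an .ovf descriptor plus one or more .vmdk disk images, sometimes with a .mf manifest.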
Inspecting the disk image inside the directory taught me that pure one-file binary VMDK disk images start with the four bytes 4b 44 4d 56, which is KDMV in ASCII. Read as a little-endian 32-bit value this is 0x564d444b, which spells VMDK. More on the VMDK file format can be found in these links (all via [Wayback] vmdk file format specification – Google Search):
- [Wayback] 0001266: A VMDK file restored using localvmdk=yes is not recognized as a virtual disk. – Bareos Bug Tracker
- [Wayback] vmware – VMDK descriptor file format – Stack Overflow
- [Wayback] www.vmware.com/support/developer/vddk/vmdk_50_technote.pdf
- [Wayback: VMware Virtual Disks: Virtual Disk Format 1.1]
- [Wayback] libvmdk/VMWare Virtual Disk Format (VMDK).asciidoc at main · libyal/libvmdk
- [Wayback] VMDK-Handbook-Basics
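The check on the first four bytes (4b 44 4d 56) mentioned above can be sketched with od, which is available on most Unix-like systems. The helper names first4_hex and is_binary_vmdk are hypothetical:

```shell
# Hypothetical helpers; the names are illustrative, not part of any tool.

# Print the first four bytes of a file as lowercase hex (e.g. 4b444d56).
# $1 = path to the file
first4_hex() {
    od -An -tx1 -N4 "$1" | tr -d '[:space:]'
}

# Succeed when the file starts with 4b 44 4d 56 (ASCII "KDMV").
# $1 = path to the .vmdk file
is_binary_vmdk() {
    [ "$(first4_hex "$1")" = "4b444d56" ]
}
```

For example: is_binary_vmdk disk.vmdk && echo "binary VMDK", where disk.vmdk is a placeholder for the disk image extracted from the .ova. A text-only VMDK descriptor file would fail this check, since it starts with readable text instead of the binary signature.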
So here are some steps to get the .ova image to run on ESXi. I think it should work for ESXi 5.1 and up, but I have tested only on ESXi 6.7:
So what I did was (in short): extract the .ova file into a directory, copy the .ovf file, modify the copy and test the modification, then generate a patch using diff (diff -u archiveteam-warrior-v3.2-20210306.ovf archiveteam-warrior-v3.2-20210306.ESXi.ovf > archiveteam-warrior-v3.2-20210306.ovf.patch) and store it in the patch gist (so the [Wayback] fork connection with the [Wayback] original by Kipari was maintained) as archiveteam-warrior-v3.2-20210306.ovf.patch.
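The diff step, and the reverse step of applying the resulting patch to a freshly extracted .ovf, can be sketched as below. The helper names make_ovf_patch and apply_ovf_patch are hypothetical; the second one assumes the patch utility is installed:

```shell
# Hypothetical helpers; the names are illustrative, not part of any tool.

# Write a unified diff between two .ovf files to a patch file.
# diff exits with 1 when the files differ, which is the expected case
# here, so only an exit status above 1 (a real error) counts as failure.
# $1 = original .ovf, $2 = modified .ovf, $3 = patch file to write
make_ovf_patch() {
    diff -u "$1" "$2" > "$3" || [ "$?" -eq 1 ]
}

# Apply such a patch to the original and write the result to a new file,
# leaving the original untouched.
# $1 = original .ovf, $2 = patch file, $3 = .ovf file to write
apply_ovf_patch() {
    patch -o "$3" -i "$2" "$1"
}
```

For example: apply_ovf_patch archiveteam-warrior-v3.2-20210306.ovf archiveteam-warrior-v3.2-20210306.ovf.patch archiveteam-warrior-v3.2-20210306.ESXi.ovf. Writing to a separate output file keeps the original .ovf around for comparison.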
Then I notified the team in a [Wayback] comment on issue 5 that there is a new patch file.
Finally I uploaded the steps and patched .ova, .ovf, .vmdk, sha256 and documentation files to [Wayback] wiert.me / public / ova / archiveteam-warrior-v3.2-20210306.ESXi · GitLab
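Checksum files like these can be written and verified along the following lines. This is a sketch assuming GNU coreutils sha256sum and one <file>.sha256 per file; the actual layout of the uploaded sha256 files may differ, and the helper names are hypothetical:

```shell
# Hypothetical helpers; the names are illustrative, not part of any tool.
# Assumes GNU coreutils sha256sum is available.

# Write the SHA-256 checksum of a file next to it as <file>.sha256.
# $1 = path to the file
write_sha256() {
    sha256sum "$1" > "$1.sha256"
}

# Verify a file against its <file>.sha256. Run this from the directory
# used when the checksum was written, since the stored path is reused as-is.
# $1 = path to the file
check_sha256() {
    sha256sum -c "$1.sha256"
}
```

For example: write_sha256 on the patched .ova before uploading, and check_sha256 on the same file name after downloading.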
Patching https://warriorhq.archiveteam.org/downloads/warrior3/archiveteam-warrior-v3.2-20210306.ova for ESXi 5.1 and up
–jeroen