Tooting in a container
Sun Jul 7 '19
I run an instance of Mastodon on a Fedora host. Building Mastodon is a bit of a pain, so I’ve made it worse by trying to build & run it under an Alpine Linux container with systemd-nspawn.
But, before I poorly document what I did to get it to work, I have a rant about what I had on my mind while I was doing this to myself.
Rant
A decade ago, I had a different attitude toward my machines than I do now. I was comfortable installing whatever packages or programs to build some one-off thing or run a service that I would use for a month or so until something broke or I got bored. None of it really mattered for long.
For the most part, I didn’t deal with the sort of problems that you solve if you want to compile or distribute anything for someone else to run. When writing software for myself, or for a single configuration, not a lot of thought goes into organization around source, build, runtime, configuration, and data. It all kinda blends into one thing–maybe more so with interpreted languages. In my personal experience, for what that’s worth, this is a common attitude. Especially in web development.[1]
For a time, it was (and may still be?) quite popular to include minified/bundled JavaScript or compiled/transpiled TypeScript/CoffeeScript with the source code in the VCS for software projects in those languages. Amazon scans public git repositories on GitHub for AWS secrets because people version control that information alongside source code.
Treating your sources, build artefacts, and deployment secrets as separate things with different objectives in space & time is more complicated (more work) compared to treating them all the same. I’ve worked with people who, for that reason, don’t believe that they should think about these things.[2]
At a new job, I found we had a website in Node.js that served static files (among other things) over HTTP. They also used git to push updates with the sources. I discovered you could git clone https://... the website to copy the entire thing and its history without any authentication.
The culture improvised and came up with its own designs, and I remember people struggling to use “task runners”, like Bower or Grunt, as glorified shell scripts, with flaky parallelism & incremental rebuilds thrown on after the fact, while build systems, even those as simple as ninja, fit the purpose well. And I’d be disregarded when advocating for these programs on the basis that they’re “antiquated”, as well as when advocating for Go for the opposite reason.
I can’t shake the idea that Docker is a similar thing, in that it allowed developers to go from running their Node.js services as root in screen, without supervision, to running them as root in Docker supervised by Kubernetes or something cloud like that. And maybe that’s a good thing. Maybe screen was the best they could ever do. And Docker is preferable to that, and cool enough that people are motivated to use it.
At some point, did we miss the part where sometimes developers and operators have different relationships to software? Or that maybe a program shouldn’t have the same relationship to its binaries as to its state/data? I don’t know. Maybe I’m just afraid of change.
But I wonder if the momentum of modern software development has been set by people who, like me, were oblivious to software life-cycle doctrines in other ecosystems. Like how init systems, package management, and build systems were used to solve problems.
As someone who doesn’t need their containers to auto-scale, running a service with Docker has yet to be easier than running a service by installing a package. Techniques to decouple applications from their hosts, with things like static linking or containers, seem like the responsibility of system administration. Not something that developers should generally impose.
Also, I don’t hate containers. I think it’s cool that one can copy a container to another system and run it there. (Sort of like building packages on one machine and installing them on another.) I believe namespaces and isolation are useful and can be used to solve problems. I like that I can make an OpenSUSE container on my Fedora host to build RPMs for an OpenSUSE system without a chroot or a VM or whatever. Because, man, do I hate VMs.
The Container
This was done when Mastodon was at version 2.9.2.
Mastodon’s installation instructions tell you to install Node.js and npm and Yarn and stuff in order to build and run its three services, mastodon-sidekiq, mastodon-web, mastodon-streaming.
It seems odd to me, but it looks like they have you build a specific version of Ruby specially for Mastodon, instead of using the system Ruby.[3]
This special Ruby is still linked against your system libraries, though. So when you upgrade system packages and they are no longer ABI compatible, Mastodon breaks. So I’m not entirely sure what the point of that is.
Alternatively, they provide some instructions on how you can use docker-compose to run Mastodon and its dependent services, Redis & PostgreSQL. Now, I want to run those two services on my host. But I can’t even figure out how to get Docker to work with my init system properly, so I definitely don’t know how to convey dependencies between docker-compose containers and services running on my host. And figuring that out isn’t as interesting as all this.
I did these steps on a more powerful machine than the server that Mastodon will eventually run on.
First, I grabbed and unpacked the desired Alpine Linux rootfs (see their downloads page) with machinectl. This unpacked the container’s rootfs at /var/lib/machines/mrtooty.
machinectl pull-tar --verify=checksum \
http://dl-cdn.alpinelinux.org/alpine/v3.10/releases/x86_64/alpine-minirootfs-3.10.0-x86_64.tar.gz \
mrtooty
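As a quick sanity check (not part of the original steps), machinectl can list what ended up under /var/lib/machines:

machinectl list-images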
We could run systemd-nspawn -M mrtooty to get a shell, but if we do, we’ll get some complaints about our tty, and job control won’t work. I don’t know why. Thankfully, this helpful comment on GitHub tells us what to do about it. So we can use the following to run the ash shell in our container.
systemd-nspawn -M mrtooty -- /sbin/getty -nl /bin/ash 0 /dev/console
I prefer the fish shell to ash, so at this point I installed fish with apk add fish and started using fish. But you can do whatever you want.
Some of the instructions for Mastodon are specific to Ubuntu, but basically they tell you to install Node.js, npm, Yarn, and a bunch of packages with development headers so you can build Mastodon or Ruby or something. This is what I ended up installing in order to build Ruby and run things.
apk add imagemagick ffmpeg libpq postgresql-dev libxml2 \
libxslt file git g++ protobuf protobuf-dev pkgconf nodejs \
npm gcc autoconf bison yaml-dev readline-dev zlib-dev \
ncurses-dev libffi gdbm gdbm-dev yarn libidn-dev icu-dev \
openssl-dev bash make linux-headers gcompat su-exec
Two important additions above are:

- gcompat, which provides /lib/ld-linux-x86-64.so.2. Otherwise the mastodon-streaming service will fail to start, because the library is required by node_modules/@clusterws/cws/dist/cws_linux_64.node under the Mastodon sources. (A quick way to check this is shown below.)
- su-exec, which I use in the systemd service files to switch users.
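A minimal way to verify the gcompat bit, from a shell inside the container: the glibc loader path should exist once the package is installed.

# Missing before gcompat is installed; should resolve afterwards.
ls -l /lib/ld-linux-x86-64.so.2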
At this point, I followed their instructions pretty closely. I made a user for the Mastodon services and started bash.
apk add shadow
useradd mastodon --system --create-home
su -s /bin/bash - mastodon
The rest of the instructions for installing Ruby should pretty much work, except I didn’t build with jemalloc because it’s not available in Alpine Linux; it’s not minimalist enough for musl or something.
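For reference, this is roughly what those Ruby steps amount to without the jemalloc configure flag. It’s a sketch from the upstream instructions of that era, run as the mastodon user; the exact Ruby version comes from Mastodon’s .ruby-version file (2.6.x around v2.9.2).

# Install rbenv and ruby-build, then build Ruby without
# the RUBY_CONFIGURE_OPTS=--with-jemalloc the docs suggest.
git clone https://github.com/rbenv/rbenv.git ~/.rbenv
cd ~/.rbenv && src/configure && make -C src
echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(rbenv init -)"' >> ~/.bashrc
exec bash
git clone https://github.com/rbenv/ruby-build.git ~/.rbenv/plugins/ruby-build
rbenv install 2.6.1   # version from Mastodon's .ruby-version at the time
rbenv global 2.6.1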
After that, the instructions go on to talk about PostgreSQL which I’ll skip since I’m not covering that here.
The “Setting up Mastodon” section, which checks out the Mastodon source code into ~/live (still as the mastodon user) and runs bundle and yarn, is important and should just work.
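In case that section moves around, this is the gist of it as I remember it for v2.9.2 (a sketch, not a verbatim copy of their docs):

# Still as the mastodon user, inside the container.
git clone https://github.com/tootsuite/mastodon.git ~/live
cd ~/live && git checkout v2.9.2
gem install bundler
bundle install --deployment --without development test
yarn install --pure-lockfile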
We can quit out of our shell to stop the container. At this point, I’m moving the container over to the host that it will run on.
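I’m not covering exactly how to move it; any way of copying /var/lib/machines/mrtooty to the other host works. Something like machinectl’s export-tar/import-tar would do it:

# On the build machine: pack up the rootfs.
machinectl export-tar mrtooty mrtooty.tar.xz
# Copy the tarball over, then on the server:
machinectl import-tar mrtooty.tar.xz mrtooty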
The Toot
This is where things get really stupid. But by the time I realized my mistake, so many compromises had been made that it was too late to turn back.
The trick is to run the three mastodon services (mastodon-sidekiq, mastodon-web, mastodon-streaming) on a single rootfs.
If you try to start an nspawn machine that is already running, it will tell you that the rootfs is busy. And apparently this is a good thing for a good reason.
We can scoot by this since Mastodon doesn’t need to modify the rootfs while our services are running. It just needs to modify its home directory to some degree, at least for user-uploaded attachments.
So here I’ve taken /home/mastodon from the container out of the rootfs directory tree and put it at /home/mastodon on the host. Now I run systemd-nspawn with a few extra options, which are documented online:
- --register no; Keeps the container from registering with machinectl. The machinectl commands won’t really work to control these services anyway, since they require the containers to run systemd & dbus, and they aren’t.
- --bind /var/run/postgresql; I connect to PostgreSQL over a UNIX domain socket, so this makes that work. I’m pretty sure the UID in the container must match a UID on the host that is authenticated with PostgreSQL for this to work. (Install postgresql-client and run psql --host /var/run/postgresql -d mastodon_production in the container to test it out.)
- --bind /home/mastodon; This will mount our install of Mastodon, readable and writable.
- --keep-unit; Prevents systemd from creating a new scope for the container. Without this, only the first service will run and subsequent services will fail to create a mrtooty scope. I don’t really know what scopes are all about, so hopefully this option isn’t bad.
- --read-only; Sets the rootfs up read-only so that we can start three services, each in its own container, sharing the same rootfs.
My service files follow. But I’m pretty sure they don’t work right. Like, I bet the pid for each service is for systemd-nspawn, so ExecReload sends SIGUSR1 to nspawn and just takes it out.
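If you want to check whether that’s actually the case, systemd will tell you which pid it considers the main one; I’d expect it to be the systemd-nspawn process rather than puma:

# Ask systemd which pid ExecReload's $MAINPID expands to.
systemctl show --property MainPID mastodon-web.service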
mastodon-sidekiq.service
[Unit]
After=network.target
[Service]
Type=notify
ExecStart=systemd-nspawn -M mrtooty \
--register no \
--bind /var/run/postgresql \
--bind /home/mastodon \
--keep-unit \
--read-only \
-- /sbin/su-exec mastodon \
/usr/bin/env DB_POOL=5 RAILS_ENV=production \
bash -c 'cd /home/mastodon/live && /home/mastodon/.rbenv/shims/bundle exec sidekiq -c 5 -q default -q mailers -q pull -q push'
TimeoutSec=45
Restart=always
[Install]
WantedBy=multi-user.target
mastodon-streaming.service
[Unit]
After=network.target
[Service]
Type=notify
ExecStart=systemd-nspawn -M mrtooty \
--register no \
--bind /var/run/postgresql \
--bind /home/mastodon \
--keep-unit \
--read-only \
-- /sbin/su-exec mastodon \
/usr/bin/env PORT=4000 NODE_ENV=production \
bash -c 'cd /home/mastodon/live && /usr/bin/npm run start'
TimeoutSec=45
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
mastodon-web.service
[Unit]
After=network.target
[Service]
Type=notify
ExecStart=systemd-nspawn -M mrtooty \
--register no \
--bind /var/run/postgresql \
--bind /home/mastodon \
--keep-unit \
--read-only \
-- /sbin/su-exec mastodon \
/usr/bin/env PORT=3000 RAILS_ENV=production \
bash -c 'cd /home/mastodon/live && /home/mastodon/.rbenv/shims/bundle exec puma -C config/puma.rb'
ExecReload=/bin/kill -SIGUSR1 $MAINPID
TimeoutSec=45
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
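With the three unit files dropped into /etc/systemd/system on the host (an assumption about where you keep them), starting everything is the usual routine:

systemctl daemon-reload
systemctl enable --now mastodon-sidekiq mastodon-streaming mastodon-web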
Anyway, don’t do this. Like most things I do in life, I realized I was probably making a mistake but it was too late to turn back. This is a dumb hobby.