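<p><a class="reference external" href="https://froghat.ca/">froghat.ca</a>, updated 2024-02-10T02:21:41.981005+00:00, generated by python-feedgen</p>
<h1>Continuous Deployment to Cloudflare Pages from a SourceHut repository</h1>
<p>sqwishy, 2023-11-21T12:00:00-08:00, <a class="reference external" href="https://froghat.ca/2023/11/sourcehut-cloudflare-pages">https://froghat.ca/2023/11/sourcehut-cloudflare-pages</a></p>
<p>I made some JavaScript heavy webshit for viewing crafting recipes for the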
computer game <a class="reference external" href="https://barotraumagame.com/">Barotrauma</a>. I wanted to host the static website somewhere that
wasn’t my own infrastructure; but I’m also very interested in not spending more
money than I need to right now.</p>
<p>I won’t over-excite you with every detail of every decision I made. Here’s a summary:</p>
<ul class="simple">
<li><p>GitHub has gotten really annoying to use in the last year. Code search
requires login; code viewer bloated and crummy on mobile; asking for an OTP
every other time I click a button; popups to review/update/confirm personal
information every other time I visit the site.</p></li>
<li><p>Building my project requires scanning game assets. Those assets aren’t public<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a> so I don’t want to copy them all over the internet. I want to
build and publish the site on my own infrastructure.</p></li>
<li><p>SourceHut Pages is actually great.</p></li>
</ul>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>Much of the content I need is available in the game’s dedicated server
files, which are distributed on Steam without requiring you to buy the
full game. Unfortunately, those files were missing visual things I use,
like item icons and sprites.</p>
</aside>
</aside>
<aside class="aside">
<p>One day I might host my own projects but I looked into this for two seconds and
saw the drama between Gitea and Forgejo and completely lost interest.</p>
</aside>
<section id="sourcehut-pages">
<h2>SourceHut Pages<a class="self-link" title="link to this section" href="#sourcehut-pages"></a></h2>
<p>In my opinion, publishing to <a class="reference external" href="https://srht.site/">SourceHut Pages</a> is very easy.</p>
<pre class="code bash full-width literal-block"><code>tar<span class="w"> </span>zc<span class="w"> </span>index.html<span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>curl<span class="w"> </span>--oauth2-bearer<span class="w"> </span><span class="nv">$srhttoken</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>-Fcontent<span class="o">=</span>@/dev/stdin<span class="w"> </span><span class="se">\
</span><span class="w"> </span>https://pages.sr.ht/publish/sqwishy.srht.site</code></pre>
<p>I don’t have to install anything unscrupulous with npm. It uses existing things I’m familiar with.</p>
<p>One irritation about SourceHut in general is that they have two
different things called Personal Access Tokens. One at <a class="reference external" href="https://meta.sr.ht/oauth">https://meta.sr.ht/oauth</a>;
another at <a class="reference external" href="https://meta.sr.ht/oauth2">https://meta.sr.ht/oauth2</a>. The former is <em>“legacy”</em>. But the
non-legacy API <em>still</em> lacks features that exist in the legacy API<a class="footnote-reference superscript" href="#footnote-2" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a>.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-2" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-2">2</a><span class="fn-bracket">]</span></span>
<p>Accessing or creating repository webhooks, for which there are no pages or
forms on git.sr.ht, requires the legacy API and its tokens, while
SourceHut Pages uses non-legacy tokens.</p>
</aside>
</aside>
<p>They use different
credentials but both are called Personal Access Tokens.
They aren’t interchangeable and if you use the wrong Personal Access Token, you
don’t always get a clear error back.</p>
<hr class="docutils" />
<p>Another important consideration is that this little webshit I made is super useful and
everyone is going to love it for sure and it’s going to get loads of visits.
Probably top ten websites on the internet.</p>
<p>But the data and images are about 500 kB with compression. So, should I
put it on SourceHut Pages if the traffic is absolutely going to cause service
degradation for other users? I asked on IRC and the only reply was:</p>
<blockquote>
<p>If you have to ask about usage limits, that’s generally a good indicator that you might want something different.</p>
<footer class="attribution">—someone on #sr.ht</footer>
</blockquote>
<p>Though, that was exactly the feeling I had that compelled me to ask the question to
begin with. SourceHut Pages is free – and not free as in email.
Isn’t <em>asking</em> about my obligations part of being a responsible user of the service?
Or is it evidence that I don’t belong? <span class="nowrap">¯\_(ツ)_/¯</span></p>
<p>I didn’t say anything to them,
but it was like I had just walked into a restaurant and been handed a menu
with no prices. And, when asking about what the food costs, the server,
almost looking off and discreetly screening themselves from
second-hand embarrassment, peers at me with judgement and explains that, if I
have to ask, then I’m at the wrong restaurant. And after a moment of silence for
my dignity, politely moves on as if, for my sake, I had never asked such a
fucking stupid question.</p>
</section><section id="cloudflare-pages">
<h2>Cloudflare Pages<a class="self-link" title="link to this section" href="#cloudflare-pages"></a></h2>
<p>To upload static assets to Cloudflare Pages, they advise using a JavaScript program
called <code>wrangler</code> (or manually upload zip files to a web form). It’s one of those
programs that vomits emojis in every line of output to keep you engaged.
It seems very engineered and I couldn’t find a simpler way using
curl so I conceded I’d run wrangler in a container and avoid letting it near
anything with an emoji allergy.</p>
</section><section id="sourcehut-webhooks">
<h2>SourceHut Webhooks<a class="self-link" title="link to this section" href="#sourcehut-webhooks"></a></h2>
<p>I’m hosting the source code for my webshit on git.sr.ht, which has
webhooks. The documentation for this is a little bit wild<a class="footnote-reference superscript" href="#footnote-3" id="footnote-reference-3" role="doc-noteref"><span class="fn-bracket">[</span>3<span class="fn-bracket">]</span></a>; there are two
main resources:</p>
<ul class="simple">
<li><p><a class="reference external" href="https://man.sr.ht/api-conventions.md#webhooks">Information about webhooks across SourceHut services generally.</a></p></li>
<li><p><a class="reference external" href="https://man.sr.ht/git.sr.ht/api.md#repoupdate">Information about the webhooks for git.sr.ht specifically.</a></p></li>
</ul>
<aside class="footnote-list superscript">
<aside class="footnote superscript" id="footnote-3" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-3">3</a><span class="fn-bracket">]</span></span>
<p>One of the things they trip you up on is when they mention a username in
the endpoints. It <em>is in fact</em> your <em>canonical name</em> (your username with a
<code>~</code> in front). It is <em>not</em> the username you log in with <em>nor</em> the
value of the <code>name</code> field returned by https://meta.sr.ht/api/user/profile.</p>
</aside>
</aside>
<p>At each respective page linked above they describe these URL patterns:</p>
<ul class="simple">
<li><p><code>/api/.../webhooks</code></p></li>
<li><p><code>/api/:username/repos/:name/...</code></p></li>
</ul>
<p>Put them together and prefix your username with <code>~</code>. For example, this creates a
webhook for my repository named europan-materialist:</p>
<pre class="code shell full-width literal-block"><code>jo<span class="w"> </span><span class="nv">url</span><span class="o">=</span>https://butt.froghat.ca/europan-materialist<span class="w"> </span><span class="se">\
</span><span class="w"> </span>events<span class="o">[]=</span>repo:post-update<span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>curl<span class="w"> </span>--oauth2-bearer<span class="w"> </span><span class="nv">$srhttoken</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>--json<span class="w"> </span>@-<span class="w"> </span><span class="se">\
</span><span class="w"> </span>https://git.sr.ht/api/~sqwishy/repos/europan-materialist/webhooks</code></pre>
<p>On success, the request’s response contains an object representing the webhook you created.</p>
<p>Or we can query the list of webhooks with:</p>
<pre class="code shell full-width literal-block"><code>$<span class="w"> </span>curl<span class="w"> </span>--oauth2-bearer<span class="w"> </span><span class="nv">$srhttoken</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>https://git.sr.ht/api/~sqwishy/repos/europan-materialist/webhooks</code></pre>
<p>… which replies …</p>
<pre class="code json full-width literal-block"><code><span class="p">{</span><span class="w">
</span><span class="nt">"next"</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span><span class="w">
</span><span class="nt">"results"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nt">"id"</span><span class="p">:</span><span class="w"> </span><span class="mi">28961</span><span class="p">,</span><span class="w">
</span><span class="nt">"created"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2023-11-17T13:50:06+00:00"</span><span class="p">,</span><span class="w">
</span><span class="nt">"events"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="s2">"repo:post-update"</span><span class="w">
</span><span class="p">],</span><span class="w">
</span><span class="nt">"url"</span><span class="p">:</span><span class="w"> </span><span class="s2">"https://butt.froghat.ca/europan-materialist"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">],</span><span class="w">
</span><span class="nt">"total"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w">
</span><span class="nt">"results_per_page"</span><span class="p">:</span><span class="w"> </span><span class="mi">50</span><span class="w">
</span><span class="p">}</span></code></pre>
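<p>The same URL assembly and reply handling can be sketched in Python. This is only an illustration: the helper names <code>canonical_name</code>, <code>webhook_endpoint</code> and <code>subscribed_urls</code> are made up for this example, and the reply is the one shown above pasted in as a string rather than fetched from git.sr.ht.</p>

```python
import json


def canonical_name(username: str) -> str:
    """Return the canonical name: the username with a leading '~'."""
    return username if username.startswith("~") else "~" + username


def webhook_endpoint(username: str, repo: str) -> str:
    """Join /api/:username/repos/:name/... with /api/.../webhooks."""
    return f"https://git.sr.ht/api/{canonical_name(username)}/repos/{repo}/webhooks"


def subscribed_urls(listing: dict, event: str) -> list[str]:
    """Pick the delivery URLs subscribed to `event` out of a listing reply."""
    return [hook["url"] for hook in listing["results"] if event in hook["events"]]


# The listing reply from above, pasted in instead of requested over HTTP.
reply = json.loads("""
{"next": null,
 "results": [{"id": 28961,
              "created": "2023-11-17T13:50:06+00:00",
              "events": ["repo:post-update"],
              "url": "https://butt.froghat.ca/europan-materialist"}],
 "total": 1,
 "results_per_page": 50}
""")

print(webhook_endpoint("sqwishy", "europan-materialist"))
print(subscribed_urls(reply, "repo:post-update"))
```

<p>Accepting the name with or without the <code>~</code> is a design choice for the sketch; the API itself only takes the canonical form.</p>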
</section><section id="butt-froghat-ca">
<h2>butt.froghat.ca<a class="self-link" title="link to this section" href="#butt-froghat-ca"></a></h2>
<p>I’m using git.sr.ht to host my project but I want to mirror it to GitHub for
the social credit. So the webhooks at SourceHut POST to a Python script in
my <em>“infrastructure”</em> that does two things.</p>
<ul class="simple">
<li><p>Mirror the repository to GitHub.</p></li>
<li><p>Build and upload the site to Cloudflare Pages.</p></li>
</ul>
<p>This is broken down into three oneshot <span class="hl-purple">services</span> in systemd that the script
activates through a <span class="hl-yellow">target</span> in systemd.</p>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="systemd-chart-dm.svg">
<img alt="systemd-chart.svg" src="systemd-chart.svg" />
</picture>
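<p>The script’s half of this, starting a systemd target when a POST arrives, can be sketched with the standard library. This is not the script described here: as an assumption it shells out to <code>systemctl</code> instead of talking to dbus, it ignores the payload, and it does not authenticate the request. The port and the function names are hypothetical.</p>

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

TARGET = "europan-materialist-cd.target"


def start_target() -> None:
    # Assumption: shell out to systemctl rather than use dbus.
    # --no-block queues the start job and returns without waiting for it.
    subprocess.run(
        ["systemctl", "--user", "start", "--no-block", TARGET], check=True
    )


def make_handler(activate=start_target):
    """Build a request handler that calls `activate` on every POST."""

    class WebhookHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Drain the request body; this sketch does not inspect it.
            length = int(self.headers.get("Content-Length", 0))
            self.rfile.read(length)
            try:
                activate()
            except Exception:
                self.send_response(500)
            else:
                self.send_response(204)
            self.end_headers()

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return WebhookHandler


def serve(port: int = 8080) -> None:
    # Hypothetical port; bind to localhost and put a reverse proxy in front.
    HTTPServer(("127.0.0.1", port), make_handler()).serve_forever()
```

<p>Calling <code>serve()</code> runs it. Passing the activation function in makes it possible to exercise the handler without systemd around.</p>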
<p><span class="hl-purple">europan-materialist-fetch.service</span> updates the remote tracking
branches in a bare/mirror git repository. It is <code>RequiredBy</code> and <code>Before</code> both the mirror and deploy steps.</p>
<p><span class="hl-purple">europan-materialist-mirror.service</span> pushes that bare/mirror repository to GitHub.</p>
<p><span class="hl-purple">europan-materialist-deploy.service</span> updates its own checkout of the local mirror and builds and runs a container in podman that deploys to Cloudflare Pages.</p>
<p>The <span class="hl-yellow">europan-materialist-cd.target</span> is activated over dbus when the Python
script at butt.froghat.ca gets POSTed a webhook. This target <code>Requires</code> both
mirror and deploy steps so they are started when the target is activated.
And since the mirror and deploy steps have <code>Requires</code> and <code>After</code> to the fetch
step, systemd’s ordering ensures the local mirror at <code>/mirror</code> is updated successfully
before the steps that need it are run in parallel.</p>
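<p>As a toy model of that ordering (not how systemd works internally), the <code>After</code> relationships can be fed to Python’s <code>graphlib</code> to see which units are allowed to run at the same time. The dictionary below is written by hand from the description above.</p>

```python
from graphlib import TopologicalSorter

# unit -> units it is ordered After (and Requires)
after = {
    "europan-materialist-fetch.service": set(),
    "europan-materialist-mirror.service": {"europan-materialist-fetch.service"},
    "europan-materialist-deploy.service": {"europan-materialist-fetch.service"},
}

sorter = TopologicalSorter(after)
sorter.prepare()

waves = []
while sorter.is_active():
    # Everything returned together has no unmet ordering dependency,
    # so it could be started in parallel.
    ready = sorted(sorter.get_ready())
    waves.append(ready)
    sorter.done(*ready)

for wave in waves:
    print(wave)
```

<p>The first wave is the fetch step alone; the second holds the deploy and mirror steps together.</p>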
<p>These are files.</p>
<pre class="code ini full-width literal-block"><code><span class="c1"># ~/.config/systemd/user/europan-materialist-cd.target</span><span class="w">
</span><span class="k">[Unit]</span><span class="w">
</span><span class="na">Requires</span><span class="o">=</span><span class="s">europan-materialist-deploy.service</span><span class="w">
</span><span class="na">Requires</span><span class="o">=</span><span class="s">europan-materialist-mirror.service</span><span class="w">
</span><span class="c1"># ~/.config/systemd/user/europan-materialist-deploy.service</span><span class="w">
</span><span class="k">[Unit]</span><span class="w">
</span><span class="na">Requires</span><span class="o">=</span><span class="s">europan-materialist-fetch.service</span><span class="w">
</span><span class="na">After</span><span class="o">=</span><span class="s">europan-materialist-fetch.service</span><span class="w">
</span><span class="k">[Service]</span><span class="w">
</span><span class="na">Type</span><span class="o">=</span><span class="s">oneshot</span><span class="w">
</span><span class="na">WorkingDirectory</span><span class="o">=</span><span class="s">%h/europan-materialist</span><span class="w">
</span><span class="na">ExecStartPre</span><span class="o">=</span><span class="s">/usr/bin/git fetch --prune</span><span class="w">
</span><span class="na">ExecStartPre</span><span class="o">=</span><span class="s">/usr/bin/git checkout FETCH_HEAD</span><span class="w">
</span><span class="na">ExecStartPre</span><span class="o">=</span><span class="s">/usr/bin/podman build -f Containerfile </span>\<span class="w">
</span><span class="s">-v %h/Barotrauma/Content:/Content:ro </span>\<span class="w">
</span><span class="s">--tag europan-materialist</span><span class="w">
</span><span class="na">ExecStart</span><span class="o">=</span><span class="s">/usr/bin/podman run </span>\<span class="w">
</span><span class="s">--rm </span>\<span class="w">
</span><span class="s">-v %h/cfkeys:/run/secrets/wrangler:ro </span>\<span class="w">
</span><span class="s">europan-materialist </span>\<span class="w">
</span><span class="s">ash -c 'npm x -- vite build && npm install wrangler && env $$(cat /run/secrets/wrangler) npm x -- wrangler pages deploy --project-name materialist-next ./dist'</span><span class="w">
</span><span class="c1"># ~/.config/systemd/user/europan-materialist-mirror.service</span><span class="w">
</span><span class="k">[Unit]</span><span class="w">
</span><span class="na">Requires</span><span class="o">=</span><span class="s">europan-materialist-fetch.service</span><span class="w">
</span><span class="na">After</span><span class="o">=</span><span class="s">europan-materialist-fetch.service</span><span class="w">
</span><span class="k">[Service]</span><span class="w">
</span><span class="na">Type</span><span class="o">=</span><span class="s">oneshot</span><span class="w">
</span><span class="na">ExecStart</span><span class="o">=</span><span class="s">/usr/bin/git -C /mnt/kaput/projects/europan-materialist.git push --mirror github</span><span class="w">
</span><span class="c1"># ~/.config/systemd/user/europan-materialist-fetch.service</span><span class="w">
</span><span class="k">[Service]</span><span class="w">
</span><span class="na">Type</span><span class="o">=</span><span class="s">oneshot</span><span class="w">
</span><span class="na">ExecStart</span><span class="o">=</span><span class="s">/usr/bin/git -C /mnt/kaput/projects/europan-materialist.git fetch --prune origin</span></code></pre>
<p>That’s it. It hasn’t fallen over yet. Last time it ran it finished in about
thirty seconds. Nearly a third of that is the emoji program. Here are even some logs
from it as a bonus.</p>
<pre class="code full-width literal-block"><code>Nov 20 13:04:36 banana podman[352347]: Uploading... (15/15)
Nov 20 13:04:36 banana podman[352347]: ✨ Success! Uploaded 3 files (12 already uploaded) (0.95 sec)
Nov 20 13:04:42 banana podman[352347]: ✨ Deployment complete! Take a peek over at https://3fccd3c7.materialist-next.pages.dev</code></pre>
<p>The website is served at <a class="reference external" href="https://materialist.pages.dev/">materialist.pages.dev</a>. The code is <a class="reference external" href="https://github.com/sqwishy/europan-materialist/">on GitHub</a> and <a class="reference external" href="https://git.sr.ht/~sqwishy/europan-materialist">SourceHut</a>.</p>
</section><p>Modernizing froghat.ca with reverse cloud at the edge using
<span class="docutils literal">socat <span class="pre">tcp-listen:8080</span> exec:systemctl start <span class="pre">important-things.service</span></span></p>
<h1>VPS migration & selinux resentment</h1>
<p>sqwishy, 2023-12-22T12:00:00-08:00, <a class="reference external" href="https://froghat.ca/2023/12/selinux">https://froghat.ca/2023/12/selinux</a></p>
<aside class="aside">
<p><a class="reference external" href="#selinux-rant">skip to selinux rant</a></p>
</aside>
<p>Earlier this year I went to renew my VPS at OVH, as well as a domain with Gandi.</p>
<p>The price to renew the domain with Gandi went up from $51.15 to $72.59 while it was in my cart. I asked them what that was about and they replied, paraphrasing, “ya we raised the price 🖕”. So instead I transferred my domain to porkbun.com for about $25.</p>
<hr class="docutils" />
<p>At OVH it was going to cost $194 to renew my VPS for one year. I’m not sure why, but they wouldn’t offer a discount for paying for an entire year upfront as they typically do. For $82, I switched to their cheapest AMD EPYC VPS. Same specs except less memory, which I didn’t need that much of anyway.</p>
<p>The service on that machine collects a lot of images and serves them on a website. Since I didn’t get around to making a good way of removing old files to free up space for new ones, migrating it to a new machine was an opportunity for housekeeping.</p>
<p>This was as straightforward as you might expect:</p>
<ul class="simple">
<li><p>I started the service on the new machine to begin collecting data.</p></li>
<li><p>I exported the data I wanted to retain from the old machine up until the timestamp of the first event on the new machine. (Timestamps were in messages from an external service that both connected to, rather than wall clock which could differ between machines.)</p></li>
<li><p>I imported that historical data into the SQLite database on the new machine and copied the corresponding images on disk with <code>tar</code> and <code>ssh</code>.</p></li>
</ul>
<p>After testing the new site in my browser, I updated the DNS records with porkbun to the new host’s address.</p>
<p>Even though the DNS record had a ten minute expiry, the old server still had connections when it was shut down about a day later. Those clients reconnected to the new host pretty quickly on their own.</p>
<aside class="aside">
<p>A stacked graph showing the number of connections to the <span class="hl-yellow">old</span> and <span class="hl-purple">new</span> servers over time.</p>
</aside>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="switchy-dm.png">
<img alt="switchy-lm.png" src="switchy-lm.png" />
</picture>
<p>It all went pretty smoothly and, in the end, I got that nice chart out of it.</p>
<hr class="docutils" />
<p id="selinux-rant">Fedora has a running prank of installing selinux when you install Fedora. I disabled selinux on my new VPS using <code>setenforce</code>.</p>
<p>About a month after setting up the machine, I tried rebooting it to troubleshoot a networking issue. After a few minutes, the webserver did not come up and attempts to <code>ssh</code> into the machine timed out. Yet, the OVH dashboard reported that the machine was running.</p>
<p>I ship application and system logs and metrics using <a class="reference external" href="https://vector.dev/">vector</a> which are then stored in <a class="reference external" href="https://grafana.com/oss/loki/">loki</a>. By some miracle of science, vector was shipping logs during this time that the machine was not accepting ssh connections.</p>
<pre class="code full-width reflow literal-block"><code>[audit] AVC avc: denied { read } for pid=818 comm="sshd" name="sshd_config" dev="sda5" ino=974214 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file permissive=0
[sshd] /etc/ssh/sshd_config: Permission denied</code></pre>
<p>Evidently, selinux was keeping my system secure by preventing sshd from reading its own configuration file in order to start up. At that moment I realized I had only temporarily disabled selinux with <code>setenforce</code> and forgotten to disable it across reboots.</p>
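<p>Picking that denial apart makes it a little easier to read. Below is a minimal sketch that splits the AVC line from the log above into fields; <code>parse_avc</code> and <code>context_type</code> are names made up for this example. As far as I understand, <code>unlabeled_t</code> in the target context is the type given to a file that carries no valid label.</p>

```python
import re

# The denial from the log above, split across lines for readability.
AVC_LINE = (
    'avc: denied { read } for pid=818 comm="sshd" name="sshd_config" '
    'dev="sda5" ino=974214 '
    "scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 "
    "tcontext=system_u:object_r:unlabeled_t:s0 tclass=file permissive=0"
)


def parse_avc(line: str) -> dict:
    """Split an AVC denial into its permissions and key=value fields."""
    perms = re.search(r"\{([^}]*)\}", line)
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    fields = {key: value.strip('"') for key, value in pairs}
    fields["permissions"] = perms.group(1).split() if perms else []
    return fields


def context_type(context: str) -> str:
    """Return the type, the third field of user:role:type:level."""
    return context.split(":")[2]


denial = parse_avc(AVC_LINE)
print(denial["comm"], denial["permissions"], denial["name"])
print(context_type(denial["scontext"]), "->", context_type(denial["tcontext"]))
```

<p>Read that way, the line says a process in the <code>sshd_t</code> domain was refused <code>read</code> on a file labelled <code>unlabeled_t</code>.</p>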
<aside class="aside">
<p>I have to hand it to ‘em here, requiring users to disable selinux twice in two different ways, once for the current boot and another for subsequent boots, is really very clever. Good one, Fedora, you got me!</p>
</aside>
<p>I used a recovery boot thingy that OVH provides to boot my VPS. OVH emailed me SSH login credentials. I logged in and disabled selinux persistently. After that, the machine booted normally.</p>
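<p>The two places the mode lives are easy to mix up: <code>setenforce</code> changes the running kernel, while the mode used at boot comes from a config file. A minimal sketch for reading the boot-time setting, assuming it lives in <code>/etc/selinux/config</code> as a <code>SELINUX=</code> line like on a stock Fedora install. <code>configured_mode</code> is a made-up helper, and the sample text stands in for the real file.</p>

```python
def configured_mode(config_text: str):
    """Return the SELINUX= value from config text, or None if it is absent."""
    mode = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and anything that is not key=value
        key, _, value = line.partition("=")
        if key.strip() == "SELINUX":
            mode = value.strip().lower()
    return mode


# Sample text standing in for /etc/selinux/config.
sample = """\
# This file controls the state of SELinux on the system.
SELINUX=enforcing
SELINUXTYPE=targeted
"""

print(configured_mode(sample))
```

<p>On a real machine the text would come from <code>open("/etc/selinux/config").read()</code>. If it still says <code>enforcing</code>, a reboot undoes whatever <code>setenforce 0</code> did.</p>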
<hr class="docutils" />
<p>What really got to me is that I put a fair bit of my own time into reducing downtime in my service. I went so far as to use <a class="reference external" href="https://www.freedesktop.org/software/systemd/man/252/sd_pid_notify_with_fds.html">sd_pid_notify_with_fds</a> and <a class="reference external" href="https://www.freedesktop.org/software/systemd/man/252/sd_listen_fds.html">sd_listen_fds</a> to avoid missing messages broadcast by an external event source between service restarts. A number of times, the service upgraded without closing the socket it uses to read incoming events. Events would be buffered and read when the service came back up, instead of being missed entirely while the connection was down.</p>
<p>It worked. And even though this was a side-project of no real significance to anyone, I felt happy that I had no unplanned downtime in just over two years of running it.</p>
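<p>The receiving side of that descriptor hand-off is small. Here is a sketch of what <code>sd_listen_fds</code> documents, written in Python for illustration: systemd sets <code>LISTEN_PID</code> and <code>LISTEN_FDS</code> in the environment and the passed descriptors start at 3. The helper name and the simulated environment are made up, and this is not the code my service runs. As I understand it, descriptors a service stores with <code>sd_pid_notify_with_fds</code> come back through the same variables after a restart.</p>

```python
import os

SD_LISTEN_FDS_START = 3  # first passed descriptor, per sd_listen_fds(3)


def listen_fds(environ=None, pid=None):
    """Return the descriptor numbers systemd passed to this process.

    Mirrors the documented checks: LISTEN_PID must name this process
    and LISTEN_FDS says how many descriptors follow, starting at fd 3.
    """
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        if int(environ["LISTEN_PID"]) != pid:
            return []
        count = int(environ["LISTEN_FDS"])
    except (KeyError, ValueError):
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + max(count, 0)))


# Simulated environment, as a service handed two sockets would see it.
print(listen_fds({"LISTEN_PID": "4242", "LISTEN_FDS": "2"}, pid=4242))  # prints [3, 4]
```

<p>Each number can then be wrapped with <code>socket.socket(fileno=fd)</code> to keep reading from a socket that was never closed.</p>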
<p>By the time I finished fumbling my way through the repair and the system came back online, it had been down for almost an hour. I understand that I forgot to disable selinux, and that selinux would prevent my service from starting because I didn’t write policies for it. That’s my fault, so that’s fair I guess. And nginx wouldn’t start either, as far as I know, because Fedora doesn’t ship a policy that lets it bind to port 80, whatever. But I don’t know why sshd would not be allowed to read its own configuration file at <code>/etc/ssh/sshd_config</code>. This makes no sense.</p>
<p>My first guess was, if you run with selinux disabled temporarily and you upgrade your system, then packages and files or services (like sshd) won’t be installed with selinux contexts. So when you reboot, and selinux is enabled again, it misbehaves because of missing contexts.</p>
<p>On the other hand, <code>setenforce 0</code> doesn’t actually <em>disable</em> selinux, according to the man page, it just sets it to permissive mode. And permissive mode is practically the same as selinux being enabled (in enforcing mode) in the sense that it runs and evaluates denials based on policies. The key difference being it doesn’t <em>act</em> on those denials and just reports them. As far as I know, you are meant to be able to go from enforcing mode to permissive mode and back without issue – for the purpose of troubleshooting or something. So if being in permissive mode messes up selinux contexts, that wouldn’t make sense.</p>
<p>Alternatively, it’s possible I copied <code>/etc/ssh/sshd_config</code> over from the old machine with <code>tar</code> and, by default, it doesn’t copy extended attributes or selinux context stuff. If I had done this, I would expect to see it in my shell history next to where I did this for a bunch of other files. But I don’t see that. And the file creation timestamp doesn’t suggest it was copied from the old machine either.</p>
<p>So I have no idea how this happened. I tried to ask about it on #fedora on irc.libera.chat. To be upfront, I was pretty snarky after wasting an hour of my Sunday afternoon on this piece of shit software. I was advised to enable sshd.service, to not ssh in as root, and that I screwed it up because it “just works”.</p>
<aside class="aside">
<p>Long ago, I had spent some time trying to use selinux properly. It might be better now, but most of the documentation was high level overviews. It was time consuming to sort through that junk to find actual solutions to problems, like what you needed to run to configure selinux or even what the relevant packages were in Fedora.</p>
<p>Some of the resources are just memes, like <a class="reference external" href="https://stopdisablingselinux.com/">stopdisablingselinux.com</a> and a <a class="reference external" href="https://people.redhat.com/duffy/selinux/selinux-coloring-book_A4-Stapled.pdf">colouring book</a> with illustrations that look like something out of a fever dream.</p>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="wtf-dm.png">
<img alt="wtf-lm.png" src="wtf-lm.png" />
</picture>
<p>This isn’t some random thing, it’s the first item under <em>Additional resources</em> on the <a class="reference external" href="https://docs.fedoraproject.org/en-US/quick-docs/selinux-getting-started/#_additional_resources">Getting started with SELinux</a> page at docs.fedoraproject.org.</p>
<p>To be absolutely clear, <em>I am not trying to bully the illustrator here</em>. But whenever I’ve gone to look up documentation on selinux, it’s because selinux has interrupted something else I’d rather be doing. A colouring book could be cute in some contexts. selinux holding my system hostage is not one of them.</p>
</aside>
<p>I clarified I wasn’t asking whether or not I “screwed it up” but <em>how</em> – so I can not do that in the future. I was pretty sarcastic and, predictably, the conversation in this internet chat room was pointless as nobody learned anything and everyone was rude to everyone else.</p>
<p>In the end, I am surprised that sshd was not allowed to read its own configuration file. And this doesn’t inspire confidence that selinux won’t produce further surprises. How is an selinux enjoyer meant to minimize the surprises that might prevent their machine from booting?</p>
<p>Presumably you can set the system to permissive, restart a service, and check the audit logs for denials that would have prevented it from starting in enforcing mode. Then do that for NetworkManager, systemd-networkd, systemd-resolved, or whatever other services you can think of that are important that selinux could possibly interfere with?</p>
<p>But if your system is undergoing change as part of a reboot, as with <code>dnf system-upgrade</code>, will changes in that upgrade invite the wrath of the machine spirit? I have no idea. Do you? <em>(If so, please elaborate in the comments section below – the engagement helps promote my content on this advertising platform!)</em></p>
<p>If Fedora shipped selinux permissive by default and allowed me to opt-in specific services to run under enforcing mode, that would be interesting to me. At least that way, I could voluntarily waste my time with it on my own terms instead of on a Sunday afternoon when I’d rather be trying to unfuck whatever ufw did to my nftables.</p>
<p>As far as I can tell, the security selinux provides is job security for system administrators using it to create fragile complex systems that require specialized knowledge to maintain.</p>
<figure class="full-bleed figure">
<img alt="sleepyfox.jpg" src="sleepyfox.jpg" />
<figcaption>slepey fox – <a class="reference external" href="https://www.flickr.com/photos/wcdumonts/40602952543">photographed by Mark Dumont</a> – did you know this fox has never heard of selinux? look how soundly he snoozes <span class="nowrap">🦊💤</span></figcaption>
</figure>
<p>In conclusion, selinux is a great addition to any Linux based operating system. My VPS was never safer or more secure than on the day that none of its services were running.</p>
<p>Rant about the one time I forgot to disable selinux and it prevented sshd from starting.</p>
<h1>Caching with LVM snapshots for ephemeral self hosted GitHub runners</h1>
<p>sqwishy, 2023-03-16T12:00:00-07:00, <a class="reference external" href="https://froghat.ca/2023/03/github-runner-cache">https://froghat.ca/2023/03/github-runner-cache</a></p>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="container1dm.svg">
<img alt="container1lm.svg" src="container1lm.svg" />
</picture>
<p>GitHub Actions (Uber for git hooks) is a modern CI/CD offering by GitHub (Uber
for broken links and 404 pages). YAML documents in git repositories
on GitHub serve as scripts for <em>runners</em>, allowing you to dispense chores for
GitHub’s VM when some event, like a commit, happens in your GitHub
project. GitHub has their own runners in the cloud that you can use.</p>
<p>GitHub hosted runners do not persist state between jobs. This is cool because you
can expect jobs to start from a consistent and predictable state. But it also
can make jobs take longer if they can’t re-use work from previous
runs, like incremental rebuilds or downloading and compiling dependencies that
change infrequently.</p>
<p>GitHub Actions features something they call a <a class="reference external" href="https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#using-the-cache-action">cache action</a> which is a way to
copy state between jobs by roughly doing <code>tar c job-cache | curl --form
"file=@-" ...</code>. But the cache action does not work well for building container
images with Podman.</p>
<p>I wanted to try sharing state between jobs by binding a volume from the host to
the runner. So I wrote a short Python script to do that.</p>
<p>It uses <abbr title="Logical Volume Manager">LVM</abbr> to make snapshots for different
jobs and it’s the <em>fun</em> part of this post. But to write that, I had to
go through the anguish of running the GitHub runner myself, ensuring it does
not persist state between jobs.</p>
<p>This is one of my longer and more meandering blog posts. So here’s an improvised
table of contents – though you should read the whole thing because there will
be a quiz at the end.</p>
<ul class="simple">
<li><p>This post explains a bit about GitHub’s software for <a class="reference internal" href="#self-hosted-runners">self hosted runners</a></p></li>
<li><p>pre and post job <a class="reference internal" href="#hooks">hooks</a></p></li>
<li><p>using runners with <a class="reference internal" href="#ephemeral">ephemeral</a> to only do one job</p></li>
<li><p>that the <em>cache action</em> on GitHub is slow for <a class="reference internal" href="#caching-podman">caching Podman</a> storage</p></li>
<li><p>how the <a class="reference internal" href="#lvm-cache-friend">lvm-cache-friend</a> script communicates with a workflow and uses LVM to label cache volumes like the “key” in the <em>cache action</em></p></li>
<li><p>how I’m <a class="reference internal" href="#self-hosting-github-s-runner-in-systemd-nspawn">self hosting GitHub’s runner in systemd-nspawn</a> to ensure a consistent state at the start of each workflow job</p></li>
<li><p>how to prepare the first/default cache volume in <a class="reference internal" href="#lvm">LVM</a></p></li>
<li><p>and the <a class="reference internal" href="#lvm-cache-friend-service">lvm-cache-friend.service</a> file for running the script as a service in systemd.</p></li>
</ul>
<aside class="aside">
<p><em>Estimated Reading Time:</em> <span class="smolcaps">six</span> years <span class="smolcaps">four</span> months <span class="smolcaps">nine</span> days <span class="smolcaps">thirteen</span> hours <span class="smolcaps">two</span> minutes <span class="smolcaps">eighty one</span> seconds</p>
</aside>
<section id="self-hosted-runners">
<h2>self hosted runners<a class="self-link" title="link to this section" href="#self-hosted-runners"></a></h2>
<p>As an alternative to <em>GitHub hosted</em> runners, you can run workflows on <em>self hosted</em>
runners. (Like some sort of private cloud.)
But, these self hosted runners don’t work at all like GitHub’s runners. To
be fair, <a class="reference external" href="https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners#differences-between-github-hosted-and-self-hosted-runners">they do tell you this</a>. But I don’t think it prepares
you for the cursed architecture that awaits.</p>
<p>As far as I can tell, the GitHub runner doesn’t separate application,
configuration, and persistent or runtime state.
By default, it sets itself up to automatically update itself. And it runs
workflows as the service’s user without dropping any privileges. So programs run
from the YAML workflows have write access to the runner that runs them. And that
seems like pretty not great hygiene to me. Even ignoring security concerns, this
seems like an unnecessary risk if the workflow does something stupid by mistake.</p>
<aside class="aside">
<p>I can trust myself to not write workflows to do bad things on purpose.
I can’t trust my workflows to not do bad things if I accidentally made them
dumbly.</p>
</aside>
<p>It would make much more sense if workflow execution were done by running a separate
program that sets up its environment depending on user requirements. Like how
<code>sudo</code>, <code>env</code>, <code>strace</code>, or <code>unshare</code> take another command to run as an
argument. Those programs allow users to run other programs under different
environments by setting those environments up to the user’s liking and then
starting the other program in that new environment.
If workflows were run by creating a new process, it should be easy to allow
users to configure the environment and privileges that a workflow runs in by
starting its process under <code>sudo</code> or <a class="reference external" href="https://github.com/containers/bubblewrap">bubblewrap</a> or even a Docker/Podman
container probably.</p>
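<p>To make that concrete, here’s a minimal sketch of the “wrapper takes another
command” pattern in Python. The function name <code>run_in_environment</code> and its
parameters are made up for illustration; this is not how the GitHub runner
works, just the shape I’d have expected.</p>

```python
import os
import subprocess


def run_in_environment(command, extra_env=None, cwd=None):
    """Run `command` as a new process inside a prepared environment.

    Hypothetical helper: the wrapper decides the environment (and could
    also drop privileges or enter a sandbox here), then starts the command.
    """
    env = dict(os.environ)          # start from the wrapper's environment
    env.update(extra_env or {})     # apply whatever the user asked for
    completed = subprocess.run(command, env=env, cwd=cwd)
    return completed.returncode     # pass the command's exit code along
```

<p>Swapping the <code>command</code> list for something like <code>["sudo", "-u", "nobody", ...]</code>
or a bubblewrap invocation is then a configuration choice rather than a change
to the runner.</p>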
<picture>
<source media="(prefers-color-scheme: dark)" srcset="container2dm.svg">
<img alt="container2lm.svg" src="container2lm.svg" />
</picture>
<p>I looked into a couple ways of cleaning this up.
Using <em>hooks</em> to run a script before and after a job.
And configuring the runner with <code>--ephemeral</code>.</p>
</section><section id="hooks">
<h2>hooks<a class="self-link" title="link to this section" href="#hooks"></a></h2>
<p>Runners can be configured to <a class="reference external" href="https://docs.github.com/en/actions/hosting-your-own-runners/running-scripts-before-or-after-a-job#triggering-the-scripts">run scripts before and after a job</a>. So
you could possibly change the environment before a job and then restore it
after. But this doesn’t work the same as programs doing <a class="reference external" href="https://en.wikipedia.org/wiki/Fork%E2%80%93exec">fork-exec</a>.</p>
<p>Suppose you wanted jobs to run in a network namespace. My understanding is, it’s
typical to <code>fork</code> a process, then the child <code>unshare</code>s or something to make a
new namespace, the parent sets up the namespace from the outside, finally
the child continues in the prepared namespace and <code>exec</code>s the job or <em>some program</em>.
This is how bubblewrap, and programs like it, allow you to run <em>some program</em>
of your choosing in a namespace to your liking.</p>
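<p>Roughly, that dance looks like the sketch below. It only shows the ordering
between parent and child using two pipes; <code>child_setup</code> and <code>parent_setup</code> are
hypothetical callables standing in for the unshare step and the
setup-from-outside step, so nothing here actually creates a namespace.</p>

```python
import os


def run_prepared(argv, child_setup, parent_setup):
    """Fork, let child then parent prepare things, then exec argv."""
    ready_r, ready_w = os.pipe()   # child -> parent: "my namespace exists"
    go_r, go_w = os.pipe()         # parent -> child: "setup done, carry on"

    pid = os.fork()
    if pid == 0:
        # Child process.
        try:
            os.close(ready_r)
            os.close(go_w)
            child_setup()              # e.g. where an unshare would happen
            os.write(ready_w, b"1")
            os.close(ready_w)
            os.read(go_r, 1)           # block until the parent has finished
            os.close(go_r)
            os.execvp(argv[0], argv)   # replaces this process on success
        finally:
            os._exit(127)              # only reached if setup or exec failed

    # Parent process.
    os.close(ready_w)
    os.close(go_r)
    os.read(ready_r, 1)                # wait until the child is ready
    os.close(ready_r)
    parent_setup(pid)                  # configure the child from the outside
    os.write(go_w, b"1")
    os.close(go_w)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)
```

<p>The point is that the program being run only starts <em>after</em> both halves of
the setup are done, which is exactly what a hook called by an already running
runner can’t arrange.</p>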
<p>In this case, maybe we want to modify the runner so that it runs in an
environment to our liking. But we can’t <code>exec</code> the runner; it’s already
running and it’s calling our hooks.</p>
<p>One idea, that I haven’t tried, is using <a class="reference external" href="https://www.man7.org/linux/man-pages/man2/setns.2.html">setns</a> to change the namespace of the
runner’s process. We can’t just call that in the hook, because it will
just change the namespace of the hook. We need the runner to make that call from
its process. To my knowledge, it’s possible to attach to a running process with
a debugger, like gdb, and use the debugger to make calls in the debugged
process. So maybe you can change another process’s namespace with something
like:</p>
<pre class="code full-width literal-block"><code>$ gdb -p 123
(gdb) call (int)pidfd_open(456, 0)
(gdb) call (int)setns($, 0)</code></pre>
<p>But this is a silly thing to do and even if it worked it would likely have a
bunch of wacky consequences. I just mention it because instead of saying it’s
not possible to get the isolation desired, I’d prefer to just say that I can’t
come up with any ways that aren’t absurd.</p>
</section><section id="ephemeral">
<h2><code>--ephemeral</code><a class="self-link" title="link to this section" href="#ephemeral"></a></h2>
<p>To install the GitHub runner software, GitHub provides you with a link to a
tarbomb you can download and extract all over your current directory to get
some scripts used to configure and start the program.</p>
<p>The <code>config.sh</code> script is intended to configure a new runner; requiring a runner
name, token, and repository to run workflows for.</p>
<p>There are a few more options like <code>--disableupdate</code>, which prevents the runner
from updating itself. And <code>--ephemeral</code> which, <a class="reference internal" href="#ephemeral">as GitHub’s documentation
explains</a>, configures the runner to just do one job.</p>
<p>These options let us limit the ambitions of the runner so that we can try
starting it in an already limited environment.</p>
<aside class="aside">
<p>Before noticing <code>--ephemeral</code>, I tried a hook to restart the
runner after a job. The restart must be scheduled so the hook can exit
without error (so the job is considered complete) but before a new job is
started. That timing is very sketchy for workflows with multiple jobs.
Otherwise, it might be a pretty cool hack.</p>
</aside>
</section><section id="caching-podman">
<h2>caching Podman<a class="self-link" title="link to this section" href="#caching-podman"></a></h2>
<p>Podman’s image storage does not cache well using GitHub’s <em>cache action</em>.</p>
<p>The first problem is that some files, although owned by the current user, have
no readable bits set. When the <em>cache action</em> runs <code>tar</code> it will fail to read
the file. These bits are fine for Podman, and you can even read them under
<code>podman unshare</code>, but the action doesn’t know this and can’t be configured to do
this.</p>
<p>I had a stupidly hard time trying to work around this. By far <a class="reference external" href="https://github.com/Atry/hhvm/commit/b05e533aab1bee7f8cad01e2a62e3c4bcd1e3f37">the best
approach I’ve seen</a> sets the suid bit on <code>tar</code> so it runs with elevated
permissions when the script for the <em>cache action</em> runs it.</p>
<p>The second problem is that caching this is slow. I’m not sure if this is
because of how Podman makes layers or if it’s because <code>node_modules</code> is lots of
tiny files. But creating or extracting with <code>tar</code> is too slow for it to be
worth it.</p>
<p>That’s not to say it was a problem with <code>tar</code>; I also tried <a class="reference external" href="https://gist.github.com/sqwishy/18bc8a0eaa672734b549d4553e1a6476">copying layers</a> using
<a class="reference external" href="https://github.com/containers/skopeo/">skopeo</a> and that also took about as long as building the images without a cache.</p>
<p>I’m starting to think that once a JavaScript project has tens of direct
dependencies, it’ll have thousands of implicit dependencies, and hundreds of
thousands of files. (This is at least the case with the two recent examples I’ve
had experience with.) And doing anything with all of them is not gonna be real
super speedy.</p>
</section><section id="lvm-cache-friend">
<h2>lvm-cache-friend<a class="self-link" title="link to this section" href="#lvm-cache-friend"></a></h2>
<p>This workflow is running in a system with a mount namespace and where changes are
not persisted between jobs because they’re being written to a tmpfs that gets
thrown out after the job finishes.<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>*<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">*</a><span class="fn-bracket">]</span></span>
<p>Details about how <em>I</em> set that up using an <code>--ephemeral</code> GitHub runner
are later on in this post.</p>
</aside>
</aside>
</aside>
<p>A straightforward way to persist state, when a system is running on a tmpfs, is
to bind mount something from the host that will outlive the tmpfs. Like volumes
for containers.</p>
<p>The <em>cache action</em> on GitHub has an extra little feature where caches are saved
under a <em>key</em> and can be looked up later with a kind of glob match. This way,
multiple caches can exist with different keys. Instead of having just one cache
shared between each run like with a simple bind mount to one place on the host.</p>
<p>This is an example from the GitHub documentation:</p>
<blockquote>
<pre class="code yaml literal-block"><code><span class="nt">restore-keys</span><span class="p">:</span><span class="w"> </span><span class="p-Indicator">|</span><span class="w">
</span><span class="no">npm-feature-d5ea0750</span><span class="w">
</span><span class="no">npm-feature-</span><span class="w">
</span><span class="no">npm-</span></code></pre>
<p>The restore key <code>npm-feature-</code> matches any key that starts with the string
<code>npm-feature-</code>. For example, both of the keys <code>npm-feature-fd3052de</code> and
<code>npm-feature-a9b253ff</code> match the restore key. The cache with the most recent
creation date would be used. The keys in this example are searched in the
following order:</p>
<ol class="arabic simple">
<li><p><code>npm-feature-d5ea0750</code> matches a specific hash.</p></li>
<li><p><code>npm-feature-</code> matches cache keys prefixed with <code>npm-feature-</code>.</p></li>
<li><p><code>npm-</code> matches any keys prefixed with <code>npm-</code>.</p></li>
</ol>
<footer class="attribution">—<a class="reference external" href="https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#example-using-multiple-restore-keys">Example using multiple restore keys</a></footer>
</blockquote>
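<p>As I read that description, the lookup boils down to something like this
sketch. It’s a simplified model of the quoted text only (it ignores branch
scoping), and <code>pick_cache</code> and the <code>(key, created)</code> pairs are names I made up
for illustration.</p>

```python
def pick_cache(restore_keys, caches):
    """Return the key of the cache to restore, or None.

    restore_keys: prefixes to try, in order.
    caches: list of (key, created) pairs; created is any sortable timestamp.
    """
    for restore_key in restore_keys:
        matches = [c for c in caches if c[0].startswith(restore_key)]
        if matches:
            # Among caches matching this prefix, the newest one wins.
            return max(matches, key=lambda c: c[1])[0]
    return None
```

<p>With caches keyed <code>npm-feature-fd3052de</code> (older) and <code>npm-feature-a9b253ff</code>
(newer), the restore keys from the example would skip the exact hash, match on
<code>npm-feature-</code>, and pick the newer of the two.</p>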
<p>For this off brand cache, I want a similar thing. This “cache” is just a bind
mount from the host. But we can mount different paths from the host and even
choose the right one using something like those “keys” requested in the workflow.</p>
<p>One detail of the <em>cache action</em>, GitHub explains, is that
caches are implicitly scoped to branches.</p>
<blockquote>
<p>Workflow runs can restore caches created in either the current branch or the default branch (usually <code>main</code>). If a workflow run is triggered for a pull request, it can also restore caches created in the base branch, including base branches of forked repositories.</p>
<footer class="attribution">—<a class="reference external" href="https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache">Restrictions for accessing a cache</a></footer>
</blockquote>
<p>I didn’t implement that because it seems complicated. Like the key used in the
workflow is not a <em>single source of truth</em> about what cache is used.
It sounds useful to look for caches in relevant branches, but I can’t think of a
good reason to not make that explicit in the keys.</p>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="container16dm.svg">
<img alt="container16lm.svg" src="container16lm.svg" />
</picture>
<p>So we want the runner to be able to annotate caches – with its git branch name
or a hash of a <code>.lock</code> file or something – and look for caches with similar
annotations so it can use them if they’re there.<a class="footnote-reference superscript" href="#footnote-2" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>†<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-2" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-2">†</a><span class="fn-bracket">]</span></span>
<p>Originally, I intended having a separate load/mount and save/unmount
step where the annotations were written only when saving. But the
implementation is a bit simpler if they both happen at the same time.
Like when the <em>cache action</em> is used and <code>key</code> and <code>restore-keys</code> are
provided at the same time.</p>
</aside>
</aside>
</aside>
<p>And it’d be nice if the usage in the workflow was similar to the <em>cache
action</em>. So it’s simple and familiar. The information it mostly needs is:</p>
<ul class="simple">
<li><p>some criteria to look up what cache to use</p></li>
<li><p>the path to load the cache in the runner</p></li>
<li><p>and the annotations for the new cache being saved.</p></li>
</ul>
<p>The service uses thin LVM snapshots because I wanted to try them and see how
they worked and because it seems like a good way to get a kind of de-duplication at
the block layer or something.</p>
<p>When the workflow asks for a cache, we look for a logical volume that best
matches the annotations that the workflow requested, we snapshot it to make a new
logical volume, and mount that volume into the runner.</p>
<p>In LVM, logical volumes can have tags. We can shove our annotations into those
tags to use LVM as a kind of database instead of keeping our own state.<a class="footnote-reference superscript" href="#footnote-3" id="footnote-reference-3" role="doc-noteref"><span class="fn-bracket">[</span>‡<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-3" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-3">‡</a><span class="fn-bracket">]</span></span>
<p>Tags have some character restrictions, but fewer restrictions than volume
names. So it seems like an <em>okay</em> place to store user input.</p>
</aside>
</aside>
</aside>
<p>Here’s an example where annotations from the workflow are namespaced with
a “friend:cache:” prefix.</p>
<pre class="code shell full-width literal-block"><code>$<span class="w"> </span>lvs<span class="w"> </span><span class="se">\
</span><span class="w"> </span>--sort<span class="w"> </span>-lv_time<span class="w"> </span><span class="se">\
</span><span class="w"> </span>-o<span class="w"> </span>vg_name,lv_name,tags,origin<span class="w"> </span><span class="se">\
</span><span class="w"> </span>--select<span class="w"> </span><span class="nv">lv_tags</span><span class="o">=</span>friend:snapshot<span class="w"> </span><span class="se">\
</span><span class="w"> </span>--reportformat<span class="o">=</span>json<span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">'.report[].lv | .[].lv_tags |= split(",")'</span><span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"vg_name"</span>:<span class="w"> </span><span class="s2">"banana"</span>,<span class="w">
</span><span class="s2">"lv_name"</span>:<span class="w"> </span><span class="s2">"friend-Y_v--tIL"</span>,<span class="w">
</span><span class="s2">"lv_tags"</span>:<span class="w"> </span><span class="o">[</span><span class="w">
</span><span class="s2">"friend:cache:dc5e8c5be193d4952bceae567213b9007f5a9fa5bfd85c1dd9a8a4a6cb759422"</span>,<span class="w">
</span><span class="s2">"friend:cache:linux"</span>,<span class="w">
</span><span class="s2">"friend:snapshot"</span><span class="w">
</span><span class="o">]</span>,<span class="w">
</span><span class="s2">"origin"</span>:<span class="w"> </span><span class="s2">"friend-Y_v8SbU5"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"vg_name"</span>:<span class="w"> </span><span class="s2">"banana"</span>,<span class="w">
</span><span class="s2">"lv_name"</span>:<span class="w"> </span><span class="s2">"friend-Y_v8SbU5"</span>,<span class="w">
</span><span class="s2">"lv_tags"</span>:<span class="w"> </span><span class="o">[</span><span class="w">
</span><span class="s2">"friend:cache:dc5e8c5be193d4952bceae567213b9007f5a9fa5bfd85c1dd9a8a4a6cb759422"</span>,<span class="w">
</span><span class="s2">"friend:cache:linux"</span>,<span class="w">
</span><span class="s2">"friend:snapshot"</span><span class="w">
</span><span class="o">]</span>,<span class="w">
</span><span class="s2">"origin"</span>:<span class="w"> </span><span class="s2">"friend-Y_v7HD25"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"vg_name"</span>:<span class="w"> </span><span class="s2">"banana"</span>,<span class="w">
</span><span class="s2">"lv_name"</span>:<span class="w"> </span><span class="s2">"friend-Y_v7HD25"</span>,<span class="w">
</span><span class="s2">"lv_tags"</span>:<span class="w"> </span><span class="o">[</span><span class="w">
</span><span class="s2">"friend:cache:dc5e8c5be193d4952bceae567213b9007f5a9fa5bfd85c1dd9a8a4a6cb759422"</span>,<span class="w">
</span><span class="s2">"friend:cache:linux"</span>,<span class="w">
</span><span class="s2">"friend:snapshot"</span><span class="w">
</span><span class="o">]</span>,<span class="w">
</span><span class="s2">"origin"</span>:<span class="w"> </span><span class="s2">"friend-default"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span></code></pre>
<aside class="aside">
<p><a class="reference external" href="https://github.com/lvmteam/lvm2/blob/v2_03_17/WHATS_NEW">A new version of LVM</a> was released last year with a new output format
<em>json_std</em> that improves over <em>json</em> in a few ways, including outputting a
list of strings for <code>lv_tags</code> instead of a big string with comma-delimited
tags. (Above, <code>jq</code> is used to split the value for readability.)</p>
<p>But I don’t have that version yet.</p>
</aside>
<p>Though, that’s kind of a crummy example because the tags for each volume are the
same.</p>
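<p>For reference, the same tag splitting that <code>jq</code> does in the example above can
be sketched in Python. This assumes the older <em>json</em> report layout implied by
that <code>jq</code> filter (a <code>report</code> list containing <code>lv</code> lists, with <code>lv_tags</code> as one
comma-delimited string); <code>volumes_from_report</code> is a hypothetical name.</p>

```python
import json


def volumes_from_report(report_text):
    """Flatten `lvs --reportformat=json` output and split lv_tags into lists."""
    report = json.loads(report_text)
    volumes = []
    for section in report["report"]:
        for lv in section["lv"]:
            lv = dict(lv)                      # don't mutate the parsed input
            tags = lv.get("lv_tags", "")
            # An empty string means no tags, not one empty tag.
            lv["lv_tags"] = tags.split(",") if tags else []
            volumes.append(lv)
    return volumes
```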
<p>Anyway, because the annotations are stored in a list, it seemed easier to
support looking up caches based on exact matches of multiple terms in any order
rather than a prefix match that GitHub’s action uses with its “key” thing.</p>
<p>Earlier in this post, there’s an example from the GitHub documentation using the
restore key <code>npm-feature-</code>. Instead of looking for snapshots prefixed with
<code>npm-feature-</code>, we would instead write <code>npm feature</code> to mean two tags (“npm” and
“feature”) that are both required and can occur in any order. I think it’s a bit
simpler and maps nicer to the LVM tags; but it’s a difference from the syntax
that the <em>cache action</em> uses.</p>
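<p>In other words, matching is a subset test rather than a prefix test. A minimal
sketch, assuming the “friend:cache:” namespace from the <code>lvs</code> example;
<code>cache_terms</code> and <code>criterion_matches</code> are illustrative names, not necessarily
what the script calls them.</p>

```python
PREFIX = "friend:cache:"  # namespace seen in the lvs example output


def cache_terms(lv_tags):
    """Strip the namespace and keep only the workflow's annotations."""
    return {t[len(PREFIX):] for t in lv_tags if t.startswith(PREFIX)}


def criterion_matches(criterion, lv_tags):
    """True when every whitespace-separated term is among the volume's tags."""
    return set(criterion.split()) <= cache_terms(lv_tags)
```

<p>So <code>npm feature</code> and <code>feature npm</code> match the same volumes, and a volume with
extra tags still matches as long as both terms are present.</p>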
<p>Here’s an example of asking for a new snapshot.</p>
<pre class="code shell full-width literal-block"><code>ncat<span class="w"> </span>-U<span class="w"> </span>/run/lvm-cache-friend/socket<span class="w"> </span><span class="s"><<EOF
mount /home/ghrunner/cache
> linux dc5e8c
> linux
< linux dc5e8c
EOF</span></code></pre>
<p>The two lines starting with <code>&gt;</code> say to use (and snapshot) an existing volume that has <em>both</em> tags
“linux” and “dc5e8c” or just “linux”. Those criteria are evaluated in order
and, once one matches an existing volume, all subsequent criteria are ignored.</p>
<p>The last line, starting with <code>&lt;</code>, just specifies what annotations to give to the new snapshot.
Annotations are individual terms delimited with whitespace and order doesn’t
matter. Again, this is a bit different from the GitHub action which does the sort of
prefix match thing.</p>
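<p>Here’s a sketch of how a request like that could be parsed and resolved. It’s
written from the description above rather than copied from the script, so
<code>parse_request</code> and <code>choose_origin</code> are hypothetical names, and two things are
assumptions on my part: that volumes are considered newest first, and that
“nothing matched” is reported as <code>None</code>.</p>

```python
def parse_request(text):
    """Split a request into (mount_path, criteria, annotations)."""
    mount_path = None
    criteria = []      # one list of terms per '>' line, in order
    annotations = []   # terms from the '<' line
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "mount" and len(words) == 2:
            mount_path = words[1]
        elif words[0] == ">":
            criteria.append(words[1:])
        elif words[0] == "<":
            annotations.extend(words[1:])
        else:
            raise ValueError(f"unrecognised line: {line!r}")
    return mount_path, criteria, annotations


def choose_origin(criteria, volumes):
    """Pick the volume to snapshot.

    volumes: (name, set_of_terms) pairs, assumed sorted newest first.
    Criteria are tried in order; the first one that matches any volume wins.
    """
    for terms in criteria:
        for name, volume_terms in volumes:
            if set(terms) <= volume_terms:
                return name
    return None
```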
<p>Here’s a usage example in a GitHub workflow as a step in a job. This includes
the current branch name to replicate <em>some</em> of the implicit branch name stuff that the <em>cache action</em> uses.</p>
<pre class="code shell full-width literal-block"><code>-<span class="w"> </span>run:<span class="w"> </span><span class="p">|</span><span class="w">
</span>ncat<span class="w"> </span>-U<span class="w"> </span>/run/lvm-cache-friend/socket<span class="w"> </span><span class="s"><<EOF
mount /home/ghrunner/cache
> linux ${{ hashFiles('Cargo.lock','Cargo.toml') }}
> linux ${{ github.ref_name }}
> linux
< linux ${{ github.ref_name }} ${{ hashFiles('Cargo.lock','Cargo.toml') }}
EOF</span></code></pre>
<p>ncat turns out to be a nice program for this because, by default, it will wait
for the server to disconnect before exiting. The server will create a snapshot
and mount it before terminating the connection. This way, the job does not
continue until the mount has been set up (or at least attempted). Although, the
success code is not sent so the job will try to continue even if the mount was
not successful for some reason.</p>
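<p>If ncat isn’t available in the runner, the same “send the request, then wait
for the server to hang up” behaviour can be sketched with Python’s <code>socket</code>
module. The <code>request</code> function is a made-up name, and half-closing the
connection after sending is my assumption about what the server is happy with.</p>

```python
import socket


def request(socket_path, message):
    """Send message over a unix socket and block until the server disconnects."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(message.encode())
        sock.shutdown(socket.SHUT_WR)   # assumption: signals end of request
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:               # empty read: the server closed
                break
            chunks.append(chunk)
    return b"".join(chunks).decode()
```

<p>Since the function only returns once the server closes the connection, a step
that calls it blocks until the mount has been attempted, same as with ncat. It
returns whatever the server wrote back, if anything.</p>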
<picture>
<source media="(prefers-color-scheme: dark)" srcset="container63dm.svg">
<img alt="container63lm.svg" src="container63lm.svg" />
</picture>
<p>Mounting the snapshot into the runner’s mount namespace turned out to be kind of
a thing. Mostly, I just copied <a class="reference external" href="mount_in_namespace">how systemd does it</a> for its
<code>machinectl bind</code> command.</p>
<p>There’s a silly detail where, in the runner, we want to move the mount to the
location of the workflow’s choosing, but we can only move the mount if it’s under
a private mount, not a shared mount. If we try, we might see an error saying
something like: “moving a mount residing under a shared mount is unsupported”.
But the host can’t mount volumes into a private mount in the runner; I guess because
it’s private. So the host needs to mount the volume inside a shared mount and
then the runner makes the shared mount private and moves the volume.</p>
<p>Also, the host does this each time the workflow gets a cache. But I don’t think
it can reshare/unprivate the private mount, so it makes a new shared mount each
time. And that new shared mount has to be in an existing pre-arranged bind mount
in the runner. The directory containing the unix socket for making requests
is already bind mounted into the runner, so it can just use that and make
those temporary shared mounts in there next to the socket.</p>
<p>It’s probably worth noting that we only need to do this because workflows can
request caches to be mounted at arbitrary locations. And this is kind of a
sketchy idea anyway because it lets the runner mount something to <em>anywhere</em> in
its machine even if its user doesn’t otherwise have permission. If the caches were mounted to just one pre-arranged location under
a shared mount, we wouldn’t need to move them and it’d be a bit simpler and
safer. But it’s a good thing I didn’t do that, otherwise you would have missed
out on reading these three paragraphs of an obscure mount behaviour that will
never matter or be useful to you in the future.</p>
</section><section id="self-hosting-github-s-runner-in-systemd-nspawn">
<h2>self hosting GitHub’s runner in systemd-nspawn<a class="self-link" title="link to this section" href="#self-hosting-github-s-runner-in-systemd-nspawn"></a></h2>
<p>I ended up using systemd-nspawn to start a <em>machine</em> that runs the GitHub
runner as a service. systemd-nspawn lets us
have network and user namespaces so that GitHub’s software is relatively
isolated. And it can be configured so workflows can build and run container
images with Podman.</p>
<aside class="aside">
<p>Generally what systemd-nspawn runs are machines rather than containers
because it boots them and they have an init system and can run several
services under them.</p>
</aside>
<p>I tried to just run the runner under Podman (instead of systemd-nspawn)
but had a lot of issues with user namespaces and getting Podman to run as part
of workflows under Podman running the GitHub runner. There is <a class="reference external" href="https://quay.io/repository/containers/podman">a container
image for Podman</a> that demonstrates how to run Podman under Podman but it
seemed overall way more annoying to troubleshoot than systemd-nspawn.</p>
<p>This is kind of self-inflicted because I wanted to run the GitHub runner in a
user namespace where root on the runner is an unprivileged user on the host.
This would have been simpler if I hadn’t wanted that but I was stubborn over
the thought that having nested user namespaces really shouldn’t be that hard.</p>
<p>The systemd-nspawn machine running the GitHub runner uses some options to
prevent persisting state between reboots. By shutting down and restarting the
machine after the runner does one job, we can ensure that jobs are run with
clean and predictable state.</p>
<p>The service file for the GitHub runner service in the machine is part of how all
this works.</p>
<pre class="code ini full-width literal-block"><code><span class="k">[Unit]</span><span class="w">
</span><span class="na">...</span><span class="w">
</span><span class="na">OnSuccess</span><span class="o">=</span><span class="s">poweroff.target</span><span class="w">
</span><span class="k">[Service]</span><span class="w">
</span><span class="na">...</span><span class="w">
</span><span class="na">ExecStartPre</span><span class="o">=</span><span class="s">/home/ghrunner/get-token-and-config.sh </span>\<span class="w">
</span><span class="s">--unattended </span>\<span class="w">
</span><span class="s">--disableupdate </span>\<span class="w">
</span><span class="s">--ephemeral </span>\<span class="w">
</span><span class="s">--name dingus </span>\<span class="w">
</span><span class="s">--replace</span><span class="w">
</span><span class="na">ExecStart</span><span class="o">=</span><span class="s">/home/ghrunner/bin/runsvc.sh</span><span class="w">
</span><span class="na">...</span></code></pre>
<p>Before running a GitHub runner,
you’ll need to run a <code>config.sh</code> script to configure it
with a runner token and the repository to get workflows from.
We’re doing that in <code>ExecStartPre</code> via a
helper script called <code>get-token-and-config.sh</code> which uses <a class="reference external" href="https://docs.github.com/en/rest/actions/self-hosted-runners#create-a-registration-token-for-a-repository">GitHub’s API to
get a runner token</a> and then calls <code>config.sh</code> passing along the other arguments
given, like <code>--ephemeral</code> in this case, ensuring the service only runs one job
before stopping.</p>
<p>The bearer authorization for GitHub’s API (and the repository URL) is
hard-coded; written directly into that <a class="reference external" href="https://github.com/sqwishy/graph-do-smell/blob/shitpost/misc/get-token-and-config.sh">helper script</a> and readable by workflows.
It’s not great. But this <em>proof of concept</em> already ended up being way more
effort than I intended, so I’m perfectly content with it as it is and saying
that this part that needs improvement is left as <em>an exercise for the reader</em>.</p>
<p><code>ExecStart</code> runs the runner normally using a script from GitHub’s tarbomb.
Since we’re using <code>--ephemeral</code>, <code>runsvc.sh</code> quits after one job is run.
Then, <code>OnSuccess</code> ensures that when the service succeeds – when <code>runsvc.sh</code>
quits after a job is done – the machine shuts down and any temporary state is
released.</p>
<p>On the host, we can configure this machine to restart when it shuts down by
adding an override for its <code>systemd-nspawn@.service</code> configuration described
below. The reason the machine uses the poweroff target instead of the reboot target
for the <code>OnSuccess</code> is that <code>reboot.target</code> seemed to prevent the system from
stopping at all. Every time it would go to stop the GitHub runner service, in
order to shut down the system, the <code>OnSuccess</code> would be evaluated and it would start to
reboot instead. Kinda funny when you think about it. Anyway, doing the restart
in the service on the host is probably <em>cleaner</em> in terms of a
<em>separation of concerns</em> or whatever.</p>
<p>I used a Dockerfile and <code>podman build</code> to prepare the rootfs for the
systemd-nspawn machine. You can <a class="reference external" href="https://github.com/sqwishy/graph-do-smell/blob/shitpost/misc/Dockerfile">view the Dockerfile on GitHub</a>.</p>
<ol class="arabic">
<li><p>Build the image.</p>
<pre class="code shell full-width literal-block"><code>podman<span class="w"> </span>build<span class="w"> </span>.<span class="w"> </span>--tag<span class="w"> </span>github-runner</code></pre>
</li>
<li><p>Create a container from the image and pipe an export of the rootfs to
machinectl.<a class="footnote-reference superscript" href="#footnote-4" id="footnote-reference-4" role="doc-noteref"><span class="fn-bracket">[</span>§<span class="fn-bracket">]</span></a></p>
<aside class="footnote-list superscript">
<aside class="footnote superscript aside" id="footnote-4" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-4">§</a><span class="fn-bracket">]</span></span>
<p>In these examples, the machine name is github-runner. (If I told you
what I really named my machine, they would put both of us away for a long time.)</p>
</aside>
</aside>
<pre class="code shell full-width literal-block"><code>podman<span class="w"> </span><span class="nb">export</span><span class="w"> </span><span class="k">$(</span>podman<span class="w"> </span>create<span class="w"> </span>github-runner<span class="k">)</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>machinectl<span class="w"> </span>import-tar<span class="w"> </span>-<span class="w"> </span>github-runner</code></pre>
<p><code>machinectl</code> should extract the rootfs to <code>/var/lib/machines/github-runner</code>.</p>
<p>You <em>ought</em> to be able to create a tar from a Dockerfile without <code>podman
create</code> by just doing <code>podman build --output type=tar</code>, but it <em>seems</em> like
<a class="reference external" href="https://github.com/containers/buildah/issues/4463">there’s a bug where suid bits aren’t set</a> and this messes up some
things like <code>newuidmap</code>.</p>
</li>
<li><p>The systemd-nspawn machine runs in a user namespace where root in the
machine maps to an unprivileged user on the host.</p>
<p>For this machine, I’ve used <code>--private-users-ownership=chown</code> to modify
ownership of the rootfs to use a new range of unprivileged users. This preserves
ownership relatively within the machine. Whatever uid N is mapped on the host
to root in the machine, the first user in the machine (with uid 1000) will
have its files owned by N + 1000 on the host.</p>
<pre class="code shell full-width literal-block"><code>systemd-nspawn<span class="w"> </span>-M<span class="w"> </span>github-runner<span class="w"> </span><span class="se">\
</span><span class="w"> </span>--private-users-ownership<span class="o">=</span>chown<span class="w"> </span><span class="se">\
</span><span class="w"> </span>--private-users<span class="o">=</span>pick<span class="w"> </span><span class="se">\
</span><span class="w"> </span>/bin/true</code></pre>
<p>This lets systemd-nspawn choose a uid itself. Or you can put your own value
in there explicitly. It should output something like:</p>
<pre class="code full-width literal-block"><code>Selected user namespace base 570097664 and range 65536.</code></pre>
<p>Now, <code>/var/lib/machines/github-runner</code> should be owned by
<code>570097664:570097664</code> and <code>/var/lib/machines/github-runner/home/ghrunner</code>
owned by <code>570098664:570098664</code>.</p>
<p>systemd-nspawn apparently also supports doing this map at runtime without
modifying the files on disk, but I don’t think it works right with some of
the options mentioned in the next step.</p>
</li>
<li><p>The <code>/etc/systemd/nspawn/github-runner.nspawn</code> file lets us set defaults for
our machine when systemd-nspawn boots it.</p>
<p>It’s important this file is created <em>after</em> the chown above. Otherwise, that
command will use these options and the <code>[Files]</code> section interferes with
the chown. If you need to redo the chown, comment out that section
temporarily.</p>
<pre class="code ini full-width literal-block"><code><span class="k">[Exec]</span><span class="w">
</span><span class="na">PrivateUsers</span><span class="o">=</span><span class="s">570097664:131072</span><span class="w">
</span><span class="na">ResolvConf</span><span class="o">=</span><span class="s">replace-host</span><span class="w">
</span><span class="na">LinkJournal</span><span class="o">=</span><span class="s">try-host</span><span class="w">
</span><span class="k">[Files]</span><span class="w">
</span><span class="na">ReadOnly</span><span class="o">=</span><span class="s">true</span><span class="w">
</span><span class="na">Volatile</span><span class="o">=</span><span class="s">overlay</span><span class="w">
</span><span class="na">BindReadOnly</span><span class="o">=</span><span class="s">/run/lvm-cache-friend</span></code></pre>
<p><code>BindReadOnly</code> is the directory containing the socket to <code>lvm-cache-friend</code>,
the program that creates LVM snapshots and mounts them into the namespace.</p>
<p><code>Volatile</code> will overlay a temporary filesystem <em>over</em> the machine when it
runs. Changes are made to that temporary filesystem and not persisted.</p>
<p><code>PrivateUsers</code> contains a range, <em>twice</em> as large as the 2^16 default. This is
so that the container can allocate an entire 2^16 to Podman as part of
running workflows.</p>
<p>Part of the Dockerfile<a class="footnote-reference superscript" href="#footnote-5" id="footnote-reference-5" role="doc-noteref"><span class="fn-bracket">[</span>¶<span class="fn-bracket">]</span></a> was to delegate subordinate user and group ids
to allow the GitHub runner user (in the machine) to map the uid range
65536 through 131071 from the machine into a user namespace for Podman. On the
host, that will be the last half of the 131072 length range.</p>
<aside class="footnote-list superscript">
<aside class="footnote superscript" id="footnote-5" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-5">¶</a><span class="fn-bracket">]</span></span>
<pre class="code shell full-width literal-block"><code><span class="nb">echo</span><span class="w"> </span>ghrunner:65536:65536<span class="w"> </span><span class="p">|</span><span class="w"> </span>tee<span class="w"> </span>/etc/subuid<span class="w"> </span>/etc/subgid</code></pre>
</aside>
</aside>
<p>This way, workflows that run Podman can create their own user
namespace by using the second 2^16 that we give from the host to the
machine. That’s why the host uid range is 131072; for those two 2^16 chunks.</p>
<table>
<tbody>
<tr><td><p>570097664</p></td>
<td><p>570163200</p></td>
<td><p>570228736</p></td>
<td><p><em>host</em></p></td>
</tr>
<tr><td><p>0</p></td>
<td><p>65536</p></td>
<td><p>131072</p></td>
<td><p><em>machine</em></p></td>
</tr>
<tr><td><p></p></td>
<td><p>0</p></td>
<td><p>65536</p></td>
<td><p><em>podman</em></p></td>
</tr>
</tbody>
</table>
</li>
<li><p>There is also an override for the <em>service</em> that runs systemd-nspawn for this
machine. (The configuration above was for systemd-nspawn itself. The configuration below
is appended to the systemd service that runs systemd-nspawn.)</p>
<p>It’s easiest to create and modify this with <code>systemctl edit
systemd-nspawn@github-runner</code>. (It will end up creating / modifying a file
probably at
<code>/etc/systemd/system/systemd-nspawn@github-runner.service.d/override.conf</code>.)</p>
<pre class="code ini full-width literal-block"><code><span class="k">[Service]</span><span class="w">
</span><span class="na">Restart</span><span class="o">=</span><span class="s">always</span><span class="w">
</span><span class="na">Environment</span><span class="o">=</span><span class="s">SYSTEMD_SECCOMP=0</span></code></pre>
<p>The <code>Restart</code> setting here is sort of the final piece of the puzzle for
stateless self hosted GitHub runners.</p>
<ol class="arabic simple">
<li><p>The <code>--ephemeral</code> option to <code>config.sh</code> ensures only one job is run before the runner service quits.</p></li>
<li><p><code>OnSuccess=poweroff.target</code> for the service in the machine will
shut down the machine when that service quits.</p></li>
<li><p><code>Restart=always</code> in <code>systemd-nspawn@github-runner.service</code> on the host
ensures that the machine will boot up after shutting down.</p></li>
<li><p><code>Volatile=overlay</code> makes sure changes in the machine are made to a
temporary file system and not persisted between boots.</p></li>
</ol>
<p>The <code>Restart=always</code> option only applies while the service is active. If you stop
systemd-nspawn by deactivating the service with <code>systemctl stop
systemd-nspawn@github-runner.service</code>, then it won’t restart the machine.
This is what we want.</p>
<p>That configuration also sets the <code>SYSTEMD_SECCOMP=0</code> environment variable.
Without it, something complains about keyrings and doesn’t work. I don’t
understand any of what’s going on there. There might be a more responsible
way to fix that. And I’m sure I’ll get around to figuring it out right after
I stop disabling selinux.</p>
</li>
</ol>
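<p>The uid arithmetic from the steps above can be sketched in a few lines of Python. This is only an illustration using the example numbers from this post (base 570097664, a 131072 length range, and the <code>ghrunner:65536:65536</code> subordinate range); the function names are made up for the sketch and aren’t part of any tool.</p>

```python
# Example values from the text; systemd-nspawn may pick a different base for you.
HOST_BASE = 570097664        # "Selected user namespace base"
MACHINE_RANGE = 131072       # PrivateUsers= length: two 2^16 chunks
SUBUID_START = 65536         # from ghrunner:65536:65536 in /etc/subuid
SUBUID_COUNT = 65536


def machine_to_host(machine_uid: int) -> int:
    """Return the host uid that owns files belonging to a uid inside the machine."""
    if not 0 <= machine_uid < MACHINE_RANGE:
        raise ValueError(f"uid {machine_uid} is outside the machine's range")
    return HOST_BASE + machine_uid


def podman_host_range() -> tuple[int, int]:
    """Return (first, end) host uids backing the range delegated to Podman.

    `end` is exclusive, matching the right-hand column of the table above.
    """
    first = machine_to_host(SUBUID_START)
    return first, first + SUBUID_COUNT


root_on_host = machine_to_host(0)          # owner of /var/lib/machines/github-runner
ghrunner_on_host = machine_to_host(1000)   # owner of /home/ghrunner in the machine
podman_range = podman_host_range()
```

<p>With the example base, root in the machine lands on 570097664, <code>ghrunner</code> on 570098664, and the range delegated to Podman covers the second half of the 131072 uids given to the machine.</p>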
<p>Those steps set up the machine. It should be bootable with …</p>
<pre class="code shell full-width literal-block"><code>systemctl<span class="w"> </span>start<span class="w"> </span>systemd-nspawn@github-runner.service</code></pre>
<p>… or start it and set it to start when the host boots with …</p>
<pre class="code shell full-width literal-block"><code>systemctl<span class="w"> </span><span class="nb">enable</span><span class="w"> </span>--now<span class="w"> </span>systemd-nspawn@github-runner.service</code></pre>
<p>There’s just a bit of LVM setup and then also running the service that makes and
mounts LVM snapshots on the host; which was kinda the entire point of this thing
to begin with.</p>
</section><section id="lvm">
<h2>LVM<a class="self-link" title="link to this section" href="#lvm"></a></h2>
<p>For the cache, we need an LVM volume for the service to use as a default to make
snapshots.</p>
<p>This makes a thin <em>pool</em> in an existing volume group called banana.</p>
<pre class="code shell full-width literal-block"><code>lvcreate<span class="w"> </span><span class="se">\
</span><span class="w"> </span>--type<span class="w"> </span>thin-pool<span class="w"> </span><span class="se">\
</span><span class="w"> </span>--size<span class="w"> </span>30G<span class="w"> </span><span class="se">\
</span><span class="w"> </span>banana/friend-pool</code></pre>
<p>And this creates a thin <em>volume</em> in that pool.</p>
<pre class="code shell full-width literal-block"><code>lvcreate<span class="w"> </span><span class="se">\
</span><span class="w"> </span>--thinpool<span class="w"> </span>banana/friend-pool<span class="w"> </span><span class="se">\
</span><span class="w"> </span>--virtualsize<span class="w"> </span>3G<span class="w"> </span><span class="se">\
</span><span class="w"> </span>--name<span class="w"> </span>friend-default<span class="w"> </span><span class="se">\
</span><span class="w"> </span>--addtag<span class="w"> </span>friend:default</code></pre>
<p>The service identifies this as the default volume for snapshots by the tag
<code>friend:default</code> (the volume name and group don’t matter). It will use that
volume if no other snapshot exists as a suitable choice.</p>
<p><code>--virtualsize</code> seems to prevent the volume from growing past that size.
But I’m not sure what setting <code>--size</code> on the pool does. It seems you can
still over-provision it with thin volumes. So you can create <em>more</em> than ten 3G
volumes in a 30G pool, but it seems there is still a 30G limit on the actual
usage of all combined volumes.
So if some IO to a volume would cause the pool to
use more than its 30G limit, then that IO fails with some error. At least in my
<em>limited</em> testing.</p>
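<p>To put numbers on the over-provisioning described above, here is a tiny sketch. It only does the arithmetic for the sizes used in this post (a 30G pool and 3G thin volumes); it doesn’t talk to LVM, and the helper name is invented for the example.</p>

```python
GIB = 1024 ** 3


def overcommit_ratio(pool_size: int, virtual_sizes: list[int]) -> float:
    """Ratio of promised (virtual) space to space the pool can actually back."""
    return sum(virtual_sizes) / pool_size


pool_size = 30 * GIB
volumes = [3 * GIB] * 12      # twelve 3G thin volumes in a 30G pool

ratio = overcommit_ratio(pool_size, volumes)
overprovisioned = ratio > 1   # more was promised than the pool can hold
```

<p>A ratio above one is allowed when creating volumes; going by the limited testing above, it only becomes a problem once the data actually written across all volumes exceeds the pool size.</p>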
<p>Something else that might be worth looking into is the <code>--discards</code> option to
<code>lvcreate</code>. There are some details around thin volumes actually reporting to
use less data when files are deleted. Our service for creating and mounting
snapshots will mount volumes with the <code>discard</code> option by default. So far, that seems to
be good enough. But there might be surprises I’m not aware of.</p>
<p>If it’s not super clear yet; I didn’t end up looking at LVM <em>that much</em> as part of
this project. It was mostly struggling with stupid things in GitHub and Podman.
There’s a whole <code>lvmthin.7</code> man page that I haven’t read. And I don’t want to
either because then I’ll realize that I did everything wrong and I’ll have to
start over.<a class="footnote-reference superscript" href="#footnote-6" id="footnote-reference-6" role="doc-noteref"><span class="fn-bracket">[</span>#<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-6" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-6">#</a><span class="fn-bracket">]</span></span>
<p>But that’s why we like LVM. It works even if you’re dumb and can’t read
man pages and don’t know what you’re doing.</p>
</aside>
</aside>
</aside>
<aside class="aside">
<p><em>Protip</em> It’s easy enough to grow volumes with something like <code>lvresize
--size +1G</code> and then <code>xfs_growfs</code> if you format it as xfs.</p>
</aside>
<p>After creating our default volume, we want to format it so we can mount it; I
used xfs.</p>
<pre class="code shell full-width literal-block"><code>mkfs.xfs<span class="w"> </span>/dev/banana/friend-default</code></pre>
<p>Then mount it so we can set the owner to the <code>ghrunner</code> user in the machine so
the runner can use the cache. That user’s uid is 1000. So we’ll want to use
1000 plus whatever was selected for user namespace of the machine. In the last
section, the machine is using 570097664 so the <code>ghrunner</code> user will be
570098664.</p>
<pre class="code shell full-width literal-block"><code>mount<span class="w"> </span>/dev/banana/friend-default<span class="w"> </span>/mnt<span class="w">
</span>chown<span class="w"> </span><span class="m">570098664</span>:570098664<span class="w"> </span>/mnt<span class="w">
</span>umount<span class="w"> </span>/mnt</code></pre>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="container231dm.svg">
<img alt="container231lm.svg" src="container231lm.svg" />
</picture>
</section><section id="lvm-cache-friend-service">
<h2>lvm-cache-friend.service<a class="self-link" title="link to this section" href="#lvm-cache-friend-service"></a></h2>
<p>The program that makes and mounts snapshots is a Python script that depends
only on the standard library and some LVM commands being in your PATH.</p>
<p>My service file is really simple.</p>
<pre class="code ini full-width literal-block"><code><span class="k">[Service]</span><span class="w">
</span><span class="na">Type</span><span class="o">=</span><span class="s">simple</span><span class="w">
</span><span class="na">ExecStart</span><span class="o">=</span><span class="s">/usr/local/bin/lvm-cache-friend.py</span><span class="w">
</span><span class="na">Restart</span><span class="o">=</span><span class="s">on-failure</span><span class="w">
</span><span class="k">[Install]</span><span class="w">
</span><span class="na">WantedBy</span><span class="o">=</span><span class="s">multi-user.target</span></code></pre>
<p>It even has <code>Type=simple</code>. So you know it’s not very complicated.</p>
<p>The runner communicates to the service through a unix socket at
<code>/run/lvm-cache-friend/socket</code>.
The parent directory <code>/run/lvm-cache-friend</code> is
bind mounted into the runner’s machine. It should exist before either this
service or the machine boots. Since <code>/run</code> is usually a temporary filesystem,
systemd is often configured to set up these paths at boot.
I haven’t tested this, but I think systemd will do this for us if we add a file
somewhere under <code>/etc/tmpfiles.d</code> containing something like:</p>
<pre class="code ini full-width literal-block"><code><span class="na">d /run/lvm-cache-friend 0755 root root -</span></code></pre>
<p>The script has some options. By default it will
add the tag <code>friend:snapshot</code> to every snapshot it makes.
And, for namespacing, prefix any tags that peers add or search for with <code>friend:cache:</code>.</p>
<p>But otherwise, that’s it.</p>
<p>Start the service with:</p>
<pre class="code shell full-width literal-block"><code>systemctl<span class="w"> </span><span class="nb">enable</span><span class="w"> </span>--now<span class="w"> </span>lvm-cache-friend</code></pre>
<p>And if you view the logs with <code>journalctl -efu lvm-cache-friend</code> or something
you should hopefully be able to confirm the default volume and socket path being used.</p>
<pre class="code shell full-width literal-block"><code>info<span class="w"> </span>default<span class="w"> </span>lv<span class="w"> </span>banana<span class="w"> </span>friend-default<span class="w">
</span>info<span class="w"> </span>listening<span class="w"> </span>on<span class="w"> </span>/run/lvm-cache-friend/socket</code></pre>
</section><section id="quick-summary">
<h2>quick summary<a class="self-link" title="link to this section" href="#quick-summary"></a></h2>
<p>The GitHub workflow contains something like this:</p>
<pre class="code yaml full-width literal-block"><code><span class="nt">jobs</span><span class="p">:</span><span class="w">
  </span><span class="nt">meme</span><span class="p">:</span><span class="w">
    </span><span class="nt">runs-on</span><span class="p">:</span><span class="w"> </span><span class="l-Scalar-Plain">self-hosted</span><span class="w">
    </span><span class="nt">env</span><span class="p">:</span><span class="w">
      </span><span class="nt">XDG_DATA_HOME</span><span class="p">:</span><span class="w"> </span><span class="l-Scalar-Plain">/home/ghrunner/cache/xdg</span><span class="w">
    </span><span class="nt">steps</span><span class="p">:</span><span class="w">
      </span><span class="p-Indicator">-</span><span class="w"> </span><span class="nt">uses</span><span class="p">:</span><span class="w"> </span><span class="l-Scalar-Plain">actions/checkout@v3</span><span class="w">
      </span><span class="p-Indicator">-</span><span class="w"> </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="p-Indicator">|</span><span class="w">
          </span><span class="no">ncat -U /run/lvm-cache-friend/socket &lt;&lt;EOF</span><span class="w">
          </span><span class="no">mount /home/ghrunner/cache</span><span class="w">
          </span><span class="no">&gt; linux ${{ hashFiles('Cargo.lock','Cargo.toml') }}</span><span class="w">
          </span><span class="no">&gt; linux ${{ github.ref_name }}</span><span class="w">
          </span><span class="no">&lt; linux ${{ github.ref_name }} ${{ hashFiles('Cargo.lock','Cargo.toml') }}</span><span class="w">
          </span><span class="no">EOF</span><span class="w">
      </span><span class="p-Indicator">-</span><span class="w"> </span><span class="nt">run</span><span class="p">:</span><span class="w"> </span><span class="l-Scalar-Plain">podman build .</span></code></pre>
<p>This connects to the lvm-cache-friend service on the host over a unix socket and
asks it to create an LVM snapshot and mount it at <code>/home/ghrunner/cache</code> on the
machine.</p>
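<p>For anyone who would rather not depend on <code>ncat</code>, the same request can be sent from Python’s standard library. This is a rough sketch, not part of the repository: the function names are invented, the <code>&gt;</code> and <code>&lt;</code> lines are passed through verbatim exactly as they appear in the workflow, and it assumes the service reads until the client stops sending (as it would when the heredoc ends) before replying.</p>

```python
import socket

SOCKET_PATH = "/run/lvm-cache-friend/socket"


def build_request(mount_point: str, lines: list[str]) -> str:
    """Return the request text: a mount line, then the given lines unchanged."""
    return "".join(f"{line}\n" for line in [f"mount {mount_point}", *lines])


def send_request(request: str, path: str = SOCKET_PATH) -> str:
    """Send the request over the unix socket and return whatever comes back."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(request.encode())
        # Assumption: half-closing tells the service the request is complete.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks).decode()


# Placeholder values standing in for hashFiles(...) and github.ref_name.
request = build_request(
    "/home/ghrunner/cache",
    ["> linux abc123", "> linux main", "< linux main abc123"],
)
```

<p>Calling <code>send_request(request)</code> inside the machine would then do roughly what the <code>ncat</code> step does, provided the bind-mounted socket exists.</p>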
<p>Setting the <code>XDG_DATA_HOME</code> environment variable tells Podman to use a path
somewhere in the mounted volume for container storage instead of
<code>~/.local/share/containers/storage</code>.</p>
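<p>The path Podman ends up using can be worked out like this. It’s a small sketch of the usual XDG fallback rule for rootless container storage, assuming no <code>storage.conf</code> overrides it; the helper name is made up for illustration.</p>

```python
import os


def podman_storage_path(environ=None) -> str:
    """Return where rootless container storage lives for a given environment."""
    env = os.environ if environ is None else environ
    # XDG rule: fall back to ~/.local/share when XDG_DATA_HOME is unset or empty.
    data_home = env.get("XDG_DATA_HOME") or os.path.join(
        os.path.expanduser("~"), ".local", "share"
    )
    return os.path.join(data_home, "containers", "storage")


with_cache = podman_storage_path({"XDG_DATA_HOME": "/home/ghrunner/cache/xdg"})
```

<p>With the workflow’s setting, storage lands under <code>/home/ghrunner/cache/xdg/containers/storage</code>, which is inside the mounted snapshot.</p>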
<p>Running <code>podman build</code> after making a change to the source code, we see the
runner use the cache part of the way; up until copying the changed code at step
five.</p>
<pre class="code full-width literal-block"><code>STEP 1/7: FROM registry.suse.com/bci/rust:1.66 AS build
STEP 2/7: WORKDIR /graph-do-smell
--> Using cache 04e88ce4a178ccb9b25c0c9ccb362e283083fe94e98ae19dcc0ade8896753a02
--> 04e88ce4a17
STEP 3/7: COPY Cargo.toml Cargo.lock .
--> Using cache c150b0fdb80a8537f513da3178453130692ff19e5dbb94dd33765d9cd323effe
--> c150b0fdb80
STEP 4/7: RUN cargo fetch --locked --target x86_64-unknown-linux-gnu
--> Using cache c81843da6bc54b4781629c28a62b8fdc64d1997cb952b6b31275c8d0ee1a211f
--> c81843da6bc
STEP 5/7: COPY . .
--> 5848547a2fc
STEP 6/7: RUN cargo build --locked --release --offline
Compiling proc-macro2 v1.0.51
Compiling unicode-ident v1.0.6
Compiling quote v1.0.23
Compiling syn v1.0.109
...</code></pre>
<p>Pretty freakin’ neato.</p>
<aside class="aside">
<p>Fun fact; if the runner tries to run <code>podman build</code> without using the cache, it will
likely fail while trying to set up an overlay. I think you aren’t allowed to
make an overlay backed by another overlay. And since the machine was configured
with <code>Volatile=overlay</code>, it’s running with
an overlay over its root filesystem. Podman notices this and tries to use an
overlay with fuse instead of overlayfs, but it can’t I guess because
<code>/dev/fuse</code> isn’t available to the machine or something. But, Podman <em>can</em>
create overlays without fuse if it uses the mount provided by
lvm-cache-friend. Since that filesystem type isn’t <code>overlay</code> like the rest of
the machine.</p>
</aside>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="container847dm.svg">
<img alt="container847lm.svg" src="container847lm.svg" />
</picture>
</section><section id="graph-do-smell">
<h2>graph-do-smell<a class="self-link" title="link to this section" href="#graph-do-smell"></a></h2>
<p>I made <a class="reference external" href="https://github.com/sqwishy/graph-do-smell">a repository on GitHub for this at
github.com/sqwishy/graph-do-smell</a>. The Dockerfile and the Python
script in the <code>misc/</code> directory show a few more details about building
the GitHub runner machine and running the lvm-cache-friend service.</p>
<p>When I started this, I wanted an uncomplicated program to test this with so I
picked a weird Rust thing I made a long time ago but never did anything with.
All it does is fetch and parse a web page and provide a GraphQL “API” to the HTML
document.</p>
<p>For example, getting the list of titles and links from the front page of
“hacker” “news”.</p>
<pre class="code shell full-width literal-block"><code>cargo<span class="w"> </span>run<span class="w"> </span>--<span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="s1">'{
get(url: "https://news.ycombinator.com/")
{
select(select: ".athing .titleline > a")
{ text href }
}
}'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><<span class="w"> </span>/dev/null<span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span>.get.select<span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"text"</span>:<span class="w"> </span><span class="s2">"Copyright Registration Guidance: Works containing material generated by AI"</span>,<span class="w">
</span><span class="s2">"href"</span>:<span class="w"> </span><span class="s2">"https://www.federalregister.gov/documents/2023/03/16/2023-05321/copyright-registration-guidance-works-containing-material-generated-by-artificial-intelligence"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"text"</span>:<span class="w"> </span><span class="s2">"Show HN: GPT Repo Loader – load entire code repos into GPT prompts"</span>,<span class="w">
</span><span class="s2">"href"</span>:<span class="w"> </span><span class="s2">"https://github.com/mpoon/gpt-repository-loader"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"text"</span>:<span class="w"> </span><span class="s2">"Transformers.js"</span>,<span class="w">
</span><span class="s2">"href"</span>:<span class="w"> </span><span class="s2">"https://xenova.github.io/transformers.js/"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"text"</span>:<span class="w"> </span><span class="s2">"A token-smuggling jailbreak for ChatGPT-4"</span>,<span class="w">
</span><span class="s2">"href"</span>:<span class="w"> </span><span class="s2">"https://twitter.com/alexalbert__/status/1636488551817965568"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"text"</span>:<span class="w"> </span><span class="s2">"Show HN: Alpaca.cpp – Run an Instruction-Tuned Chat-Style LLM on a MacBook"</span>,<span class="w">
</span><span class="s2">"href"</span>:<span class="w"> </span><span class="s2">"https://github.com/antimatter15/alpaca.cpp"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"text"</span>:<span class="w"> </span><span class="s2">"Web Stable Diffusion"</span>,<span class="w">
</span><span class="s2">"href"</span>:<span class="w"> </span><span class="s2">"https://github.com/mlc-ai/web-stable-diffusion"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span>...</code></pre>
<p>Or the items from the index of <a class="reference external" href="https://froghat.ca">froghat.ca</a>.</p>
<pre class="code shell full-width literal-block"><code>cargo<span class="w"> </span>run<span class="w"> </span>--<span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="s1">'{
get(url: "https://froghat.ca")
{
select(select: "li:not(.delimiter)")
{
title: select(select: "a.title") { text href }
time: select(select: "time") { datetime: attr(attr: "datetime") }
}
}
}'</span><span class="w"> </span><<span class="w"> </span>/dev/null<span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span>.get.select<span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"title"</span>:<span class="w"> </span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"text"</span>:<span class="w"> </span><span class="s2">"Why is it four clicks to view GitHub workflow logs?"</span>,<span class="w">
</span><span class="s2">"href"</span>:<span class="w"> </span><span class="s2">"https://froghat.ca/2023/02/github-clicks"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span>,<span class="w">
</span><span class="s2">"time"</span>:<span class="w"> </span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"datetime"</span>:<span class="w"> </span><span class="s2">"2023-02-21"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"title"</span>:<span class="w"> </span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"text"</span>:<span class="w"> </span><span class="s2">"Think Helvetica"</span>,<span class="w">
</span><span class="s2">"href"</span>:<span class="w"> </span><span class="s2">"https://froghat.ca/2023/02/think-helvetica"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span>,<span class="w">
</span><span class="s2">"time"</span>:<span class="w"> </span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"datetime"</span>:<span class="w"> </span><span class="s2">"2023-02-06"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"title"</span>:<span class="w"> </span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"text"</span>:<span class="w"> </span><span class="s2">"CEO Robrick-Patbert Froghat’s email to froghat.ca employees"</span>,<span class="w">
</span><span class="s2">"href"</span>:<span class="w"> </span><span class="s2">"https://froghat.ca/2022/11/to-froghat.ca-employees"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span>,<span class="w">
</span><span class="s2">"time"</span>:<span class="w"> </span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"datetime"</span>:<span class="w"> </span><span class="s2">"2022-11-30"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span>...</code></pre>
<p>So the workflow for graph-do-smell builds this Rust program in Podman on my
self-hosted runner using the funny cache.</p>
<p>Anyway, I had imagined this post to be <em>much</em> shorter. Just featuring
snapshots of thin volumes with LVM and using it from a workflow.
There ended up being some surprises, like troubleshooting Podman and
figuring out the options for the GitHub runner and setting it up to re-register
itself every single time it starts up after doing one single job.</p>
<p>After all that, I don’t even know how to end this post.
I can inventory the things and explain them and how they’re barely held
together; but I can’t convey the experience of it working over time – which would be kind of the neat part.
And every time I read this, my brain needs a shower.
Maybe it’s trying to remind me that computers are for looking at cat pictures.
I guess I can pretend like this unsatisfying conclusion is
some poetic parallel to the described experience itself.</p>
<p>On the bright side, the GitHub token this thing is using will expire in a few
days and the actual literal daemon will stop working and disappear in time
like all things.</p>
</section><p>If I see another blog post with a stock photo of a shipping container I’m gonna heckin’ lose it.</p>
2023-03-16T12:00:00-07:00https://froghat.ca/2023/02/github-clicksWhy is it four clicks to view GitHub workflow logs?2023-02-21T12:00:00-08:00sqwishy<p>For about three weeks I’ve been trying to write a blog post about caching for
GitHub runners using <abbr title="Logical Volume Manager">LVM</abbr> snapshots. Of the time
spent so far, maybe 3% has been having fun and actually doing things
with LVM. The rest has just been making angry faces at GitHub actions and GitHub
runners.</p>
<aside class="aside">
<p>I don’t like this kind of content. The kind I’m writing now in this post. It’s
miserable. I’m trying to peel off the angry GitHub rant from the other
post I’m writing and <em>containerize</em> it here – where nobody will find it and
where it can’t hurt anyone – and where it won’t detract from the
spirit of the post that I want to write that tells a story about using
LVM and Linux to do a cool weird thing and how computers <em>can be fun sometimes</em>.</p>
<p>The downside is that this isn’t a nice post with a happy side. Like my
<a class="reference external" href="/2020/09/phone-rant">phone rant</a> with the predictive text or swapping in
a new pattern for underneath your phone case to show off to your friends
tomorrow at school.</p>
</aside>
<p>When a GitHub workflow runs, usually when you push commits with git, it runs
some programs on a computer and it will either do the things or fail. Also,
GitHub has logs about the things it tried to do.</p>
<p>This is how to view those logs. Starting from the index of workflow runs (the actions tab in the project).</p>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="1d.png">
<img alt="First, click the workflow run." src="1l.png" />
</picture>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="2d.png">
<img alt="Second, click the job in the workflow." src="2l.png" />
</picture>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="3d.png">
<img alt="Third, click the cog in the corner to reveal a dropdown." src="3l.png" />
</picture>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="4d.png">
<img alt="Finally, click &quot;View raw logs&quot; in the dropdown." src="4l.png" />
</picture>
<aside class="aside">
<p><em>Bonus Meme:</em> I was taking screenshots in light mode and it
looked like this panel was still in dark mode. I thought it was setting the
theme with JavaScript or something and was stateful so I tried refreshing a
bunch. But I think it’s just like this. I think this is the light mode.</p>
<a class="reference external image-reference" href="github-light-mode.png"><img alt="github-light-mode.png" src="github-light-mode.png" /></a>
</aside>
<p>It seems like there could be a shortcut for this. Nearly always, I’m wanting to
view the logs for the last job that failed.</p>
<p>On occasion, I want logs for the most recent workflow run even if it
succeeded. These workflows have a single job so picking the job is easy when
there’s just one option. Even then, showing the logs for all the jobs wouldn’t
be terrible.</p>
<p>The screenshots don’t really do it justice, but if you aren’t as familiar with
GitHub actions as someone who authors workflows (a fucking nerd), it’s not
impossible to get lost and click the ellipsis button in the top right at either
step two or three. It’s got the same general <em>top right of the whatever
area</em> for your spatial memory to sort out.</p>
<p>Maybe they’ll figure out AI so I can just ask it to click the things
to show me logs.</p>
<hr class="docutils" />
<p>At some point I got baited into trying to fetch the logs
with GitHub’s REST API. You have to add an access token so that <code>curl</code>, or
whatever program you’re using to make requests, can authenticate. GitHub scolds
you if you try to add a token with no expiry, so I figured “Okay, I’ll stop
trying to fight you, GitHub. I’ll do what you say and try to keep an open mind.
After all, I can’t stop progress or whatever this shit is.”</p>
<p>I set the token to expire in seven days and immediately got four emails
telling me about the thing I just did. Two telling me that I had added the
token and two that it was expiring in seven days. By the end of the week there
were two more emails telling me again that it was going to expire and then
two more once it had expired.</p>
<p>It felt like one of those Windows installers that tries to also install a
browser toolbar or something. And it says “don’t uncheck this to skip leaving
this offer for a 14-day demo of Norton AntiVirus uninstalled from your
computer”. You might have agreed to it doing the thing that it does, but it’s
not what you wanted and it really feels like it tricked you into giving it
permission to be annoying at you.</p>
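<p>For what it’s worth, the requests involved probably look something like this.
It’s a rough sketch using only the Python standard library, not what I actually
ran. The function names are made up for illustration, and it assumes the
documented “list workflow runs” and “download workflow run logs” endpoints of
GitHub’s REST API. It only builds the requests and doesn’t send anything.</p>

```python
import urllib.parse
import urllib.request

API = "https://api.github.com"


def failed_runs_url(owner, repo):
    """URL listing only the most recent failed workflow run of a repository."""
    query = urllib.parse.urlencode({"status": "failure", "per_page": 1})
    return f"{API}/repos/{owner}/{repo}/actions/runs?{query}"


def run_logs_url(owner, repo, run_id):
    """URL for the log archive of a single workflow run."""
    return f"{API}/repos/{owner}/{repo}/actions/runs/{run_id}/logs"


def authenticated(url, token):
    """Wrap a URL in a GET request that carries the access token."""
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
```

<p>Passing the result of <code>authenticated(...)</code> to
<code>urllib.request.urlopen</code> would send it. The run id for the second
request comes out of the JSON returned by the first, and the logs arrive as a
zip archive rather than plain text.</p>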
<hr class="docutils" />
<p>Also, this is what you see when you expand one of those steps in the workflow
job summary thing. When I first saw it, I thought it meant the step failed and
nothing was written to standard error. But they <em>all</em> say that all the time
even if the whole thing succeeded. I don’t know why. I guess it’s just one of
nature’s secrets.</p>
<img alt="github-step-error.png" src="github-step-error.png" />
<p>:O blogger reacts to the number of clicks <em>exploding head emoji</em></p>
2023-02-21T12:00:00-08:00https://froghat.ca/2023/02/think-helveticaThink Helvetica2023-02-06T12:00:00-08:00sqwishy<p>A few years ago, it started becoming a thing for “responsive” websites to
have horizontal scroll bars. But, when you scrolled over, the new area revealed
itself to be empty. You could scroll up and down the mysterious region of
whitespace; wondering why it’s there; imagining its potential. What would it be
when it grows up?</p>
<p>Until you’d arrive at some table or iframe overflowing its
container on the main part of the page and spilling out into this otherwise
void and perfect landscape. It was just an accident.</p>
<video src="i-love-scrolling.webm" controls></video><p>These days, many web browsers will hide scroll bars until you hover or try to
interact with them. But back in ancient times, when these wacky empty regions
and horizontal scrolling had just started becoming more common,
hiding the scroll bar was <em>a new thing</em> that macOS started doing. And since
every web developer has been using macOS since – I don’t know, around the time
the Zune came out I guess – web developers didn’t notice horizontal scroll
bars when they made pages that weren’t meant to have them.</p>
<p>Today, macOS is still quite popular. And along with horizontal scroll bars,
there are websites all over the place with stylesheets that look like this.</p>
<pre class="code css literal-block"><code><span class="c">/* from the website in the video above */</span><span class="w">
</span><span class="nt">font-family</span><span class="o">:</span><span class="w"> </span><span class="nt">-apple-system</span><span class="o">,</span><span class="w"> </span><span class="nt">BlinkMacSystemFont</span><span class="o">,</span><span class="w">
</span><span class="s2">"Segoe UI"</span><span class="o">,</span><span class="w"> </span><span class="nt">Roboto</span><span class="o">,</span><span class="w"> </span><span class="nt">Oxygen-Sans</span><span class="o">,</span><span class="w">
</span><span class="nt">Ubuntu</span><span class="o">,</span><span class="w"> </span><span class="nt">Cantarell</span><span class="o">,</span><span class="w">
</span><span class="s2">"Helvetica Neue"</span><span class="o">,</span><span class="w"> </span><span class="nt">sans-serif</span><span class="w">
</span><span class="c">/* GitHub */</span><span class="w">
</span><span class="nt">font-family</span><span class="o">:</span><span class="w"> </span><span class="nt">-apple-system</span><span class="o">,</span><span class="w"> </span><span class="nt">BlinkMacSystemFont</span><span class="o">,</span><span class="w">
</span><span class="s2">"Segoe UI"</span><span class="o">,</span><span class="w"> </span><span class="s2">"Noto Sans"</span><span class="o">,</span><span class="w">
</span><span class="nt">Helvetica</span><span class="o">,</span><span class="w"> </span><span class="nt">Arial</span><span class="o">,</span><span class="w"> </span><span class="nt">sans-serif</span><span class="o">,</span><span class="w">
</span><span class="s2">"Apple Color Emoji"</span><span class="o">,</span><span class="w"> </span><span class="s2">"Segoe UI Emoji"</span><span class="w">
</span><span class="c">/* stack overflow */</span><span class="w">
</span><span class="nt">font-family</span><span class="o">:</span><span class="w"> </span><span class="nt">-apple-system</span><span class="o">,</span><span class="w"> </span><span class="nt">BlinkMacSystemFont</span><span class="o">,</span><span class="w">
</span><span class="s2">"Segoe UI Adjusted"</span><span class="o">,</span><span class="w"> </span><span class="s2">"Segoe UI"</span><span class="o">,</span><span class="w">
</span><span class="s2">"Liberation Sans"</span><span class="o">,</span><span class="w"> </span><span class="nt">sans-serif</span><span class="w">
</span><span class="c">/* some random website I started doing contract
* work on using the latest trendy js/css stack */</span><span class="w">
</span><span class="nt">font-family</span><span class="o">:</span><span class="w"> </span><span class="nt">-apple-system</span><span class="o">,</span><span class="w"> </span><span class="nt">BlinkMacSystemFont</span><span class="o">,</span><span class="w">
</span><span class="nt">Segoe</span><span class="w"> </span><span class="nt">UI</span><span class="o">,</span><span class="w"> </span><span class="nt">Roboto</span><span class="o">,</span><span class="w"> </span><span class="nt">Oxygen</span><span class="o">,</span><span class="w"> </span><span class="nt">Ubuntu</span><span class="o">,</span><span class="w">
</span><span class="nt">Cantarell</span><span class="o">,</span><span class="w"> </span><span class="nt">Fira</span><span class="w"> </span><span class="nt">Sans</span><span class="o">,</span><span class="w"> </span><span class="nt">Droid</span><span class="w"> </span><span class="nt">Sans</span><span class="o">,</span><span class="w">
</span><span class="nt">Helvetica</span><span class="w"> </span><span class="nt">Neue</span><span class="o">,</span><span class="w"> </span><span class="nt">sans-serif</span></code></pre>
<p>I don’t know if it’s because I’m getting older or because I melted my eyes with
laser eye surgery; but, in the last couple years, I’ve been changing my system
font like I did my desktop wallpaper and compiz themes when I was a teenager.</p>
<p>After some time, I started suspecting that websites weren’t using my preferred
fonts. It turns out, my system fonts were being passed over in favour of this
really trendy new font called -apple-system.</p>
<p>I’ve heard that -apple-system necessarily points to a font called <a class="reference external" href="https://en.wikipedia.org/wiki/San_Francisco_(sans-serif_typeface)">San
Francisco</a>. And that you must call it -apple-system instead of its actual font
name. One explanation suggested for that is to enable things like
monospaced numerals. But the <a class="reference external" href="https://developer.mozilla.org/en-US/docs/Web/CSS/font-feature-settings">font-feature-settings</a> CSS property can ask
for monospaced numerals in other fonts without aliases. Probably it only
works when fonts support the specific features requested, but it would be
surprising if Apple couldn’t figure out how to support it in San Francisco the
same way everyone else does in theirs. So this explanation doesn’t make a lot
of sense to me.</p>
<p>It’s also possible that -apple-system is an alias for a font particular to the
user or device. For instance, serif is different between my laptop and my phone
because my phone doesn’t have Papyrus.</p>
<p>In that case, using -apple-system in CSS makes sense as a way for macOS users to
have websites that look nice by using their preferred system font. Although,
on systems with no -apple-system font, browsers will fall back to Segoe UI or
Roboto or Helvetica or Arial; and use one of those basic bitch fonts before
ever getting to the system’s configured sans-serif font. Which seems
inconsiderate toward non-macOS users, but would surprise me less than the
alternative hypothetical where Apple created a
Helvetica-based font named San Francisco that you can’t use by its actual name
and must instead call it -apple-system and which macOS users enjoy so
gosh darn much that they’ve decided every non-macOS device that visits a website
should use a font that best reproduces Helvetica.</p>
<p>Either way, I don’t really like Helvetica that much.</p>
<section id="config-fontconfig-fonts-conf-snippet">
<h2><code>.config/fontconfig/fonts.conf</code> snippet<a class="self-link" title="link to this section" href="#config-fontconfig-fonts-conf-snippet"></a></h2>
<p>Mapping -apple-system to sans-serif in fonts.conf seems to help de-Helvetica
the web.</p>
<pre class="code xml literal-block"><code><span class="nt">&lt;alias&gt;</span><span class="w">
  </span><span class="nt">&lt;family&gt;</span>-apple-system<span class="nt">&lt;/family&gt;</span><span class="w">
  </span><span class="nt">&lt;prefer&gt;&lt;family&gt;</span>sans-serif<span class="nt">&lt;/family&gt;&lt;/prefer&gt;</span><span class="w">
</span><span class="nt">&lt;/alias&gt;</span></code></pre>
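<p>To see why that alias changes anything, here is a toy model of how a
font-family stack gets resolved. It’s a simplification for illustration and not
how fontconfig or any browser actually matches fonts. The function name and the
set of installed fonts are made up, and the stack is GitHub’s from earlier with
the emoji fonts left out.</p>

```python
GITHUB_STACK = [
    "-apple-system", "BlinkMacSystemFont", "Segoe UI", "Noto Sans",
    "Helvetica", "Arial", "sans-serif",
]


def pick_font(stack, installed, generic, aliases=None):
    """Return the first family in the stack that resolves to a real font.

    installed: set of font families present on the system (hypothetical)
    generic:   what generic names like sans-serif are configured to mean
    aliases:   extra name substitutions, like the fonts.conf snippet above
    """
    aliases = aliases or {}
    for family in stack:
        family = aliases.get(family, family)
        if family in generic:
            return generic[family]
        if family in installed:
            return family
    return None


installed = {"Comic Neue", "Noto Sans", "Helvetica"}
generic = {"sans-serif": "Comic Neue"}

# Without the alias, the first installed name in the stack wins.
without_alias = pick_font(GITHUB_STACK, installed, generic)

# With the alias, -apple-system resolves to the configured sans-serif first.
with_alias = pick_font(
    GITHUB_STACK, installed, generic, aliases={"-apple-system": "sans-serif"}
)
```

<p>In this model <code>without_alias</code> is Noto Sans, because the configured
sans-serif sits at the very end of the stack, and <code>with_alias</code> is
Comic Neue.</p>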
</section><section id="config-fontconfig-fonts-conf">
<h2><code>.config/fontconfig/fonts.conf</code><a class="self-link" title="link to this section" href="#config-fontconfig-fonts-conf"></a></h2>
<p>For context, this is an example of a user fonts.conf where that snippet above
is used.</p>
<pre class="code xml literal-block"><code><span class="cp">&lt;?xml version="1.0"?&gt;</span><span class="w">
</span><span class="cp">&lt;!DOCTYPE fontconfig SYSTEM "urn:fontconfig:fonts.dtd"&gt;</span><span class="w">
</span><span class="nt">&lt;fontconfig&gt;</span><span class="w">
  </span><span class="nt">&lt;alias&gt;</span><span class="w">
    </span><span class="nt">&lt;family&gt;</span>-apple-system<span class="nt">&lt;/family&gt;</span><span class="w">
    </span><span class="nt">&lt;prefer&gt;&lt;family&gt;</span>sans-serif<span class="nt">&lt;/family&gt;&lt;/prefer&gt;</span><span class="w">
  </span><span class="nt">&lt;/alias&gt;</span><span class="w">
  </span><span class="nt">&lt;alias&gt;</span><span class="w">
    </span><span class="nt">&lt;family&gt;</span>ui-monospace<span class="nt">&lt;/family&gt;</span><span class="w">
    </span><span class="nt">&lt;prefer&gt;&lt;family&gt;</span>monospace<span class="nt">&lt;/family&gt;&lt;/prefer&gt;</span><span class="w">
  </span><span class="nt">&lt;/alias&gt;</span><span class="w">
  </span><span class="nt">&lt;alias&gt;</span><span class="w">
    </span><span class="nt">&lt;family&gt;</span>monospace<span class="nt">&lt;/family&gt;</span><span class="w">
    </span><span class="nt">&lt;prefer&gt;&lt;family&gt;</span>Comic<span class="w"> </span>Neue<span class="nt">&lt;/family&gt;&lt;/prefer&gt;</span><span class="w">
  </span><span class="nt">&lt;/alias&gt;</span><span class="w">
  </span><span class="nt">&lt;alias&gt;</span><span class="w">
    </span><span class="nt">&lt;family&gt;</span>sans-serif<span class="nt">&lt;/family&gt;</span><span class="w">
    </span><span class="nt">&lt;prefer&gt;&lt;family&gt;</span>Comic<span class="w"> </span>Neue<span class="nt">&lt;/family&gt;&lt;/prefer&gt;</span><span class="w">
  </span><span class="nt">&lt;/alias&gt;</span><span class="w">
  </span><span class="nt">&lt;alias&gt;</span><span class="w">
    </span><span class="nt">&lt;family&gt;</span>serif<span class="nt">&lt;/family&gt;</span><span class="w">
    </span><span class="nt">&lt;prefer&gt;&lt;family&gt;</span>Comic<span class="w"> </span>Neue<span class="nt">&lt;/family&gt;&lt;/prefer&gt;</span><span class="w">
  </span><span class="nt">&lt;/alias&gt;</span><span class="w">
</span><span class="nt">&lt;/fontconfig&gt;</span></code></pre>
</section><section id="before">
<h2>before<a class="self-link" title="link to this section" href="#before"></a></h2>
<img alt="before.png" src="before.png" />
</section><section id="after">
<h2>after<a class="self-link" title="link to this section" href="#after"></a></h2>
<img alt="after.png" src="after.png" />
<hr class="docutils" />
<p>Also, I don’t have anything against specific fonts on websites generally. I
think they’re cool when they’re on purpose. That’s how I find new fonts that I
like. Like <a class="reference external" href="https://fonts.google.com/specimen/Quicksand">Quicksand</a>. And <a class="reference external" href="https://en.wikipedia.org/wiki/Brandon_Grotesque">Brandon</a> on <a class="reference external" href="https://www.kingarthurbaking.com/pro/formulas/sourdough-seed-bread">kingarthurbaking.com</a>. It’s nice to see
a thing that looks some way because someone gave a flippin’ hoot about it
looking that way.</p>
</section><p>tldr; map <span class="docutils literal"><span class="pre">-apple-system</span></span> to <span class="docutils literal"><span class="pre">sans-serif</span></span> in your <span class="docutils literal">fonts.conf</span> for better times</p>
2023-02-06T12:00:00-08:00https://froghat.ca/2022/11/to-froghat.ca-employeesCEO Robrick-Patbert Froghat’s email to froghat.ca employees2022-11-30T12:00:00-08:00sqwishy<!-- Today we’re announcing the hardest change we have had to make at Stripe to
date. We’re reducing the size of our team by around 14% and saying goodbye
to many talented Stripes in the process. If you are among those impacted,
you will receive a notification email within the next 15 minutes.
For those of you leaving: we’re very sorry to be taking this step and John
and I are fully responsible for the decisions leading up to it. -->
<!-- Today we're announcing the hardest shift we have had to make in the history
of MessageBird. We’re reducing the size of our team overall by around 31%
(including the announcement on Oct 31), saying goodbye to many talented
Birds. If you are part of the Birds being impacted you will receive a
notification within 15 minutes to your MessageBird and private email.
For those of you leaving: I’m very sorry to be taking this step and I take
full responsibility for the decisions leading up to it. I hope that the
clarity in this note will show you my full ownership for this decision and
the path forward. -->
<p>Today, we’re announcing the hardest jump we’ve had to make at froghat.ca ever.
We’re reducing the size of our team by roughly 100% and saying goodbye to many
talented Froggies. If you’re one of the Froggies being impacted, you will
receive a croak notification to your Happy Hopper app within the next 15 minutes.</p>
<p>For those of you leaving: I’m vewy sowy to be taking this step and I
take full responsibiwity for the decisions leading up to it. <span class="nowrap">o(>ω<)o</span></p>
<section id="how-we-re-handling-departures">
<h2>how we’re handling departures<a class="self-link" title="link to this section" href="#how-we-re-handling-departures"></a></h2>
<!-- How we’re handling departures
Around 14% of people at Stripe will be leaving the company. We, the
founders, made this decision. We overhired for the world we’re in (more on
that below), and it pains us to be unable to deliver the experience that we
hoped that those impacted would have at Stripe.
Most importantly, while this is definitely not the separation we would have
wanted or imagined when we were making hiring decisions, we want everyone
that is leaving to know that we care about you as former colleagues and
appreciate everything you’ve done for Stripe. In our minds, you are valued
alumni. (In service of that, we’re creating alumni.stripe.com email
addresses for everyone departing, and we’re going to roll this out to all
former employees in the months ahead.)
Career support. We’ll cover career support, and do our best to connect
departing employees with other companies. We’re also creating a new tier of
extra large Stripe discounts for anyone who decides to start a new business
now or in the future.
Our message to other employers is that there are many truly terrific
colleagues departing who can and will do great things elsewhere. Talented
people come to Stripe because they’re attracted to hard infrastructure
problems and complex challenges. Today doesn’t change that, and they would
be fantastic additions at almost any other company. -->
<!-- Around 31% of people at MessageBird will have left the company by the end of
this year, compared to our headcount numbers in October of this year. As I
said before, I take full responsibility for this decision. It is incredibly
painful to realise that we could not live up to our expectations of
providing a successful long-term career in the nest to those now impacted.
How we’re handling departures
Please know that this is definitely not a decision I imagined we would make
when we hired you. We know that you would have continued to contribute in an
amazing way under different circumstances. We care about you as former
colleagues and birds in our extended nest, appreciating everything you’ve
done for the company. We would like to encourage you to stay connected with
us so we can share updates and also contact you in the future should an
opportunity to return to MessageBird arise and encourage you to join our
alumni community. We also encourage companies who are looking for amazing
talent to consider our departing Birds; they will thrive in any fast-paced,
high growth environment. -->
<p>Around 100% of people at froghat.ca will be leaving the company.
For those that are impacted, it pains us that we could not live up to their
expectations as a company where they live out the rest of their days in service
to.</p>
<p>Please know that this is definitely not the separation we imagined when we
hired you. But when you have toad the line as long as you have, there’s
nothing left except to take the bandage and ribbit off.</p>
<p>We care about you as former colleagues, and as frogs in our extended pond, and
appreciate everything you’ve done for froghat.ca. We’re creating
alumni.froghat.ca email addresses for all departing Froggies and encourage you
to stay connected with us and our alumni community via the Happy Hopper app.
We’re also creating a new tier of extra large discounts to froghat.ca
merchandise for anyone who decides to start wearing our swag now or in the
future!</p>
<p>We also encourage other employers looking for amazing talent to consider our
departing Froggies. They’re great at dealing with bugs, and would be fantastic
additions at <em>almost</em> any other company.</p>
</section><section id="the-great-leap-forward">
<h2>the great leap forward<a class="self-link" title="link to this section" href="#the-great-leap-forward"></a></h2>
<!-- Going forward
People join Stripe because they want to grow the internet economy and boost
entrepreneurship around the world. Times of economic stress make it even
more important that we find innovative ways to help our users grow and adapt
their businesses. -->
<!-- Our path forward;
You joined MessageBird because you are excited about the opportunities we
have to help our customers make communicating with a business as easy as
talking to a friend. In times of uncertainty around us, our customers are
looking for even more scalable and flexible ways of communicating, hence our
unique opportunity to support them through new products and innovations. -->
<p>People join froghat.ca because they are excited to do business things and
company stuff. Uncertain economic times are even more reason to adapt and grow
and scale company and business and deliver more value to customers and users
with products and innovations.</p>
<!-- Today is a sad day for everyone as we say goodbye to a number of talented
colleagues. But we’re ready for a pitched effort ahead, and we’re putting
Stripe on the right footing to face it.
For the rest of this week, we’ll focus on helping the people who are leaving
Stripe. Next week we’ll reset, recalibrate, and move forward. -->
<!-- Today is a sad day for all of us as we say goodbye to a number of talented
Birds. In the next couple of days, we’ll do our best to support those who
are leaving the company. Next week we will reset and move forward.
While the changes today are painful, I know it is the right thing to do to
keep MessageBird’s position strong and use the global headwinds for those
around us to our advantage and to master the future. -->
<p>Today is a sad day for all as we say goodbye to a number of talented Froggies.
This week, we’ll do our best in supporting those leaving froghat.ca who might
be feeling despondent.</p>
<p>Even though it sucks, I know these changes put us on the right pad.
Next week, we’ll reset and hop forward.</p>
<hr class="docutils" />
<p>If you enjoyed this bit of prose; you can find more on the blogs at <a class="reference external" href="https://stripe.com/en-ca/newsroom/news/ceo-patrick-collisons-email-to-stripe-employees">Stripe</a> <a class="reference external" href="https://archive.is/uQNMb">archive.is</a> and <a class="reference external" href="https://blog.messagebird.com/posts/ceo-robert-vis-email-to-messagebird-employees">MessageBird</a> <a class="reference external" href="https://archive.is/40pcM">archive.is</a>.</p>
</section><p>Layoffs at froghat.ca.</p>
2022-11-30T12:00:00-08:00https://froghat.ca/2022/07/fucking-leetcodeFunny leetcode prank for software job interviews #haha🤣🤣🤣💩2022-07-27T12:00:00-07:00sqwishy<p>I don’t encounter a lot of leetcode for job interviews. But recently I had to
send in a couple solutions along with my application. <em>What happened next will shock you!</em> ⚡💀</p>
<p>One solution I submitted was for this problem:</p>
<blockquote>
<p>Write a function which, given a string, finds the length of the longest sub-string without repeating characters.</p>
</blockquote>
<p>They offered two examples:</p>
<ol class="arabic simple">
<li><p>“abcabcdcx” is 4, the length of the longest valid substring “abcd”.</p></li>
<li><p>“bbbbb” is 1, the length of the longest valid substring “b”.</p></li>
</ol>
<p>To be unambiguous, repeating characters in this case doesn’t mean the same
character in a sequence, like right next to each other. It just means every
character in the substring must only occur once. Otherwise the answer for the
first example would be different.</p>
<p>Here’s the solution I sent:</p>
<pre class="code python full-width literal-block"><code><span class="k">def</span> <span class="nf">longest_non_repeating_substring</span><span class="p">(</span><span class="n">text</span><span class="p">):</span><span class="w">
</span>    <span class="sd">""" finds the length of the longest sub-string without repeating characters
    >>> longest_non_repeating_substring("abcabcdcx")
    4
    >>> longest_non_repeating_substring("bbbbb")
    1
    >>> longest_non_repeating_substring("")
    0
    >>> longest_non_repeating_substring("a")
    1
    >>> longest_non_repeating_substring("abcde")
    5
    >>> longest_non_repeating_substring("abcdeeeeee")
    5
    >>> longest_non_repeating_substring("aaaaaabcde")
    5
    """</span><span class="w">
</span>    <span class="k">return</span> <span class="nb">max</span><span class="p">(</span><span class="n">non_repeating_substrings</span><span class="p">(</span><span class="n">text</span><span class="p">))</span><span class="w">

</span><span class="k">def</span> <span class="nf">non_repeating_substrings</span><span class="p">(</span><span class="n">text</span><span class="p">):</span><span class="w">
</span>    <span class="n">start</span> <span class="o">=</span> <span class="mi">0</span><span class="w">
</span>    <span class="k">for</span> <span class="n">index</span><span class="p">,</span> <span class="n">c</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">text</span><span class="p">):</span><span class="w">
</span>        <span class="n">found</span> <span class="o">=</span> <span class="n">text</span><span class="o">.</span><span class="n">find</span><span class="p">(</span><span class="n">c</span><span class="p">,</span> <span class="n">start</span><span class="p">,</span> <span class="n">index</span><span class="p">)</span><span class="w">
</span>        <span class="k">if</span> <span class="n">found</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">:</span><span class="w">
</span>            <span class="k">continue</span><span class="w">
</span>        <span class="k">yield</span> <span class="n">index</span> <span class="o">-</span> <span class="n">start</span><span class="w">
</span>        <span class="n">start</span> <span class="o">=</span> <span class="n">found</span> <span class="o">+</span> <span class="mi">1</span><span class="w">
</span>    <span class="k">yield</span> <span class="nb">len</span><span class="p">(</span><span class="n">text</span><span class="p">)</span> <span class="o">-</span> <span class="n">start</span></code></pre>
<p>I soon got a response telling me my application was declined:</p>
<blockquote>
<p>Thank you for your interest in our company.
Unfortunately your longest substring script did not pass our test.
Best of luck,</p>
</blockquote>
<p>Boy was this a surprise. 😲</p>
<p>Little did I know, I was being set up for an <em>epic prank</em>. 😜</p>
<p>I signed up on some leetcode website and tested my solution there. But it passed all their tests. 🤔</p>
<p>Fortunately, this company was kind enough to tell me the inputs my submission failed on when I asked them:</p>
<blockquote>
<p>Please test it yourself:</p>
<p>Expected 2 but got 1 for ‘aab’</p>
<p>Expected 3 but got 2 for ‘abac’</p>
<p>Expected 3 but got 2 for ‘ababc’</p>
<p>Expected 3 but got 1 for ‘aabcc’</p>
<p>Expected 3 but got 2 for ‘abacadaeafag’</p>
</blockquote>
<p>After getting these details, I tested my program on the inputs they provided.
But the outputs I got were the <em>expected</em> values, not the values they say they
got. 🤯</p>
<p>Golly, 🤪 I was baffled.</p>
<p>Maybe they inadvertently modified my program somehow when copying it to be tested. 🤨
But I wasn’t sure what change would cause the results they saw.
And I sent a copy of my solution in my first reply to them just to be certain they were running the right thing. 🤓👍 ✅</p>
<p>Maybe they were using Python 2.7 and something was different. 🤔
But I ran the tests on that version and they passed there too. 😵💫</p>
<p>I explained to them that I tested my solution with the inputs they provided and that it #WorksForMe🤣🤣🤣 but, unsurprisingly, I didn’t hear back.</p>
<p>That was a couple weeks ago. But I still wondered what caused this.</p>
<p>Until today when I happened to look at their output again.
I quickly realized they just replaced <code>max</code> with <code>min</code> in my definition of
<code>longest_non_repeating_substring</code>, so as to return the length of the <em>shortest</em>
substring without repeating characters. 😆😂😆😂😆😂</p>
<p>What I sent:</p>
<pre class="code python full-width literal-block"><code><span class="k">def</span> <span class="nf">longest_non_repeating_substring</span><span class="p">(</span><span class="n">text</span><span class="p">):</span><span class="w">
</span>    <span class="k">return</span> <span class="nb">max</span><span class="p">(</span><span class="n">non_repeating_substrings</span><span class="p">(</span><span class="n">text</span><span class="p">))</span></code></pre>
<p>What they ran:</p>
<pre class="code python full-width literal-block"><code><span class="k">def</span> <span class="nf">longest_non_repeating_substring</span><span class="p">(</span><span class="n">text</span><span class="p">):</span><span class="w">
</span>    <span class="k">return</span> <span class="nb">min</span><span class="p">(</span><span class="n">non_repeating_substrings</span><span class="p">(</span><span class="n">text</span><span class="p">))</span></code></pre>
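<p>If you want to check that theory yourself, something like this should do it.
It repeats my generator from above and compares both <code>max</code> and
<code>min</code> against the numbers from their email. The <code>REPORTED</code>
table and the <code>compare</code> helper are just names I made up for this
sketch.</p>

```python
def non_repeating_substrings(text):
    """Yield lengths of candidate substrings without repeated characters."""
    start = 0
    for index, c in enumerate(text):
        found = text.find(c, start, index)
        if found < 0:
            continue
        yield index - start
        start = found + 1
    yield len(text) - start


# input -> (value they expected, value they say they got)
REPORTED = {
    "aab": (2, 1),
    "abac": (3, 2),
    "ababc": (3, 2),
    "aabcc": (3, 1),
    "abacadaeafag": (3, 2),
}


def compare(reported=REPORTED):
    """For each input, check max against 'expected' and min against 'got'."""
    rows = []
    for text, (expected, got) in reported.items():
        lengths = list(non_repeating_substrings(text))
        rows.append((text, max(lengths) == expected, min(lengths) == got))
    return rows


for text, max_matches, min_matches in compare():
    print(text, "max matches expected:", max_matches,
          "| min matches what they got:", min_matches)
```

<p>Every row comes out matching on both counts: <code>max</code> gives what
they expected and <code>min</code> gives what they got.</p>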
<p>I had been #pranked! 🤣💀💀💀</p>
<p>Next time you’re interviewing a candidate for a job, try making a simple change to their program like the one shown here.
You’re sure to both share a bounteous chuckle when they realize your clever deception. 💾💉💪😝</p>
<p>#unexpected #TryNotToLaugh</p>
2022-07-27T12:00:00-07:00https://froghat.ca/2022/01/elmElm2022-01-07T12:00:00-08:00sqwishy<p><a class="reference external" href="https://elm-lang.org/">Elm</a> is a functional programming language. It compiles to JavaScript and has a bunch of
libraries (like <a class="reference external" href="https://package.elm-lang.org/packages/elm/browser/latest/">elm/browser</a>) for building software to run in a web browser.</p>
<p>It has some generics. There’s an option type, <a class="reference external" href="https://package.elm-lang.org/packages/elm/core/latest/Maybe">Maybe</a>, and a <a class="reference external" href="https://package.elm-lang.org/packages/elm/core/latest/Result">Result</a> type.</p>
<p>This, along with the static type checking in the compiler, allows type expressions and
compile-time guarantees that plain JavaScript does not offer.</p>
<p>It’s super fun. And I’m going to say some nice things about it and then explain why I
won’t use it in the future.</p>
<section id="the-tooling-is-nice">
<h2>The tooling is nice.<a class="self-link" title="link to this section" href="#the-tooling-is-nice"></a></h2>
<p>The compiler (written in Haskell) is quite fast compared to my experiences with a
lot of JavaScript tooling or even the TypeScript compiler.</p>
<p>The output for compiler errors is user-friendly too. Here’s a type with three variants
and a function to return a CSS class corresponding to a given variant.</p>
<pre class="code elm full-width literal-block"><code><span class="kr">type</span><span class="w"> </span><span class="kt">Outcome</span><span class="w">
    </span><span class="nf">=</span><span class="w"> </span><span class="kt">Whatever</span><span class="w">
    </span><span class="nf">|</span><span class="w"> </span><span class="kt">Good</span><span class="w">
    </span><span class="nf">|</span><span class="w"> </span><span class="kt">Bad</span><span class="w">

</span><span class="nv">outcomeClass</span><span class="w"> </span><span class="nf">:</span><span class="w"> </span><span class="kt">Outcome</span><span class="w"> </span><span class="nf">-></span><span class="w"> </span><span class="kt">String</span><span class="w">
</span><span class="nv">outcomeClass</span><span class="w"> </span><span class="nv">outcome</span><span class="w"> </span><span class="nf">=</span><span class="w">
    </span><span class="kr">case</span><span class="w"> </span><span class="nv">outcome</span><span class="w"> </span><span class="kr">of</span><span class="w">
        </span><span class="kt">Good</span><span class="w"> </span><span class="nf">-></span><span class="w">
            </span><span class="s">"yay"</span><span class="w">

        </span><span class="kt">Bad</span><span class="w"> </span><span class="nf">-></span><span class="w">
            </span><span class="s">"nay"</span></code></pre>
<p>And this is what the compiler says about this code where a variant is not handled by the
case statement.</p>
<pre class="code full-width literal-block"><code>Detected problems in 1 module.
-- MISSING PATTERNS ----------------------------------------------- src/Roll.elm

This `case` does not have branches for all possibilities:

291|>    case outcome of
292|>        Good ->
293|>            "yay"
294|>
295|>        Bad ->
296|>            "nay"

Missing possibilities include:

    Whatever

I would have to crash if I saw one of those. Add branches for them!

Hint: If you want to write the code for each branch later, use `Debug.todo` as a
placeholder. Read &lt;https://elm-lang.org/0.19.1/missing-patterns&gt; for more
guidance on this workflow.</code></pre>
<p>And, if that’s not enough kool-aid, there’s a tool called <a class="reference external" href="https://github.com/avh4/elm-format">elm-format</a> that formats your
source code. Even making nice whitespace adjustments and removing unneeded parentheses.
For example:</p>
<pre class="code elm literal-block"><code><span class="nv">thing</span><span class="w"> </span><span class="nf">=</span><span class="w"> </span><span class="p">(</span><span class="nv">foo</span><span class="w"> </span><span class="nv">bar</span><span class="p">)</span><span class="w"> </span><span class="nv">baz</span></code></pre>
<p>… becomes …</p>
<pre class="code elm literal-block"><code><span class="nv">thing</span><span class="w"> </span><span class="nf">=</span><span class="w">
</span><span class="nv">foo</span><span class="w"> </span><span class="nv">bar</span><span class="w"> </span><span class="nv">baz</span></code></pre>
<p>It’s great! We’re living in the future.</p>
</section><section id="and-i-really-like-the-way-it-looks">
<h2>And I really like the way it looks.<a class="self-link" title="link to this section" href="#and-i-really-like-the-way-it-looks"></a></h2>
<pre class="code elm literal-block"><code><span class="c1">-- from https://elm-lang.org/examples/buttons</span><span class="w">
</span><span class="nv">view</span><span class="w"> </span><span class="nf">:</span><span class="w"> </span><span class="kt">Model</span><span class="w"> </span><span class="nf">-></span><span class="w"> </span><span class="kt">Html</span><span class="w"> </span><span class="kt">Msg</span><span class="w">
</span><span class="nv">view</span><span class="w"> </span><span class="nv">model</span><span class="w"> </span><span class="nf">=</span><span class="w">
</span><span class="nv">div</span><span class="w"> </span><span class="p">[]</span><span class="w">
</span><span class="p">[</span><span class="w"> </span><span class="nv">button</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="nv">onClick</span><span class="w"> </span><span class="kt">Decrement</span><span class="w"> </span><span class="p">]</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="nv">text</span><span class="w"> </span><span class="s">"-"</span><span class="w"> </span><span class="p">]</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="nv">div</span><span class="w"> </span><span class="p">[]</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="nv">text</span><span class="w"> </span><span class="p">(</span><span class="kt">String</span><span class="nf">.</span><span class="nv">fromInt</span><span class="w"> </span><span class="nv">model</span><span class="p">)</span><span class="w"> </span><span class="p">]</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="nv">button</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="nv">onClick</span><span class="w"> </span><span class="kt">Increment</span><span class="w"> </span><span class="p">]</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="nv">text</span><span class="w"> </span><span class="s">"+"</span><span class="w"> </span><span class="p">]</span><span class="w">
</span><span class="p">]</span></code></pre>
<p>I’m a fan of indentation and whitespace being significant in language syntax.
Not everyone will agree; but <code>{}</code> brackets are punctuation to scan
through, and indentation already serves that purpose.</p>
<p>But, being new to functional programming, it took some time for me to figure out how to read type
signatures and function composition in Elm without a bunch of punctuation everywhere.</p>
<p>For example, a type in Rust (or C++ templates I think?) might resemble <code>List<Foo></code> or
<code>Result<Foo, Error></code>. But something similar in Elm might be <code>List Foo</code> or <code>Result Error
Foo</code>.</p>
<aside class="aside">
<p>That’s what is going on with the <code>Html Msg</code> type in the example above. <code>Html</code> is
generic over some parameter that we’ve given as <code>Msg</code> in that case.</p>
</aside>
<p>Similarly, function calls read without commas delimiting parameters because functions
only really take one argument. If you want to do something with two parameters, like
make a new string by joining two input strings, you might have a function that takes
the <em>first</em> string and returns a function that takes the <em>second</em> string and then
returns a new string by joining the two inputs.<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>I think defining a function like this is called <a class="reference external" href="https://en.wikipedia.org/wiki/Currying">currying</a>.</p>
</aside>
</aside>
</aside>
<p>It might look like:</p>
<pre class="code elm full-width literal-block"><code><span class="nv">wow</span><span class="w"> </span><span class="nf">:</span><span class="w"> </span><span class="kt">String</span><span class="w"> </span><span class="nf">-></span><span class="w"> </span><span class="kt">String</span><span class="w"> </span><span class="nf">-></span><span class="w"> </span><span class="kt">String</span><span class="w">
</span><span class="nv">wow</span><span class="w"> </span><span class="nv">a</span><span class="w"> </span><span class="nv">b</span><span class="w"> </span><span class="nf">=</span><span class="w">
</span><span class="nv">a</span><span class="w"> </span><span class="nf">++</span><span class="w"> </span><span class="s">" "</span><span class="w"> </span><span class="nf">++</span><span class="w"> </span><span class="nv">b</span></code></pre>
<p>… and could be invoked with …</p>
<pre class="code elm full-width literal-block"><code><span class="nv">wow</span><span class="w"> </span><span class="s">"hello"</span><span class="w"> </span><span class="s">"world"</span></code></pre>
<p>… which is the same as …</p>
<pre class="code elm full-width literal-block"><code><span class="p">(</span><span class="nv">wow</span><span class="w"> </span><span class="s">"hello"</span><span class="p">)</span><span class="w"> </span><span class="s">"world"</span></code></pre>
<p>… because the <code>wow "hello"</code> expression evaluates to a function and everything is just
partial applications and order of operations.</p>
<p>The order of operations thing is a big deal because you can use operators with different
precedence to avoid parentheses everywhere.</p>
</section><section id="method-chaining">
<h2>Method Chaining<a class="self-link" title="link to this section" href="#method-chaining"></a></h2>
<p>The <a class="reference external" href="https://package.elm-lang.org/packages/elm/core/latest/Basics#|%3E">|></a> operator is great. It kinda lets you do <a class="reference external" href="https://en.wikipedia.org/wiki/Method_chaining">method chaining</a> without methods.</p>
<p>Look at this Rust code (taken from docs for <a class="reference external" href="https://docs.rs/hyper/0.13.10/hyper/client/struct.Builder.html">an HTTP library called hyper</a>) that uses
method chaining to initialize an HTTP client.</p>
<pre class="code rust full-width literal-block"><code><span class="kd">let</span><span class="w"> </span><span class="n">client</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">Client</span>::<span class="n">builder</span><span class="p">()</span><span class="w">
</span><span class="p">.</span><span class="n">pool_idle_timeout</span><span class="p">(</span><span class="n">Duration</span>::<span class="n">from_secs</span><span class="p">(</span><span class="mi">30</span><span class="p">))</span><span class="w">
</span><span class="p">.</span><span class="n">http2_only</span><span class="p">(</span><span class="kc">true</span><span class="p">)</span><span class="w">
</span><span class="p">.</span><span class="n">build_http</span><span class="p">();</span></code></pre>
<p>You don’t care about how it works<a class="footnote-reference superscript" href="#footnote-2" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a>, only that it looks cool.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-2" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-2">2</a><span class="fn-bracket">]</span></span>
<p><code>Client::builder()</code> returns a <code>Builder</code>
instance and each method takes and returns <code>&mut self</code>, a mutable reference to the <code>Builder</code>
instance. So you can write calls to <code>Builder</code> methods as a way to build up some
configuration on a single object before finally using it to seed initialization of a
<code>Client</code> by calling <code>build_http()</code>.</p>
</aside>
</aside>
</aside>
<p>This is also a popular thing with the <a class="reference external" href="https://doc.rust-lang.org/nightly/core/iter/trait.Iterator.html">Iterator</a> trait in Rust.</p>
<pre class="code rust literal-block"><code><span class="kd">let</span><span class="w"> </span><span class="n">s</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="s">"alpha"</span><span class="p">,</span><span class="w"> </span><span class="s">"beta"</span><span class="p">,</span><span class="w"> </span><span class="s">"gamma"</span><span class="p">]</span><span class="w">
</span><span class="p">.</span><span class="n">iter</span><span class="p">()</span><span class="w">
</span><span class="p">.</span><span class="n">map</span><span class="p">(</span><span class="o">|</span><span class="n">s</span><span class="o">|</span><span class="w"> </span><span class="n">s</span><span class="p">.</span><span class="n">chars</span><span class="p">())</span><span class="w">
</span><span class="p">.</span><span class="n">flatten</span><span class="p">()</span><span class="w">
</span><span class="p">.</span><span class="n">collect</span>::<span class="o"><</span><span class="nb">String</span><span class="o">></span><span class="p">();</span><span class="w">
</span><span class="fm">assert_eq!</span><span class="p">(</span><span class="n">s</span><span class="p">,</span><span class="w"> </span><span class="s">"alphabetagamma"</span><span class="p">);</span></code></pre>
<p>However, this syntax requires methods, not just any function you have lying around.</p>
<p>If you want to put something into the chain that isn’t a method, you have to make
it a method. In Rust, this is possible by creating a trait, but it is mildly awkward.</p>
<p>In Elm, there are no methods, but you can do some cool pipelining. Here are three
expressions that do the same thing.</p>
<pre class="code elm full-width literal-block"><code><span class="c1">-- forwards, with |></span><span class="w">
</span><span class="s">"13, meow, 5, 7, -20"</span><span class="w">
</span><span class="nf">|></span><span class="w"> </span><span class="kt">String</span><span class="nf">.</span><span class="nv">split</span><span class="w"> </span><span class="s">","</span><span class="w">
</span><span class="nf">|></span><span class="w"> </span><span class="kt">List</span><span class="nf">.</span><span class="nv">filterMap</span><span class="w"> </span><span class="p">(</span><span class="kt">String</span><span class="nf">.</span><span class="nv">trim</span><span class="w"> </span><span class="nf">>></span><span class="w"> </span><span class="kt">String</span><span class="nf">.</span><span class="nv">toInt</span><span class="p">)</span><span class="w">
</span><span class="nf">|></span><span class="w"> </span><span class="kt">List</span><span class="nf">.</span><span class="nv">filter</span><span class="w"> </span><span class="p">(</span><span class="nf">(<)</span><span class="w"> </span><span class="mi">0</span><span class="p">)</span><span class="w">
</span><span class="nf">|></span><span class="w"> </span><span class="kt">List</span><span class="nf">.</span><span class="nv">sum</span><span class="w">
</span><span class="nf">|></span><span class="w"> </span><span class="kt">String</span><span class="nf">.</span><span class="nv">fromInt</span><span class="w">
</span><span class="c1">-- like a pyramid, with nested calls</span><span class="w">
</span><span class="kt">String</span><span class="nf">.</span><span class="nv">fromInt</span><span class="w">
</span><span class="p">(</span><span class="kt">List</span><span class="nf">.</span><span class="nv">sum</span><span class="w">
</span><span class="p">(</span><span class="kt">List</span><span class="nf">.</span><span class="nv">filter</span><span class="w">
</span><span class="p">(</span><span class="nf">\</span><span class="nv">n</span><span class="w"> </span><span class="nf">-></span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="nf"><</span><span class="w"> </span><span class="nv">n</span><span class="p">)</span><span class="w">
</span><span class="p">(</span><span class="kt">List</span><span class="nf">.</span><span class="nv">filterMap</span><span class="w">
</span><span class="p">(</span><span class="nf">\</span><span class="nv">s</span><span class="w"> </span><span class="nf">-></span><span class="w"> </span><span class="kt">String</span><span class="nf">.</span><span class="nv">toInt</span><span class="w"> </span><span class="p">(</span><span class="kt">String</span><span class="nf">.</span><span class="nv">trim</span><span class="w"> </span><span class="nv">s</span><span class="p">))</span><span class="w">
</span><span class="p">(</span><span class="kt">String</span><span class="nf">.</span><span class="nv">split</span><span class="w"> </span><span class="s">","</span><span class="w"> </span><span class="s">"13, meow, 5, 7, -20"</span><span class="p">)</span><span class="w">
</span><span class="p">)</span><span class="w">
</span><span class="p">)</span><span class="w">
</span><span class="p">)</span><span class="w">
</span><span class="c1">-- backwards, with << and <|</span><span class="w">
</span><span class="kt">String</span><span class="nf">.</span><span class="nv">fromInt</span><span class="w">
</span><span class="nf"><<</span><span class="w"> </span><span class="kt">List</span><span class="nf">.</span><span class="nv">sum</span><span class="w">
</span><span class="nf"><<</span><span class="w"> </span><span class="kt">List</span><span class="nf">.</span><span class="nv">filter</span><span class="w"> </span><span class="p">(</span><span class="mi">0</span><span class="w"> </span><span class="nf">|></span><span class="w"> </span><span class="nf">(<)</span><span class="p">)</span><span class="w">
</span><span class="nf"><<</span><span class="w"> </span><span class="kt">List</span><span class="nf">.</span><span class="nv">filterMap</span><span class="w"> </span><span class="p">(</span><span class="kt">String</span><span class="nf">.</span><span class="nv">toInt</span><span class="w"> </span><span class="nf"><<</span><span class="w"> </span><span class="kt">String</span><span class="nf">.</span><span class="nv">trim</span><span class="p">)</span><span class="w">
</span><span class="nf"><<</span><span class="w"> </span><span class="kt">String</span><span class="nf">.</span><span class="nv">split</span><span class="w"> </span><span class="s">","</span><span class="w">
</span><span class="nf"><|</span><span class="w">
</span><span class="s">"13, meow, 5, 7, -20"</span></code></pre>
<p>You can read it forwards, or backwards, or like a pyramid, or mix and match all of
these.</p>
<p>There’s nothing special about the functions involved – you don’t have to write
a trait to introduce something into the chain. It’s just partial function application.</p>
<p>You don’t need parentheses or nesting everywhere, and the expressions compose into
something like a pipeline.</p>
<p>I think it looks great and makes me happy when I read it.</p>
<section id="bonus-meme">
<h3>Bonus Meme<a class="self-link" title="link to this section" href="#bonus-meme"></a></h3>
<p>Also, method chaining irritates Robert Cecil Martin who writes:</p>
<blockquote>
<pre class="code java full-width literal-block"><code><span class="kd">final</span><span class="w"> </span><span class="n">String</span><span class="w"> </span><span class="n">outputDir</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ctxt</span><span class="p">.</span><span class="na">getOptions</span><span class="p">().</span><span class="na">getScratchDir</span><span class="p">().</span><span class="na">getAbsolutePath</span><span class="p">();</span></code></pre>
<p>This kind of code is often called a train wreck because it look like a bunch of
coupled train cars. Chains of calls like this are generally considered to be sloppy
style and should be avoided. It is usually best to split them up as follows:</p>
<pre class="code java full-width literal-block"><code><span class="n">Options</span><span class="w"> </span><span class="n">opts</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">ctxt</span><span class="p">.</span><span class="na">getOptions</span><span class="p">();</span><span class="w">
</span><span class="n">File</span><span class="w"> </span><span class="n">scratchDir</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">opts</span><span class="p">.</span><span class="na">getScratchDir</span><span class="p">();</span><span class="w">
</span><span class="kd">final</span><span class="w"> </span><span class="n">String</span><span class="w"> </span><span class="n">outputDir</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">scratchDir</span><span class="p">.</span><span class="na">getAbsolutePath</span><span class="p">();</span></code></pre>
<footer class="attribution">—<em>Clean Code</em> by Robert Cecil Martin</footer>
</blockquote>
<p>The more I read this, the better it gets.</p>
<p>“This … is often called a train wreck”. Who does this?</p>
<p>“… it look like a bunch of coupled train cars.” Okay, let’s say it looks like train
cars. How do you get from “coupled train cars” to a “train wreck”?</p>
<p>Now, Robert probably didn’t spend much time on this part of the book – and I hate to be
an anti-fan – but the suggestion is a little funny to me.</p>
<p>We introduce two variables in our scope, <code>opts</code> and <code>scratchDir</code>. Now, if I were to have
<code>scratchDir</code> and <code>outputDir</code> in scope I would seriously consider making them the same
type or using dissimilar names. Personally, I can see myself having a hard time
remembering if <code>scratchDir</code> is a file handle or a path like <code>outputDir</code> is. And having
memorable names is important.</p>
<p>I mean, in this case, it’s kind of silly because <code>opts</code> and <code>scratchDir</code> aren’t used
later on – so their names don’t have to be meaningful – but, that fact may not be
apparent to someone reading this for the first time.</p>
<p>So I don’t know what Robert is thinking here. At this point, I’m pretty sure he
just really doesn’t like trains.</p>
</section></section><section id="i-made-a-thing">
<h2>I made a thing.<a class="self-link" title="link to this section" href="#i-made-a-thing"></a></h2>
<p>A bit ago I used Elm to make a webshit for rolling dice. You can check out the project
on <a class="reference external" href="https://git.sr.ht/~sqwishy/super-dicey-die-roller/">sr.ht</a> or I might still be hosting a copy of the webshit at <a class="reference external" href="https://dice.froghat.ca">dice.froghat.ca</a>.</p>
<p>Later on, I tried to use Elm for another project.</p>
</section><section id="the-problem-i-had-with-elm">
<h2>The problem I had with Elm.<a class="self-link" title="link to this section" href="#the-problem-i-had-with-elm"></a></h2>
<p>The author of Elm, Evan Czaplicki, gave <a class="reference external" href="https://youtu.be/DSjbTC-hvqQ?t=1152">a talk at some point somewhere</a> (elm-conf
2016?) where he explained that the way he does project management is different from how
it normally is done.</p>
<p>Normally, he explains, projects aim for zero open issues. So, as issues come in,
they’re addressed individually. One at a time, each issue corresponds to some
resolution like a change in the project.</p>
<p>But, for Evan’s projects, issues come in and are left to marinate and to be pondered over
by the wise such that they can be considered holistically. In this way, connections between
multiple issues can be found and then addressed at once and by comparatively fewer
changes overall.</p>
<p>Moreover, by ignoring some issues outright, releases are simpler because they don’t
change as much. And, by releasing infrequently, there is less churn and projects
that depend on ours don’t have to spend as much time upgrading.</p>
<p>Evan emphasizes that his way is a weird and quirky way of doing things. But I don’t
think anybody really does the first thing.</p>
<p>People have lives and, when a new issue comes in for a project they work on, it’s quite
normal for something else to be more immediately important. Even when you aren’t
bouncing between projects, and it’s your full time job, and you have help, a backlog is
a totally normal thing just because doing stuff takes more time than complaining about
it.</p>
<p>And, I don’t know that it makes sense to treat bugs and features in the same way here.
Sometimes, a bug has a pretty clear and obvious solution. You can make one change for
it that doesn’t break any APIs and it doesn’t need to marinate.</p>
<p>Also, when I hear Evan describing this process, it sounds like he’s speaking on the time
scale of weeks. Maybe months for slower projects or for complicated features.</p>
<p>Last year, I stumbled into and reported <a class="reference external" href="https://github.com/elm/virtual-dom/issues/175">a bug</a> with the <a class="reference external" href="https://package.elm-lang.org/packages/elm/virtual-dom/latest/VirtualDom">elm/virtual-dom</a> package. But I
don’t think Evan has been active on that package since 2018. Maybe my issue, and the
others that have been reported to Elm and left open over the years, are marinating.
Wow, what a wacky and unique process.</p>
</section><section id="why-don-t-you-fix-it">
<h2>Why don’t you fix it?<a class="self-link" title="link to this section" href="#why-don-t-you-fix-it"></a></h2>
<blockquote>
<p>Elm does not have a traditional Foreign Function Interface with JavaScript.</p>
<footer class="attribution">—<a class="reference external" href="https://guide.elm-lang.org/interop/limits.html">https://guide.elm-lang.org/interop/limits.html</a></footer>
</blockquote>
<p>Even though Elm compiles to JavaScript, you can’t just invoke any arbitrary JavaScript
from Elm.</p>
<p>Some packages, like elm/virtual-dom, depend on specific JavaScript APIs, like
<a class="reference external" href="https://developer.mozilla.org/en-US/docs/Web/API/Document/createElement">Document.createElement</a>, in order to function. These packages include some special
JavaScript code to translate between the two languages. But the compiler will not
build packages with the JavaScript sources required for the translation unless they fall
under the “elm” or “elm-explorations” namespaces.<a class="footnote-reference superscript" href="#footnote-3" id="footnote-reference-3" role="doc-noteref"><span class="fn-bracket">[</span>3<span class="fn-bracket">]</span></a> That’s just how it’s designed.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-3" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-3">3</a><span class="fn-bracket">]</span></span>
<p>See <a class="reference external" href="https://github.com/elm/compiler/blob/0.19.1/builder/src/Build.hs#L305">builder/src/Build.hs</a>
and <a class="reference external" href="https://github.com/elm/compiler/blob/0.19.1/compiler/src/Elm/Package.hs#L83">compiler/src/Elm/Package.hs</a>.</p>
</aside>
</aside>
</aside>
<p>As a result of this special treatment, users cannot fork elm/virtual-dom and publish
their own version in their own namespace.</p>
<p>You can hack around this by messing with the compiler or with how packages are resolved
on your system,<a class="footnote-reference superscript" href="#footnote-4" id="footnote-reference-4" role="doc-noteref"><span class="fn-bracket">[</span>4<span class="fn-bracket">]</span></a> but that limits your ability to share your work because others then
need the same hacks.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-4" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-4">4</a><span class="fn-bracket">]</span></span>
<p>Some people have already had to do <a class="reference external" href="https://github.com/robinheghan/elm-git-install">this sort of thing</a> so they can use packages
that are not published publicly.</p>
</aside>
</aside>
</aside>
<p>I’m not going to argue with the motivations stated in the document linked above on guide.elm-lang.org. But I
think it’s fair to compare Elm to Go a little bit. Even though Go doesn’t compile to another
language like Elm does, they both have a runtime that makes interoperability a little
bit more complicated. And I think the points in that document can, to some degree, be
argued for both Elm and Go.</p>
<p>Nevertheless, Go has its <a class="reference external" href="https://go.dev/blog/cgo">Cgo</a> thing that lets Go code call into C.</p>
<p>Now, my Go lore is spotty, but it seems as if Evan’s concerns about
package flooding and safety haven’t shown up in Go’s ecosystem. Instead, the community
demonstrates a preference for “pure Go” implementations and bears attitudes that
correspond to the kind of thing that Evan seems to have been trying to reach. But Go
didn’t neuter their toolchain to get there.</p>
<p>To be clear, it is Evan’s <em>right</em> to do whatever he wants with Elm. But I think
that Go’s approach has been more <em>successful</em> at providing utility and even at avoiding
the pitfalls that Evan was trying to avoid when he made FFI a privilege denied to Elm’s
users.</p>
<p>In the end, the lack of an FFI<a class="footnote-reference superscript" href="#footnote-5" id="footnote-reference-5" role="doc-noteref"><span class="fn-bracket">[</span>5<span class="fn-bracket">]</span></a> in Elm prevented me from doing what I needed to finish a
project. I switched the project over to Rust targeting WebAssembly and was able to
release it.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-5" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-5">5</a><span class="fn-bracket">]</span></span>
<p>The elm/virtual-dom issue was one of several complications that required a proper
FFI to solve.</p>
</aside>
</aside>
</aside>
<p>Compared to a lot of other options, Elm has pleasant tooling, an okayer community,
and great documentation. And I think it’s a very fun language. But the fact that
Elm’s community packages don’t get to play by the same rules as Evan’s demonstrates his
lack of belief in the creative potential of his community,
and it comes at the cost of missed opportunities for Elm itself.</p>
</section><p><a class="reference external" href="https://elm-lang.org/">Elm</a> is a fun programming language that I really liked a lot and that I gave
up on last year.</p>
2022-01-07T12:00:00-08:00
https://froghat.ca/2022/02/rewriting-froghat.ca
Rewriting froghat.ca
2022-02-24T12:00:00-08:00
sqwishy
<p>I rewrote my static site generator again. But only a little bit.</p>
<p>My goal was to make things simpler and easier to read and understand. And maybe make
things a bit faster too.</p>
<p>This is the output of cloc (a program that counts lines of code) before …</p>
<table>
<tbody>
<tr><td><p>Language</p></td>
<td><p>files</p></td>
<td><p>blank</p></td>
<td><p>comment</p></td>
<td><p>code</p></td>
</tr>
<tr><td><p>Python</p></td>
<td><p>17</p></td>
<td><p>517</p></td>
<td><p>169</p></td>
<td><p>1595</p></td>
</tr>
<tr><td><p>C</p></td>
<td><p>1</p></td>
<td><p>57</p></td>
<td><p>9</p></td>
<td><p>167</p></td>
</tr>
<tr><td><p>SUM:</p></td>
<td><p>18</p></td>
<td><p>574</p></td>
<td><p>178</p></td>
<td><p>1762</p></td>
</tr>
</tbody>
</table>
<p>… and after …</p>
<table>
<tbody>
<tr><td><p>Language</p></td>
<td><p>files</p></td>
<td><p>blank</p></td>
<td><p>comment</p></td>
<td><p>code</p></td>
</tr>
<tr><td><p>Python</p></td>
<td><p>12</p></td>
<td><p>494</p></td>
<td><p>163</p></td>
<td><p>1332</p></td>
</tr>
<tr><td><p>C</p></td>
<td><p>1</p></td>
<td><p>24</p></td>
<td><p>2</p></td>
<td><p>73</p></td>
</tr>
<tr><td><p>SUM:</p></td>
<td><p>13</p></td>
<td><p>518</p></td>
<td><p>165</p></td>
<td><p>1405</p></td>
</tr>
</tbody>
</table>
<aside class="aside">
<p>cloc has a <code>--diff</code> mode specifically for this kind of comparison, but its output looks weird.</p>
</aside>
<p>So that’s nice. Although maybe the numbers look more dumb if you count the dependencies
from the standard library and from PyPI or wherever.</p>
<p>There were three fun little changes.</p>
<section id="proc-pid-cmdline">
<h2>/proc/[pid]/cmdline<a class="self-link" title="link to this section" href="#proc-pid-cmdline"></a></h2>
<p>For reasons, the site generator uses <a class="reference external" href="https://ninja-build.org/">ninja</a> to execute build steps in order and rerun
steps when dependencies are old or missing. But starting Python for each step is slow.</p>
<p>Instead, ninja runs a small C program for each step. It connects to the main Python process and sends its
command line arguments along with its stdin, stdout, and stderr file descriptors. The main Python
process uses those file descriptors in place of its own, pretending to be that other
program, so ninja captures the output as usual. It runs the step described by the arguments and then
replies to the C program with the exit code to use.</p>
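<p>Here’s a rough sketch of the file descriptor handoff, written in Python on both ends to keep
it short (the real client is C). It assumes Python 3.9 or newer for <code>socket.send_fds</code> and
<code>socket.recv_fds</code>, which wrap <code>SCM_RIGHTS</code>. The function names are made up for illustration
and aren’t from the site generator.</p>
<pre class="code python full-width literal-block"><code>import os
import socket


def send_stdio(conn, fds=(0, 1, 2)):
    # Client side: hand our descriptors to the peer over a unix socket.
    # At least one byte of ordinary data has to travel with the
    # SCM_RIGHTS ancillary message, so send a single placeholder byte.
    socket.send_fds(conn, [b"\0"], list(fds))


def recv_stdio(conn, count=3):
    # Server side: receive up to `count` descriptors from the peer.
    # The kernel gives us fresh duplicates of the sender's descriptors.
    _data, fds, _flags, _addr = socket.recv_fds(conn, 1, count)
    return fds


def adopt_stdio(fds):
    # In a worker process: make the received descriptors our own
    # stdin, stdout, and stderr so output lands where the client's would.
    for target, fd in enumerate(fds):
        os.dup2(fd, target)</code></pre>
<p>You’d probably only call something like <code>adopt_stdio</code> in a forked child, since it replaces the
calling process’s own standard streams.</p>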
<p>Previously, data was serialized over the socket as <a class="reference external" href="https://en.wikipedia.org/wiki/External_Data_Representation">XDR</a> because it’s compact, easy to implement
in C, and Python has an implementation in the standard library, <code>xdrlib</code>.</p>
<aside class="aside">
<p>Recently I saw <a class="reference external" href="https://man.archlinux.org/man/pidfd_open.2.en">pidfd_open</a> in Linux. It lets you get a file descriptor for a
process that you can poll on to check if it’s okay. Or you can use it with
<a class="reference external" href="https://man.archlinux.org/man/pidfd_getfd.2.en">pidfd_getfd</a> to duplicate a file descriptor from another process into your own.
It’s really overpowered compared to SCM_RIGHTS. Linux devs clearly don’t care about
balance and just want people to get the new DLC. And I would have used it, to keep
up with the meta, but couldn’t because Python doesn’t have <code>os.pidfd_getfd</code> yet.
:(</p>
</aside>
<p>I noticed it was possible to get the process id (pid) of a peer connected over a unix
socket with <code>getsockopt</code>’s <code>SO_PEERCRED</code> option. From this, we can read the program’s
command line arguments from the file at <code>/proc/[pid]/cmdline</code>.</p>
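<p>On Linux, that lookup might go something like this in Python. This is a sketch, not the site
generator’s code: <code>peer_pid</code> and <code>peer_cmdline</code> are hypothetical names, and it assumes the
kernel’s <code>struct ucred</code> layout of a pid, a uid, and a gid.</p>
<pre class="code python full-width literal-block"><code>import os
import socket
import struct

# struct ucred on Linux: pid_t pid; uid_t uid; gid_t gid;
UCRED = struct.Struct("iII")


def peer_pid(conn):
    # Ask the kernel who is on the other end of a connected unix socket.
    creds = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, UCRED.size)
    pid, _uid, _gid = UCRED.unpack(creds)
    return pid


def peer_cmdline(pid):
    # /proc/[pid]/cmdline holds the arguments separated by NUL bytes,
    # with a trailing NUL after the last argument.
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        raw = f.read()
    return [os.fsdecode(arg) for arg in raw.split(b"\0")[:-1]]</code></pre>
<p>The pid is the one recorded when the peer connected, so this only makes sense while the client is
still alive and waiting for its reply.</p>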
<p>This means we don’t need to send the arguments over the socket. So I managed to
throw out the whole XDR thing because the only value sent over the socket now (other
than the file descriptors) is the return code.</p>
<p>So that’s why the C program is more smol.</p>
</section><section id="null-publish">
<h2>null_publish()<a class="self-link" title="link to this section" href="#null-publish"></a></h2>
<p>The documents for my site are written with reStructuredText markup. <code>docutils</code> is the
package that the site generator uses to parse that markup and render HTML. It’s
relatively slow.</p>
<p>To go relatively fast, the site generator will fork and do this step in a child process so that the
main process can continue handling requests for more build steps from ninja.</p>
<p>One drawback of this is that caching done by child processes cannot benefit subsequent
build steps because the child just quits when it’s done. This is bad particularly in
Python, where module imports can happen whenever, or with pygments, where syntax
highlighting definitions are loaded on demand.</p>
<p>Aside from the import, a lot of the caching that benefits docutils is done by
the regular expression package in Python’s standard library. Regular expression
patterns are compiled when they are first used. Subsequent uses are much faster
because the regular expression package will use the compiled pattern from the first
time.</p>
<p>Previously, I had read through the docutils source code to find some regular expression
pattern strings that seemed easy to reference. Before forking, I’d import docutils and
pass those strings to <code>re.compile()</code> in order to cache them for when they’d be used
later after a fork.</p>
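<p>Here is a rough sketch of why warming up before the fork helps, using only the standard library. The function name is hypothetical. It leans on CPython’s behaviour of handing back the same cached pattern object from <code>re.compile()</code>, which is an implementation detail rather than a documented promise.</p>

```python
import os
import re


def cache_survives_fork(pattern):
    """Compile `pattern` in the parent, fork, and report whether the child hit the cache."""
    warmed = re.compile(pattern)  # fills the re module's cache in the parent
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: its memory is a copy of the parent's, cache included.
        os.close(read_fd)
        hit = re.compile(pattern) is warmed
        os.write(write_fd, b"1" if hit else b"0")
        os._exit(0)
    # Parent: read the child's one-byte answer and reap it.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        answer = reader.read()
    os.waitpid(pid, 0)
    return answer == b"1"
```

<p>Anything compiled only inside the child is thrown away when the child exits, which is the drawback described above.</p>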
<p>It occurred to me to ask docutils to parse and render an empty document instead.
It was less hacky and a bit more effective than what I was doing before.</p>
<p>Rendering an empty document takes about <em>15 milliseconds</em> the first time and just <em>one
and a half milliseconds</em> subsequently.
Importing and configuring docutils and publishing the empty document takes about <em>70
milliseconds</em> the first time and like <em>two milliseconds</em> subsequently.</p>
<p>And those tens of milliseconds add up to several tens of milliseconds over the course
of the entire build. Since I do a full rebuild before publishing the site, and given
the rate I publish new posts – twenty posts in about three years – it adds up to about
<em>one second</em> saved <em>every three years</em>.</p>
<p>Which <em>sounds</em> not great for the hours spent on this. But, keep in mind that the time
you spend on meaningless optimisation is time <em>not</em> spent being self-aware and vividly
alert to your inadequacies and personal faults and the mistakes you’ve made in your
past and present relationships and decisions and how it’s all led to the circumstances
that will haunt you through your mortal life in this hell we call reality.</p>
</section><section id="syntax-highlighting-with-syntect">
<h2>syntax highlighting with syntect<a class="self-link" title="link to this section" href="#syntax-highlighting-with-syntect"></a></h2>
<p>The forking approach still makes syntax highlighting slower overall.</p>
<p>Normally, syntax definitions are just loaded once and saved for when the same syntax is
highlighted later.</p>
<p>For example, highlighting a Python code block requires loading some information about
how to highlight Python in particular. But we only pay that cost the first time we
highlight Python code because we don’t have to load it again next time we see some
Python code.</p>
<p>Except in this forking model. After a child process loads the syntax highlighting
definitions and uses them in a single document, it exits.</p>
<p>I looked into using a Rust library called <a class="reference external" href="https://github.com/trishume/syntect/">syntect</a> to do syntax highlighting. It’s
pretty interesting as it stores a lot of syntax highlighting definitions in a
compact binary format. It takes my laptop a bit under 25 milliseconds to
load them. Preloading all the syntax definitions here is certainly more
feasible than doing so in pygments.</p>
<p>I made <a class="reference external" href="https://git.sr.ht/~sqwishy/python-syntect-meme">some Python bindings</a> for them real quick to try it out, and <a class="reference external" href="https://git.sr.ht/~sqwishy/python-syntect-meme/tree/1c91536b/item/examples/furret_code_block.py">a docutils
directive</a> to use syntect for highlighting instead of pygments. It performed better but
the built in syntax definitions were disappointing.</p>
<p>Specifically, I was missing definitions for Elm, GraphQL, INI, nginx, and TOML. Also,
it had definitions for Clojure but they didn’t work so I had to provide those too. This
was not quite what I expected.</p>
<p>Also, the output it generates seems a little more verbose than what pygments will
produce. The class names of the highlighting spans in the HTML it outputs are longer
and it seems there’s a lot of duplication. It’s not a big deal but it’s a thing.</p>
<p>To some extent, I didn’t like adding a dependency that required Rust. Mostly, I was put
off by having to provide my own syntax highlighting definitions. It’s kind of a
trade-off between the convenience of pygments just working and a weird benchmarking kink
I enjoy for my own amusement.</p>
<p>But syntect is quite popular and used in other projects like <a class="reference external" href="https://github.com/sharkdp/bat">bat</a> which I know supports
a bunch of the languages that I mentioned were missing in syntect. So doing what they
do might be worth looking into.</p>
</section><p>I rewrote my static site generator again. But only a little bit.</p>
2022-02-24T12:00:00-08:00https://froghat.ca/2021/02/2020-review2020 Review2021-02-27T12:00:00-08:00sqwishy<p>Twenty-twenty was an average year.</p>
<p>Here’s a chart comparing it to other years.</p>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="years-dm.svg">
<img alt="A bar graph of years 2017 through 2023 on both x-axis and y-axis." src="years.svg" />
</picture>
<p>You can see it’s right in the middle.</p>
<aside class="aside">
<p>Here’s the table listing the data in the figure above.</p>
<table>
<thead>
<tr><th class="head"><p>Year</p></th>
<th class="head"><p>Year</p></th>
</tr>
</thead>
<tbody>
<tr><td><p>2017</p></td>
<td><p>2017</p></td>
</tr>
<tr><td><p>2018</p></td>
<td><p>2018</p></td>
</tr>
<tr><td><p>2019</p></td>
<td><p>2019</p></td>
</tr>
<tr><td><p>2020</p></td>
<td><p>2020</p></td>
</tr>
<tr><td><p>2021</p></td>
<td><p>2021</p></td>
</tr>
<tr><td><p>2022</p></td>
<td><p>2022</p></td>
</tr>
<tr><td><p>2023</p></td>
<td><p>2023</p></td>
</tr>
</tbody>
</table>
<p>Don’t believe me? Boop this into your calculator at home and find out for yourself!</p>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="923cd7680a54b120f76b4bfa9a58073221ec9340-dm.svg">
<img alt="idk some weird math expression i copied from the wikipedia on a lorenz attractor" class="maths" src="923cd7680a54b120f76b4bfa9a58073221ec9340.svg" />
</picture>
</aside>
<section id="video-game">
<h2>Video Game<a class="self-link" title="link to this section" href="#video-game"></a></h2>
<p>The only notable thing was the release of an anticipated video game named <cite>Cyberpunk 2077</cite> which received much praise for being the most linear open world game ever produced.</p>
</section><section id="this-website">
<h2>This Website<a class="self-link" title="link to this section" href="#this-website"></a></h2>
<p>I rewrote my website generator. Still Python, but I no longer use Pelican, so it’s not available as a theme. But the source, released under public domain, is at <a class="reference external" href="https://git.sr.ht/~sqwishy/froghat.ca">git.sr.ht/~sqwishy/froghat.ca</a>. You can copy it and make something cooler if you want. (Like a Pelican theme.)</p>
</section><section id="internet">
<h2>Internet<a class="self-link" title="link to this section" href="#internet"></a></h2>
<p>There’s a new website out there on the internet called <cite>Twitter</cite>. Here’s a picture of the front page; maybe you’ve seen it?</p>
<img alt="A heading reads &quot;Happening now&quot;, &quot;Join Twitter today.&quot; Sign up and Log in buttons follow. Further down is a silhouette of the Twitter logo bird against a grungy, distorted illustration of claustrophobic repeating text, in all caps, &quot;what's happening&quot;." src="hApPeNiNG-nOw-OmG.png" />
<p>Wow, it sure looks important.</p>
<p>I looked inside to find a happening and saw this banner at the bottom of a page.</p>
<img alt="Log in and sign up buttons adjacent to &quot;Don't miss what's happening&quot; &quot;People on twitter are the first to know.&quot;" src="dont-miss-out-on-all-the-happenings-oh-no.png" />
<p>I couldn’t figure out what was happening, so I didn’t stay long. But it didn’t seem like this Twitter thing was encouraging people to have a healthy relationship with social media, so I don’t expect it to catch on.</p>
</section><p>Disclaimer: I received this product for free for the purpose of this review.</p>
2021-02-27T12:00:00-08:00https://froghat.ca/2021/09/reverse-reverse-nginxUsing nginx to reverse proxy the internet2021-09-22T12:00:00-07:00sqwishy<p><a class="reference external" href="https://nginx.org/">nginx</a> is a <a class="reference external" href="https://en.wikipedia.org/wiki/Reverse_proxy">reverse proxy</a>; often used to proxy incoming connections or requests at the <em>edge</em>
of your network on their way from the internet to some service in your internal
network. One reason to do this is to <em>terminate</em> TLS at your edge with nginx
instead of at each service in your network.</p>
<p>However, we can also use the ngx_http_proxy_module to <em>initiate</em> TLS at nginx on its way
from our internal network to the internet.</p>
<section id="why">
<h2>Why?<a class="self-link" title="link to this section" href="#why"></a></h2>
<p>Because we want to open a WebSocket and keep it open between program restarts.</p>
<aside class="aside">
<p>This is a weird thing and it might not be a very good idea.</p>
</aside>
<p>Imagine we have a service that initiates a long-lived connection and it is not great
when the socket is closed, for a couple of reasons:</p>
<ul class="simple">
<li><p>The socket allows our service to listen to events broadcast by the peer and events
are missed when the connection is closed until it is reestablished.</p></li>
<li><p>Establishing a new connection requires starting a TLS session that takes several hundred
milliseconds to complete. (Events are missed during this.)</p></li>
</ul>
<p>Our proper goal is to <strong>not lose events</strong>. That is, never have no connections open.</p>
<p>We could approach this in other ways, like with redundancy or by never restarting, but
that’s not what I did. So, in this scenario, we <em>enjoy</em> restarting things to update them
and we <em>don’t</em> want redundant machines or networks because <em>computers are bad</em>.</p>
</section><section id="how">
<h2>How?<a class="self-link" title="link to this section" href="#how"></a></h2>
<p>In a <a class="reference external" href="/2019/05/scm-rights">previous post about SCM_RIGHTS</a> I
illustrated duplicating a file descriptor to another process so that a connection can
outlive the process that created it.</p>
<p>Similarly, systemd’s <a class="reference external" href="https://www.freedesktop.org/software/systemd/man/sd_pid_notify_with_fds.html">sd_pid_notify_with_fds</a> & <a class="reference external" href="https://www.freedesktop.org/software/systemd/man/sd_listen_fds.html">sd_listen_fds</a> functions can be used by a
systemd service (including a program started with <a class="reference external" href="https://www.freedesktop.org/software/systemd/man/systemd-run.html">systemd-run</a>) to keep a file descriptor open
across a service restart.</p>
<p>When our service establishes this long-lived connection, we share the socket with
systemd so that we can get it back later after we restart.</p>
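<p>For a program that doesn’t link against libsystemd, the same handshake can be sketched in Python: a datagram to <code>$NOTIFY_SOCKET</code> carrying <code>FDSTORE=1</code> with the descriptor attached, and the <code>LISTEN_*</code> environment variables on the way back up. The function names and the <code>ws</code> label below are hypothetical, and this assumes the unit sets <code>FileDescriptorStoreMax=</code> to something above zero and Python 3.9 or newer for <code>socket.send_fds</code>.</p>

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # systemd hands back descriptors starting at fd 3


def store_fd(fd, name, environ=os.environ):
    """Ask systemd to hold on to `fd` under `name`; returns False outside systemd."""
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        path = "\0" + path[1:]  # abstract socket namespace
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as notify:
        notify.connect(path)
        message = f"FDSTORE=1\nFDNAME={name}".encode()
        socket.send_fds(notify, [message], [fd])  # fd rides along as SCM_RIGHTS
    return True


def stored_fds(environ=os.environ):
    """List (name, fd) pairs that systemd passed to this process at startup."""
    if environ.get("LISTEN_PID") != str(os.getpid()):
        return []  # the variables were meant for some other process
    count = int(environ.get("LISTEN_FDS", "0"))
    names = environ.get("LISTEN_FDNAMES", "").split(":")
    result = []
    for index in range(count):
        known = index < len(names) and names[index]
        name = names[index] if known else f"unnamed-{index}"
        result.append((name, SD_LISTEN_FDS_START + index))
    return result
```

<p>After a restart, something like <code>socket.socket(fileno=fd)</code> wraps the returned descriptor back into a socket object.</p>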
<p>For this to work reasonably, the program needs enough information about the open socket
to use it correctly when starting back up.</p>
<p>As it happens, after a clean exit, there is no state in our application worth retaining
having to do with either the service or the WebSocket protocol itself. The only state
that complicates things is the TLS connection that our WebSocket runs over. To solve
that, we avoid initiating the TLS connection in our process and, instead, delegate that
to nginx.</p>
</section><section id="proxy-pass-https">
<h2>proxy_pass https://…<a class="self-link" title="link to this section" href="#proxy-pass-https"></a></h2>
<p>Normally, our program opens a TLS connection to <code>example.com</code>. Instead, we want a
cleartext connection to nginx and a TLS connection from nginx to <code>example.com</code> through
which the request to open a WebSocket is proxied.</p>
<p>I run nginx on the same machine as my service, so I can connect to it at the hostname
<code>localhost</code>. But we’ll also include the proxied hostname as a subdomain so that nginx
knows where to proxy our request to – so <code>example.com.localhost</code> for instance.</p>
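<p>On the client side, that naming scheme amounts to a small URL rewrite. Here is a sketch with a made-up helper name, assuming the upstream is on the default HTTPS port, since the hostname trick leaves no room for a port number.</p>

```python
from urllib.parse import urlsplit, urlunsplit

# TLS schemes and the cleartext scheme we use to talk to the local nginx.
CLEARTEXT = {"https": "http", "wss": "ws"}


def via_local_nginx(url, suffix="localhost"):
    """Rewrite a TLS URL so the request goes to nginx on this machine instead."""
    parts = urlsplit(url)
    if parts.scheme not in CLEARTEXT:
        raise ValueError(f"expected an https or wss URL, got {url!r}")
    if parts.port is not None:
        raise ValueError("explicit ports cannot be encoded in the hostname")
    host = f"{parts.hostname}.{suffix}"
    return urlunsplit((CLEARTEXT[parts.scheme], host, parts.path, parts.query, parts.fragment))
```

<p>So <code>wss://example.com/socket</code> becomes <code>ws://example.com.localhost/socket</code>, and the program never touches TLS itself.</p>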
<p>The nginx configuration looks like:</p>
<pre class="code nginx literal-block"><code><span class="k">server</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="kn">listen</span><span class="w"> </span><span class="n">127.0.0.1</span><span class="p">:</span><span class="mi">80</span><span class="p">;</span><span class="w">
</span><span class="kn">listen</span><span class="w"> </span><span class="s">[::1]:80</span><span class="p">;</span><span class="w">
</span><span class="kn">server_name</span><span class="w"> </span><span class="p">~</span><span class="sr">^(?<meme>.+)\.localhost$;</span><span class="w">
</span><span class="s">resolver</span><span class="w"> </span><span class="mi">127</span><span class="s">.0.0.53</span><span class="p">;</span><span class="w">
</span><span class="kn">location</span><span class="w"> </span><span class="s">/</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="kn">proxy_pass</span><span class="w"> </span><span class="s">https://</span><span class="nv">$meme</span><span class="p">;</span><span class="w">
</span><span class="kn">proxy_ssl_server_name</span><span class="w"> </span><span class="no">on</span><span class="p">;</span><span class="w">
</span><span class="kn">proxy_ssl_protocols</span><span class="w"> </span><span class="s">TLSv1</span><span class="w"> </span><span class="s">TLSv1.1</span><span class="w"> </span><span class="s">TLSv1.2</span><span class="w"> </span><span class="s">TLSv1.3</span><span class="p">;</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span></code></pre>
<p>That’s mostly it for normal HTTP requests – it should let you fetch
<a class="reference external" href="http://www.wikipedia.org.localhost">http://www.wikipedia.org.localhost</a> – but it won’t upgrade WebSockets yet. And there are a
few things to take note of:</p>
<ul class="simple">
<li><p>You may need or want to use something entirely different for your <a class="reference external" href="https://nginx.org/en/docs/http/ngx_http_core_module.html#resolver">resolver</a>.</p></li>
<li><p><a class="reference external" href="http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ssl_server_name">proxy_ssl_server_name</a> is important for connecting to servers that require the server
name to be included with TLS (<a class="reference external" href="https://en.wikipedia.org/wiki/Server_Name_Indication">SNI</a>). I think this is most CDNs & DDoS mitigation like
Cloudflare (a lot of the web…)</p></li>
<li><p><a class="reference external" href="http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ssl_protocols">proxy_ssl_protocols</a> is, at the time of writing, the default list of supported
protocols with the addition of TLSv1.3. Otherwise, nginx cannot connect to servers
that require TLSv1.3.</p></li>
</ul>
<!-- Continuing my habit of copying documentation from places and putting -->
<!-- it on my blog -->
<p>Finally, the <a class="reference external" href="http://nginx.org/en/docs/http/websocket.html">nginx documentation on WebSocket proxying</a> shows how we can extend the
configuration above to support upgrading connections to WebSockets.</p>
<pre class="code nginx literal-block"><code><span class="k">map</span><span class="w"> </span><span class="nv">$http_upgrade</span><span class="w"> </span><span class="nv">$connection_upgrade</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="kn">default</span><span class="w"> </span><span class="s">upgrade</span><span class="p">;</span><span class="w">
</span><span class="kn">''</span><span class="w"> </span><span class="s">close</span><span class="p">;</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="k">server</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="kn">listen</span><span class="w"> </span><span class="n">127.0.0.1</span><span class="p">:</span><span class="mi">80</span><span class="p">;</span><span class="w">
</span><span class="kn">listen</span><span class="w"> </span><span class="s">[::1]:80</span><span class="p">;</span><span class="w">
</span><span class="kn">server_name</span><span class="w"> </span><span class="p">~</span><span class="sr">^(?<meme>.+)\.localhost$;</span><span class="w">
</span><span class="s">resolver</span><span class="w"> </span><span class="mi">127</span><span class="s">.0.0.53</span><span class="p">;</span><span class="w">
</span><span class="kn">location</span><span class="w"> </span><span class="s">/</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="kn">proxy_pass</span><span class="w"> </span><span class="s">https://</span><span class="nv">$meme</span><span class="p">;</span><span class="w">
</span><span class="kn">proxy_set_header</span><span class="w"> </span><span class="s">Connection</span><span class="w"> </span><span class="nv">$connection_upgrade</span><span class="p">;</span><span class="w">
</span><span class="kn">proxy_set_header</span><span class="w"> </span><span class="s">Upgrade</span><span class="w"> </span><span class="nv">$http_upgrade</span><span class="p">;</span><span class="w">
</span><span class="kn">proxy_ssl_server_name</span><span class="w"> </span><span class="no">on</span><span class="p">;</span><span class="w">
</span><span class="kn">proxy_ssl_protocols</span><span class="w"> </span><span class="s">TLSv1</span><span class="w"> </span><span class="s">TLSv1.1</span><span class="w"> </span><span class="s">TLSv1.2</span><span class="w"> </span><span class="s">TLSv1.3</span><span class="p">;</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span></code></pre>
<p>Be aware of these caveats:</p>
<ul>
<li><p>This isn’t a full web proxy that will rewrite hyperlinks and resource URLs in HTML documents.</p></li>
<li><p>Restarting nginx will bring down the connection.</p></li>
<li><p>It is possible to write IPv4 addresses into these domain names. Like
<code>127.0.0.1.localhost</code>.</p></li>
<li><p>The <a class="reference external" href="https://www.nginx.com/resources/wiki/start/topics/examples/x-accel/">X-Accel- headers</a> can be included in responses by upstream servers to get nginx
to behave in interesting ways.</p>
<p>For example, <code>X-Accel-Redirect</code> performs an internal redirect prompting nginx to
process a new URI and even match locations marked <a class="reference external" href="http://nginx.org/en/docs/http/ngx_http_core_module.html#internal">internal</a> in nginx configurations.
As far as I can tell, redirects in this way are evaluated within the <em>same</em> server
block; so the above configuration is relatively benign.</p>
<p>Nevertheless, it’s <em>probably</em> a good idea to use <a class="reference external" href="http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ignore_headers">proxy_ignore_headers</a> to disable this
behaviour.</p>
<pre class="code nginx literal-block"><code><span class="k">proxy_ignore_headers</span><span class="w"> </span><span class="s">"X-Accel-Redirect"</span><span class="w">
</span><span class="s">"X-Accel-Expires"</span><span class="w">
</span><span class="s">"X-Accel-Limit-Rate"</span><span class="w">
</span><span class="s">"X-Accel-Buffering"</span><span class="w">
</span><span class="s">"X-Accel-Charset"</span><span class="p">;</span></code></pre>
</li>
</ul>
<p>So that’s it.</p>
<p>Now you know how to configure nginx, a reverse proxy <em>typically</em> used to <em>terminate</em> TLS
for incoming connections from the <span class="hl-purple">internet</span> to your <span class="hl-yellow">internal
network</span>, as a reverse reverse proxy to <em>de-terminate</em> TLS for connections from our
<span class="hl-yellow">internal network</span> to the <span class="hl-purple">internet</span>.</p>
<p>And by offloading TLS to nginx and sharing the socket between processes with systemd or
SCM_RIGHTS, a program can exit, restart, and resume without dropping the WebSocket.</p>
<p>Remember, if you want to see more blog posts like this one – that are made by copying
documentation from places and talking about it – be sure to subscribe to my follows on
OnlyFans at http://gofundme.subscribestar.librapay.com.localhost/patreon</p>
</section><p>Using nginx as a reverse reverse proxy to de-terminate TLS for connections
from our internal network to the internet.</p>
2021-09-22T12:00:00-07:00https://froghat.ca/2019/11/contentContent2019-11-22T12:00:00-08:00sqwishy<p>I started writing some posts a while ago but never finished them. That’s why my blag has been idle.
I am taking a break to write about one of the things keeping me up at night lately.<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>Or, skip this diatribe about my motivations and jump to either the <a class="reference internal" href="#tldr">tldr</a>, where I mention
some cool things others have done, or my <a class="reference internal" href="#shower-thoughts">shower thoughts</a> at the end.</p>
<p>This first section is basically: REST doesn’t specify how to filter and shape our
data. So every REST API does its own thing and a lot of the time it’s horrible.</p>
</aside>
</aside>
</aside>
<p>I was writing about <a class="reference external" href="https://graphql.org/">GraphQL</a>. I produced a small number of petty criticisms
when I realized that it’s actually a really okay serialization format for a very
specific kind of structure: a node (or a field) with a <em>name</em>, an optional <em>alias</em>,
optional <em>arguments</em>, and optional <em>children</em> node/fields.<a class="footnote-reference superscript" href="#footnote-2" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a>
If we were to encode this sort of structure in a format like JSON, you might end up with
something either verbose or awkward looking like <a class="reference external" href="https://jmap.io/spec-core.html#example-request">this JMAP example</a>.
Either we encode the node as a list/tuple, which is not very
human-readable, or some sort of mapping/associative array, where recurring keys are a
bit wasteful and either way is more verbose than what GraphQL accomplishes.</p>
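<p>To make the comparison concrete, here is a small sketch of that node structure rendered both ways. The <code>Node</code> class and <code>to_graphql</code> function are invented for illustration, and argument values are limited to scalars that happen to look the same in JSON and GraphQL.</p>

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    alias: Optional[str] = None
    arguments: dict = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def to_graphql(node):
    """Render a node and its children as a single-line GraphQL-style field."""
    text = f"{node.alias}: {node.name}" if node.alias else node.name
    if node.arguments:
        # json.dumps is only used to quote scalar values like strings and numbers
        args = ", ".join(f"{key}: {json.dumps(value)}" for key, value in node.arguments.items())
        text += f"({args})"
    if node.children:
        text += " { " + " ".join(to_graphql(child) for child in node.children) + " }"
    return text


query = Node("human", arguments={"id": "1000"}, children=[Node("name"), Node("height")])
as_graphql = to_graphql(query)
as_json = json.dumps(asdict(query))  # the mapping flavour, keys repeated per node
```

<p>The JSON form spells out <code>name</code>, <code>alias</code>, <code>arguments</code>, and <code>children</code> for every node, which is the wasteful repetition mentioned above.</p>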
<aside class="aside">
<aside class="footnote superscript" id="footnote-2" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-2">2</a><span class="fn-bracket">]</span></span>
<p>I’m simplifying a lot here and I also haven’t used GraphQL in a long time so my
impressions are probably dated.
You should probably be careful about how seriously you take what I write here.</p>
</aside>
</aside>
</aside>
<p>GraphQL is pretty generic and flexible and we can use it to define a variety of ways to operate over data.
This is in contrast to something like <a class="reference external" href="https://en.wikipedia.org/wiki/Representational_state_transfer">REST</a> which is really more suited to <a class="reference external" href="https://en.wikipedia.org/wiki/Create,_read,_update_and_delete">CRUD</a> operations.
Specifically, so I’m not just throwing buzzwords around, I think a good REST API is
about having four verbs that you can use to operate over different structures in similar
and predictable ways,
while a GraphQL API can accommodate models that are special snowflakes that each have
different verbs and little to nothing in common with each other.</p>
<p>Maybe I’m a silly person who smoked too much <a class="reference external" href="https://en.wikipedia.org/wiki/Duck_typing">duck typing</a>,
but I get the feeling that GraphQL is about managing complexity and REST is
about exposing something that resembles a connection to a database.
Sometimes your domain models are complex.
But, and I could be wrong here, it seems like most of the web is an HTML form in front of a database.
Most of the web is not APIs for
threading, creating parsers, leftpad, or drawing graphics.
It’s reddit and dumb blags for people to complain about web technologies that
they don’t fully understand.
It’s mostly about moving content around or getting presentations of specific content.</p>
<p>In principle, REST should be great if all I want to do is some super simple CRUD stuff,
right? Endpoints like <code>/collection</code> and <code>/collection/item</code>
map closely to the collections/tables/objects and documents/rows/instances I
have.</p>
<p>But that alone isn’t very powerful. If we want to make a REST API for some content,
we probably want to do things like filtering and pagination.
Let’s think about how we would make one and
add features until it looks all fucked up.</p>
<p>We allow the client to filter for items by testing equality on some field.
Let’s use the query string for this; <code>/posts?title=Bananas</code>. Great.
And now we need to filter on another field. But, it’s not for equality, it’s a full text
search; <code>/posts?content=yellow+fruit</code>.
It looks simple. Simplicity is good, right?</p>
<p>This is a great design for a few reasons.
First, things get awkward if the API needs to support some other expression on one of
these fields, like a case-insensitive substring match on the title or something.
Second, since our field comparisons are implicit, it’s not obvious to someone who isn’t
very familiar with this particular endpoint what comparisons are used by the
filter. Fortunately, we can document our bespoke endpoint filters to solve that.</p>
<p>Not least of all, you can trick a clever user who is trying to make an inference about one
endpoint’s filter behaviour based on the behaviour of another endpoint.
Like if they know how <code>/posts?content=your+blag+sucks</code> works, they might assume that
<code>/comments?content=your+blag+still+sucks</code> does the same thing.
But it turns out this was a trap all along.
Even though those filters were intended to be the same,
they used to be substring matches
before someone realized that a full text search
was way better and changed it.
But they only changed one endpoint, forgetting the filter for the other.
Now the filters between both endpoints are different,
but not different enough that anybody will even really notice
for a couple years, if ever. They might just have a feeling that searching one resource doesn’t
quite work as well as the other for some reason. It’s perfect.</p>
<p>The next thing we can do is nest resources and endpoints.
Endpoints like <code>/thread/<id>/comments</code> are just filtered queries on <code>/comments</code>,
something like <code>/comments?thread__eq=<id></code>. You don’t even need them but you can trick
everyone into thinking you do by not having a <code>/comments</code> endpoint at all.
And users can’t even query for comments without a filter.</p>
<p>Later, you might add <code>/user/<id>/comments</code> that returns comments authored by a particular user.
But since endpoints aren’t composable (we can’t logically conjunct endpoints together)
there isn’t an intuitive way to filter comments on <em>both</em> their author and the post they
belong to.
To solve this, let’s allow query string filters like everywhere else. But make sure that
<code>/thread/<id>/comments</code> and <code>/user/<id>/comments</code> accept different query parameters and
different filters. Either on purpose or by accident because they use different
implementations and there was a copy-paste mistake.
But we can solve this by getting someone to document the inconsistencies.
It turns out you can cut corners and not bother with creating a self-describing
interface if you can just refer your users to literature every time they can’t figure
out what a button does.<a class="footnote-reference superscript" href="#footnote-3" id="footnote-reference-3" role="doc-noteref"><span class="fn-bracket">[</span>3<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-3" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-3">3</a><span class="fn-bracket">]</span></span>
<p>And while you’re adding documentation for this, make sure that the documentation we
wrote for that other thing earlier isn’t out of date yet.</p>
</aside>
</aside>
</aside>
<p>Now we have a REST API made out of bespoke snowflake endpoints that infuriates the
living fuck out of everyone that uses it or supports it.
The only thing that really happened is that we wanted to parameterize filtering and
we came up with something that was not expressive enough to work generally across each
of our collections. So instead of dealing with that at the door, we made it the concern
of every endpoint.</p>
<section id="tldr">
<h2>tldr<a class="self-link" title="link to this section" href="#tldr"></a></h2>
<p>The whole point of the above wall of text is only to say that expressing how to fetch
data is kind of hard and there are a lot of bad solutions.<a class="footnote-reference superscript" href="#footnote-4" id="footnote-reference-4" role="doc-noteref"><span class="fn-bracket">[</span>4<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-4" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-4">4</a><span class="fn-bracket">]</span></span>
<p>SQL is amazing but it’s quite complicated. I’m pretty sure I learn
something new every time I look at SQL database engine user documentation.</p>
</aside>
</aside>
</aside>
<p>Most of the GraphQL APIs I’ve seen do a similar thing as the
above. But, it seems like you can get away with a lot more because
GraphQL APIs have schemas. Presumably, mistakes in clients can be caught early
by programming against an explicit and typed interface. The consequences of having a
complex interface are minimized by tooling.</p>
<p>Instead of encoding our filters into a URL query string,
we can encode it as part of a GraphQL query and pass it either in the URL of a GET request
or as data in a POST request if our API does away with GET requests entirely.</p>
<p>Here’s an example from the GraphQL project website:<a class="footnote-reference superscript" href="#footnote-5" id="footnote-reference-5" role="doc-noteref"><span class="fn-bracket">[</span>5<span class="fn-bracket">]</span></a></p>
<pre class="code graphql literal-block"><code><span class="p">{</span>
<span class="nc">human</span><span class="p">(</span>id: <span class="s2">"1000"</span><span class="p">)</span> <span class="p">{</span>
<span class="nc">name</span>
<span class="nc">height</span>
<span class="p">}</span>
<span class="p">}</span></code></pre>
<p>… which looks up and returns a human with that id.</p>
<aside class="footnote-list superscript">
<aside class="footnote superscript" id="footnote-5" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-5">5</a><span class="fn-bracket">]</span></span>
<p><a class="reference external" href="https://graphql.org/learn/queries/">https://graphql.org/learn/queries/</a></p>
</aside>
</aside>
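<p>To make the GET-or-POST thing concrete, here’s a minimal sketch that builds both
requests for that query without sending them. The endpoint URL is made up, and the
<code>query</code> parameter name follows the common GraphQL-over-HTTP convention; a given
server might expect something else.</p>

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute a real GraphQL server.
ENDPOINT = "https://api.example.com/graphql"

QUERY = '{ human(id: "1000") { name height } }'


def as_get(query):
    """Carry the query text in the URL's query string."""
    return Request(ENDPOINT + "?" + urlencode({"query": query}), method="GET")


def as_post(query):
    """Carry the query text in a JSON request body."""
    body = json.dumps({"query": query}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(ENDPOINT, data=body, headers=headers, method="POST")


# Nothing is sent; these are only the request objects.
get_request = as_get(QUERY)
post_request = as_post(QUERY)
```

<p>Either way, the server ends up with the same query text. Only the envelope changes.</p>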
<p>Some GraphQL APIs are creative with parameter names and include operators in them, like
<code>human(name__contains: ...)</code> or something like that.
And some REST APIs let you do this with query string filtering too, like <code>/human?name__contains=...</code>.
It’s basically the same thing.
(Except GraphQL is typed, while I think x-www-form-urlencoded
encodes the boolean <code>true</code> and the string <code>"true"</code> the same way which can be a pain.)</p>
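<p>A small sketch of that pain, using only Python’s standard library. The
<code>to_form</code> helper is hypothetical; it lowercases booleans the way a typical
JavaScript client would before form-encoding them.</p>

```python
import json
from urllib.parse import parse_qs, urlencode

typed = {"active": True, "label": "true"}


def to_form(params):
    """Stringify values roughly the way a form or JavaScript client would."""
    def text(value):
        if isinstance(value, bool):
            return "true" if value else "false"
        return str(value)
    return urlencode({key: text(value) for key, value in params.items()})


# JSON keeps the boolean and the string distinct across a round trip.
from_json = json.loads(json.dumps(typed))

# Form encoding has only strings, so both values come back identical.
from_form = parse_qs(to_form(typed))
```

<p>After the round trip through the form encoding, the server has to guess (or consult
a schema) to know which of the two values was meant to be a boolean.</p>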
<p>It’s worth mentioning that
GraphQL makes it natural for clients to specify the shape of the
data in the selection set; what fields we want the API to return and any related records
as well.</p>
<p><a class="reference external" href="https://github.com/PostgREST/postgrest">PostgREST</a> is a really fun and interesting program. Their <a class="reference external" href="http://postgrest.org/en/stable/api.html#resource-embedding">syntax for returning
related records</a> was apparently inspired by GraphQL. It lets us write something like
<code>/users?select=*,comments(*)&comments.thread=123</code> which would return a set of users and
their authored comments for a specific thread.
PostgREST was also designed with the awareness that nested endpoints are filters.
When you think about it, even <code>/table/<id></code> is just a filter.
So write <code>/table?id=eq.123</code> instead. This has the advantage of removing the question of
how to build an item endpoint for a model with a composite primary key.</p>
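<p>Here’s a rough sketch of the “an item endpoint is just a filter” idea. It borrows the
<code>column=op.value</code> look and a handful of operator names from PostgREST, but it is
not how PostgREST is implemented, the <code>comments</code> rows are invented, and it
assumes every filter value is an integer.</p>

```python
import operator
from urllib.parse import parse_qsl, urlencode

# A small subset of operator names in the style of "column=op.value".
OPERATORS = {
    "eq": operator.eq,
    "neq": operator.ne,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def parse_filters(query_string):
    """Turn a query string into a list of (column, compare, value)."""
    filters = []
    for column, expression in parse_qsl(query_string):
        name, _, raw = expression.partition(".")
        # Assumption: values are integers; a real API would use column types.
        filters.append((column, OPERATORS[name], int(raw)))
    return filters


def select(rows, query_string):
    """Keep the rows that satisfy every filter."""
    filters = parse_filters(query_string)
    return [
        row for row in rows
        if all(compare(row[column], value) for column, compare, value in filters)
    ]


# Hypothetical table whose primary key is (thread, position).
comments = [
    {"thread": 123, "position": 1, "body": "first"},
    {"thread": 123, "position": 2, "body": "second"},
    {"thread": 456, "position": 1, "body": "elsewhere"},
]

# The "item endpoint" for a composite key is just two filters.
item_query = urlencode({"thread": "eq.123", "position": "eq.2"})
one_item = select(comments, item_query)
```

<p>Nothing special is needed for the composite key; it falls out of filtering on two
columns instead of one.</p>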
<p>I also want to mention <a class="reference external" href="http://htsql.org">HTSQL</a>.
I haven’t heard anybody else talk about it and haven’t properly used it myself.
So I don’t really have any insight to offer.
But the design and literature seem super interesting.
It looks expressive and powerful,
but I’d also expect this to mean that
it’s more difficult to implement and a bit harder to learn.
Although, I think a goal of theirs was to create something easier to digest than SQL.</p>
<p>Admittedly, more expressiveness isn’t necessarily worth the complexity it may introduce.
Let’s suppose we want to find posts, but only ones made since their author’s last login. Using
HTSQL, we can write something like
<code>/posts{title,author{name}} ?published_at>author.last_login</code> which returns something like this:</p>
<pre class="code json literal-block"><code><span class="p">{</span><span class="w">
</span><span class="nt">"posts"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nt">"title"</span><span class="p">:</span><span class="w"> </span><span class="s2">"interesting post"</span><span class="p">,</span><span class="w">
</span><span class="nt">"author"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nt">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Spongebob"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nt">"title"</span><span class="p">:</span><span class="w"> </span><span class="s2">"why starfish are the best"</span><span class="p">,</span><span class="w">
</span><span class="nt">"author"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nt">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Patrick Star"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">}</span></code></pre>
<p>This is pretty cool. I’m not saying that this here exactly is the line,
but there probably is some point where queries/requests become so esoteric that
giving them first-class support might not be worth it.</p>
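<p>For comparison, this is roughly what that filter asks of a relational database,
sketched with Python’s <code>sqlite3</code> module. The table names, columns, and rows are
all made up for illustration, and this is not the SQL that HTSQL itself would generate.
With these particular rows, only one post was published after its author’s last login.</p>

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE author (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        last_login TEXT NOT NULL
    );
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        published_at TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES author (id)
    );
    INSERT INTO author VALUES
        (1, 'Spongebob', '2019-11-20T09:00:00'),
        (2, 'Patrick Star', '2019-11-22T15:00:00');
    INSERT INTO post VALUES
        (1, 'interesting post', '2019-11-22T11:56:49', 1),
        (2, 'why starfish are the best', '2019-11-22T13:12:32', 2);
    """
)

# ISO 8601 timestamps stored as text compare correctly as strings.
rows = db.execute(
    """
    SELECT post.title, author.name
    FROM post
    JOIN author ON author.id = post.author_id
    WHERE post.published_at > author.last_login
    ORDER BY post.id
    """
).fetchall()
```

<p>The comparison spans two tables, which is exactly the kind of thing that a flat
<code>column=value</code> filter syntax has no obvious way to say.</p>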
<p>PostgREST has a bit <a class="reference external" href="http://postgrest.org/en/stable/api.html#computed-columns">on computed columns</a> which can be used to define interesting
expressions on the server that are usable by clients. Which is better than nothing and
preserves the purity/simplicity of the filter and selection syntax.</p>
<p>PostgREST also has a whole thing for executing <a class="reference external" href="http://postgrest.org/en/stable/api.html#s-procs">stored procedures</a>, for when your
models have special verbs that don’t fit the CRUD model. So that’s a cool thing worth
mentioning. Because even if 90% of the APIs you’re trying to create on the web are just
for content that fits CRUD, the other 10% is important enough that, if you
ignore it, it’s likely to take as much or more time to solve than the first 90%.
And that sucks.</p>
<p>An important fact of PostgREST and HTSQL is that they operate using introspection.
Since the features are written for database concepts, any kind of
snowflakeyness must exist in the database that is being reflected.
And we don’t have any stupid excuses like we didn’t copy-paste the filter handling
code correctly between endpoints because they share the same code in the first place.<a class="footnote-reference superscript" href="#footnote-6" id="footnote-reference-6" role="doc-noteref"><span class="fn-bracket">[</span>6<span class="fn-bracket">]</span></a>
Business rules are naturally idiosyncratic and minimizing that so they don’t leak into
our programming interfaces is valuable.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-6" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-6">6</a><span class="fn-bracket">]</span></span>
<p>Reusing code to create patterns and simplify things for users is a whole thing that
one could spend a lot of time going into. But, I’m pretty sure everyone
is already aware of how this works.
People don’t execute on this knowledge in part because politics.</p>
<p>But also it’s hard to find the appropriate pattern.
Doing it wrong is often worse than not doing it at all.</p>
</aside>
</aside>
</aside>
<p>Here’s a picture of someone else’s cat to break up the text.</p>
<figure class="figure">
<a class="reference external image-reference" href="cat.jpg"><img alt="A kitty rolled up sleeping on someone's sofa." src="cat.jpg" /></a>
</figure>
</section><section id="shower-thoughts">
<h2>shower thoughts<a class="self-link" title="link to this section" href="#shower-thoughts"></a></h2>
<p>This is getting too long, but I wanted to include some garbage I’ve been thinking about
off and on for quite some time.</p>
<p>Interacting with content follows similar patterns.
You usually want to combine or compose the following:</p>
<ul class="simple">
<li><p>Fetch content. Which might involve filtering, ordering, limits, or offsets.</p></li>
<li><p>Mutate something, either by modifying fetched content or by creating new content.</p></li>
<li><p>Present fetched content with some shape, maybe using some kind of pagination.</p></li>
</ul>
<p>Performing this interaction using GraphQL or URL query strings is about encoding this
information into a fairly human readable and writable string.</p>
<p>At the risk of getting much deeper into this,
I’ll mention <a class="reference external" href="https://docs.mongodb.com/manual/tutorial/query-documents/#specify-and-as-well-as-or-conditions">MongoDB</a> real quick.
I find it interesting because you can write an expression as a JavaScript object
using some special notation for operators. In the request, the structure is serialized to BSON.
Being a binary format, BSON is easier for machines to parse than text formats<a class="footnote-reference superscript" href="#footnote-7" id="footnote-reference-7" role="doc-noteref"><span class="fn-bracket">[</span>7<span class="fn-bracket">]</span></a>
and it supports quite a few more types than JSON so applications don’t have to
guess at if something is a date or a string.</p>
<pre class="code javascript literal-block"><code><span class="p">{</span><span class="w"> </span><span class="nx">$or</span><span class="o">:</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nx">qty</span><span class="o">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nx">$lt</span><span class="o">:</span><span class="w"> </span><span class="mf">30</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="p">},</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nx">item</span><span class="o">:</span><span class="w"> </span><span class="sr">/^p/</span><span class="w"> </span><span class="p">}</span><span class="w"> </span><span class="p">]</span><span class="w"> </span><span class="p">}</span></code></pre>
<aside class="aside">
<aside class="footnote superscript" id="footnote-7" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-7">7</a><span class="fn-bracket">]</span></span>
<p>On the other hand,
suggesting that a binary query format will improve performance by reducing time spent parsing is probably naive.
Your program probably isn’t spending enough time at the parsing step for a reduction
in time to matter when handling a request.
Also, you can cache query strings to their internal representations.
I’m pretty sure most GraphQL servers and database engines already do this.</p>
</aside>
</aside>
</aside>
<p>Structured queries like this are interesting because the format is more to do with how
to arrange native types, rather than arranging a string.
(Assuming you can send your structure over the wire by using a suitable serialization
format.)</p>
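<p>To show what “arranging native types” buys you, here’s a toy evaluator for a query
shaped like the MongoDB one above, written against plain Python dicts. The operator
names mirror that example, but this covers a tiny subset and has nothing to do with how
MongoDB actually evaluates queries. The inventory rows are invented.</p>

```python
import operator
import re

# Comparison operators this sketch understands.
COMPARISONS = {"$lt": operator.lt, "$gt": operator.gt}


def field_matches(value, condition):
    """Compare one field against a literal, a pattern, or an operator dict."""
    if isinstance(condition, re.Pattern):
        return isinstance(value, str) and condition.search(value) is not None
    if isinstance(condition, dict):
        if value is None:
            return False
        return all(
            COMPARISONS[name](value, limit) for name, limit in condition.items()
        )
    return value == condition


def matches(document, query):
    """Return True if the document satisfies the whole query structure."""
    for key, condition in query.items():
        if key == "$or":
            if not any(matches(document, branch) for branch in condition):
                return False
        elif key == "$and":
            if not all(matches(document, branch) for branch in condition):
                return False
        elif not field_matches(document.get(key), condition):
            return False
    return True


# The same shape as the example: qty below 30, or item starting with "p".
query = {"$or": [{"qty": {"$lt": 30}}, {"item": re.compile("^p")}]}

inventory = [
    {"item": "journal", "qty": 25},
    {"item": "paper", "qty": 100},
    {"item": "notebook", "qty": 50},
]

found = [doc["item"] for doc in inventory if matches(doc, query)]
```

<p>There’s no parser anywhere in that. The query is already a data structure, so
composing two queries is just building a bigger dict.</p>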
<p>On the other hand, strings are universal.
Some programs may not have strong notions of objects or structures.
Like maybe something written in bash (or <a class="reference external" href="https://en.wikipedia.org/wiki/Scratch_(programming_language)">Scratch</a>?). Or maybe some weird domain-specific
languages. I’m not sure.</p>
<p>Part of me believes in a future where we can compose and serialize generic query
structures that describe what content to gather or what mutations to perform.
And then, if needed, reference them from some remote procedure call/not CRUD specific format
like GraphQL.</p>
<p>Because, for putting content on the web, the databases we make and the engines they use
are important, if not the most important things.
But our time is spent, over and over, on writing protocols for moving that
information around and also they can’t talk to each other.</p>
<section id="while-we-re-on-the-topic-of-serialization-formats-and-the-efficiency-of-text-vs-binary">
<h3>while we’re on the topic of serialization formats and the efficiency of text-vs-binary<a class="self-link" title="link to this section" href="#while-we-re-on-the-topic-of-serialization-formats-and-the-efficiency-of-text-vs-binary"></a></h3>
<p>It’s been claimed that binary formats are cheaper because they don’t repeat keys.
For instance, each element of a JSON list is self-describing and will contain all its
keys as strings, even if every element has the same structure. Whereas something like
protobuf doesn’t need to represent object keys as strings because you have a schema to
determine how to map bytestreams to objects and their members.</p>
<p>However, this has nothing to do with whether the format is binary or text.
MessagePack and CBOR are binary formats where objects are self-describing.
Do you know what self-describing text format doesn’t repeat keys for each object in a sequence?
Comma-separated values.<a class="footnote-reference superscript" href="#footnote-8" id="footnote-reference-8" role="doc-noteref"><span class="fn-bracket">[</span>8<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-8" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-8">8</a><span class="fn-bracket">]</span></span>
<p>Almost as well liked as XML.</p>
</aside>
</aside>
</aside>
<p>What if our JSON had a header too?</p>
<pre class="code json literal-block"><code><span class="p">[</span><span class="w">
</span><span class="p">[</span><span class="w">
</span><span class="s2">"title"</span><span class="p">,</span><span class="w">
</span><span class="s2">"published_at"</span><span class="p">,</span><span class="w">
</span><span class="p">{</span><span class="w"> </span><span class="nt">"author"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w"> </span><span class="s2">"id"</span><span class="p">,</span><span class="w"> </span><span class="s2">"name"</span><span class="w"> </span><span class="p">]</span><span class="w"> </span><span class="p">}</span><span class="w">
</span><span class="p">],</span><span class="w">
</span><span class="p">[</span><span class="w">
</span><span class="s2">"interesting post"</span><span class="p">,</span><span class="w">
</span><span class="s2">"Fri Nov 22 11:56:49 PST 2019"</span><span class="p">,</span><span class="w">
</span><span class="p">[</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="s2">"spongebob"</span><span class="w"> </span><span class="p">]</span><span class="w">
</span><span class="p">],</span><span class="w">
</span><span class="p">[</span><span class="w">
</span><span class="s2">"why starfish are the best"</span><span class="p">,</span><span class="w">
</span><span class="s2">"Fri Nov 22 13:12:32 PST 2019"</span><span class="p">,</span><span class="w">
</span><span class="p">[</span><span class="w"> </span><span class="mi">2</span><span class="p">,</span><span class="w"> </span><span class="s2">"Patrick Star"</span><span class="w"> </span><span class="p">]</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">]</span></code></pre>
<p>I imagine this is how a lot of wire protocols for relational tabular databases already
work. So this is probably not a new thing.
I just think it’s pretty memes that CSV has this over every other popular serialization format.
I mean, I don’t really know if this is a meaningful improvement;
maybe it’s something to check out.
I think it’s fair to say that this impairs human-readability at least a little bit,
especially as things nest.
But that’s kind of irrelevant for binary formats anyway.</p>
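<p>A minimal sketch of that header idea, if you want to poke at it. The
<code>pack</code> and <code>unpack</code> names are made up, it only handles flat records
(not the nested <code>author</code> header from the example), and it assumes a non-empty
list where every record has the same keys.</p>

```python
import json


def pack(records):
    """Encode a list of same-shaped dicts as [header, row, row, ...]."""
    header = list(records[0])
    return [header] + [[record[key] for key in header] for record in records]


def unpack(table):
    """Rebuild the list of dicts from [header, row, row, ...]."""
    header, *rows = table
    return [dict(zip(header, row)) for row in rows]


records = [
    {"title": "interesting post", "published_at": "2019-11-22T11:56:49"},
    {"title": "why starfish are the best", "published_at": "2019-11-22T13:12:32"},
]

packed = pack(records)

# Serialized sizes, to compare the two layouts.
plain_size = len(json.dumps(records))
packed_size = len(json.dumps(packed))
```

<p>The saving grows with the number of rows, since the keys are written once instead of
once per record. Whether that matters after compression is a separate question.</p>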
</section></section><p>Querying web APIs is an awful experience.</p>
2019-11-22T12:00:00-08:00https://froghat.ca/2019/07/tooting-in-a-portable-serviceTooting in a portable service2019-07-27T12:00:00-07:00sqwishy<p>Earlier this month, the drive for my root filesystem on my home server failed.
After replacing the drive, <a class="reference external" href="https://joinmastodon.org/">Mastodon</a> needed a bit of lovin’ to get working again.
So I thought about how I could complicate it by using containers to isolate it from
the host.
I wrote about it in my previous post, <a class="reference external" href="../tooting-in-a-container">Tooting in a container</a>.</p>
<p>A couple hundred hours later,
my new drive failed as it, and the former, were apparently involved in a suicide pact.</p>
<p>I’ve been waiting patiently
for the world’s slowest retailer of computer equipment
to transport some replacement drives to a nearby
outlet for me to buy & pick up.
I got them the other day and set everything up again, this
time with raid1 mirroring in LVM.
I also rebuilt it with some parts from an old employer and the machine now boasts
one gigabit ethernet. 2008, here I come!</p>
<p>While I was waiting for the replacement disks, I thought about how to redo my Mastodon
install.
I considered using LXD because it’s fun to use
and fairly well documented
and it makes it easy to <a class="reference external" href="https://stgraber.org/2017/06/15/custom-user-mappings-in-lxd-containers/">map user and groups</a> between a host and its unprivileged containers.</p>
<aside class="aside">
<p>I might still use LXD to replace the Docker containers running <a class="reference external" href="https://sonarr.tv">sonarr</a> and <a class="reference external" href="https://radarr.video">radarr</a>.
Since those are packaged for Arch Linux, they might be easier to keep up-to-date.</p>
</aside>
<p>A concern was that it seemed geared towards running machines as containers
rather than single services.
While you can configure a container’s init process to be any program,
doing so is not at all typical.
Moreover, I wanted something that worked nicely with systemd in terms of
logging and service management.
I’m still not quite sure if that is difficult with LXD.
I haven’t found an obvious solution like integration with <code>machinectl</code> or anything.</p>
<p>I stumbled upon <a class="reference external" href="https://systemd.io/PORTABLE_SERVICES">systemd’s portable services feature</a>/<code>portablectl</code>
and it seems pretty appropriate.
It’s not as comprehensive as something like LXD,
but it gives me isolated services that I can interact with through <code>systemctl</code> and
<code>journalctl</code> as if they were any other service. Which is really nice.
And making multiple systemd services available from one portable service container
thingy is handy for Mastodon since it has three services, <code>mastodon-web</code>,
<code>mastodon-streaming</code>, <code>mastodon-sidekiq</code>. Getting that to work was a problem with my
<code>systemd-nspawn</code> approach.</p>
<p>Previously, I had to separate out the service’s home directory (which is used
for runtime, configuration, and state…) and bind mount that over the read-only root
filesystem in the container. This was mainly a result of the three services detail.
I don’t think this is necessary with portable services.<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a> But I’ve
included it in the configuration below because that’s the way that it is.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>In an ideal world you probably would have application configuration & state made
available from the host to the portable service through a bind mount or something, so
that updating the service can be done simply by replacing the rootfs with one that
includes the new runtime. With Mastodon, however, the runtime, configuration, and
state and everything are not self-contained.</p>
<p>See also <a class="reference external" href="https://www.freedesktop.org/software/systemd/man/systemd.exec.html">systemd.exec(5)</a> and options like <code>StateDirectory</code> and
<code>ConfigurationDirectory</code> (and others). They seem to be a nice mechanism for this.</p>
</aside>
</aside>
</aside>
<p>The steps for building the portable service that I’m using
are nearly the same as in my previous post. Except:</p>
<ul>
<li><p>The portable service rootfs is at <code>/var/lib/portables/mastodon</code>, instead of
<code>/var/lib/machines/mastodon</code>.</p></li>
<li><p>The <a class="reference external" href="https://github.com/tootsuite/mastodon/tree/v2.9.2/dist">service files for Mastodon</a>, found in the sources under <code>dist</code>, need to be
copied to a place (in the “container”) where systemd unit files are normally found.
I put mine in <code>/usr/local/lib/systemd/system</code>.<a class="footnote-reference superscript" href="#footnote-2" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a></p></li>
<li><p>After attaching the portable service, <code>portablectl attach mastodon</code>,
supplement each service with the following override.
For me, these went to:</p>
<ul class="simple">
<li><p><code>/etc/systemd/system.attached/mastodon-web.service.d/30-memes.conf</code></p></li>
<li><p><code>/etc/systemd/system.attached/mastodon-streaming.service.d/30-memes.conf</code></p></li>
<li><p><code>/etc/systemd/system.attached/mastodon-sidekiq.service.d/30-memes.conf</code></p></li>
</ul>
<p>You probably want to link them if you want them all the same and maybe change them.</p>
<p>Below are the contents of this override file with some comments.</p>
</li>
</ul>
<aside class="aside">
<aside class="footnote superscript" id="footnote-2" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-2">2</a><span class="fn-bracket">]</span></span>
<p>Maybe they should be in <code>/usr</code>
instead of <code>/usr/local</code> but I always feel bad whenever I think about doing that
because I’m not a package maintainer and <code>/usr</code> is for important packages from
important maintainers.</p>
</aside>
</aside>
</aside>
<section id="memes-conf">
<h2>30-memes.conf<a class="self-link" title="link to this section" href="#memes-conf"></a></h2>
<pre class="code ini literal-block"><code><span class="k">[Service]</span><span class="w">
</span><span class="c1"># This allows the service to communicate to</span><span class="w">
</span><span class="c1"># postgres over a UNIX socket.</span><span class="w">
</span><span class="na">BindPaths</span><span class="o">=</span><span class="s">/var/run/postgresql</span><span class="w">
</span><span class="c1"># Without this it seems like the service won't be</span><span class="w">
</span><span class="c1"># able to access /home/mastodon. systemd.exec(5)</span><span class="w">
</span><span class="c1"># suggests that, by setting this to tmpfs, we can</span><span class="w">
</span><span class="c1"># make our service home directory available to the</span><span class="w">
</span><span class="c1"># service with BindPaths.</span><span class="w">
</span><span class="na">ProtectHome</span><span class="o">=</span><span class="s">tmpfs</span><span class="w">
</span><span class="na">BindPaths</span><span class="o">=</span><span class="s">/home/mastodon</span><span class="w">
</span><span class="c1"># My systemctl show-environment doesn't have</span><span class="w">
</span><span class="c1"># /bin in PATH. But the portable services will</span><span class="w">
</span><span class="c1"># expect it for /bin/bash and maybe other things.</span><span class="w">
</span><span class="na">Environment</span><span class="o">=</span><span class="s">PATH=/usr/local/bin:/usr/bin:/bin</span><span class="w">
</span><span class="c1"># Needed for some ruby gem "blurhash" to use ffi...?</span><span class="w">
</span><span class="na">MemoryDenyWriteExecute</span><span class="o">=</span><span class="s">no</span></code></pre>
<p>You may need to suit this to your needs if you want to use it. Some things, like the
<code>Environment</code> and <code>MemoryDenyWriteExecute</code> options, probably ought to be part of the
service file that is published from the container; since they have to do with
Mastodon’s ability to function as a portable service and less to do with the host
configuration. But I tried leaving Mastodon’s service files unmodified so I can
overwrite them if they change upstream without having to keep track of my modifications.</p>
<pre class="code full-width literal-block"><code>sqwishy@banana ~> sudo systemctl status -n0 mastodon-{web,sidekiq,streaming}
● mastodon-web.service - mastodon-web
Loaded: loaded (/var/lib/portables/mastodon/usr/local/lib/systemd/system/mastodon-web.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system.attached/mastodon-web.service.d
└─10-profile.conf, 20-portable.conf, 30-memes.conf
Active: active (running) since Sat 2019-07-27 16:51:19 PDT; 9min ago
Main PID: 17531 (bundle)
Tasks: 29 (limit: 4649)
Memory: 410.3M
CGroup: /system.slice/mastodon-web.service
├─17531 puma 3.12.1 (tcp://0.0.0.0:3000) [live]
├─17607 puma: cluster worker 0: 17531 [live]
└─17609 puma: cluster worker 1: 17531 [live]
● mastodon-sidekiq.service - mastodon-sidekiq
Loaded: loaded (/var/lib/portables/mastodon/usr/local/lib/systemd/system/mastodon-sidekiq.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system.attached/mastodon-sidekiq.service.d
└─10-profile.conf, 20-portable.conf, 30-memes.conf
Active: active (running) since Sat 2019-07-27 16:51:21 PDT; 9min ago
Main PID: 17560 (bundle)
Tasks: 12 (limit: 4649)
Memory: 205.2M
CGroup: /system.slice/mastodon-sidekiq.service
└─17560 sidekiq 5.2.7 live [0 of 5 busy]
● mastodon-streaming.service - mastodon-streaming
Loaded: loaded (/var/lib/portables/mastodon/usr/local/lib/systemd/system/mastodon-streaming.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system.attached/mastodon-streaming.service.d
└─10-profile.conf, 20-portable.conf, 30-memes.conf
Active: active (running) since Sat 2019-07-27 16:51:19 PDT; 9min ago
Main PID: 17526 (node)
Tasks: 18 (limit: 4649)
Memory: 50.6M
CGroup: /system.slice/mastodon-streaming.service
├─17526 /usr/bin/node ./streaming
└─17551 /usr/bin/node /home/mastodon/live/streaming</code></pre>
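<p>Since all three services get the same <code>30-memes.conf</code>, linking them to one
shared file keeps them from drifting apart. Here’s a hedged sketch of that in Python;
the <code>link_dropins</code> helper and the location of the shared file are my own
invention, and it replaces whatever already exists at the link path.</p>

```python
from pathlib import Path

UNITS = ("mastodon-web", "mastodon-streaming", "mastodon-sidekiq")


def link_dropins(attached_dir, shared_conf, name="30-memes.conf"):
    """Symlink one shared drop-in into each unit's .service.d directory."""
    links = []
    for unit in UNITS:
        dropin_dir = Path(attached_dir) / f"{unit}.service.d"
        dropin_dir.mkdir(parents=True, exist_ok=True)
        link = dropin_dir / name
        # Replace a stale link or an earlier copy of the file.
        if link.is_symlink() or link.exists():
            link.unlink()
        link.symlink_to(shared_conf)
        links.append(link)
    return links


# Example call (needs root; the shared file path is hypothetical):
# link_dropins("/etc/systemd/system.attached", "/etc/mastodon/30-memes.conf")
```

<p>After changing drop-ins, run <code>systemctl daemon-reload</code> so systemd picks
them up.</p>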
<p>Lennart Poettering also has <a class="reference external" href="http://0pointer.net/blog/walkthrough-for-portable-services.html">a walkthrough for portable services</a> on his blag; check it
out if you’re interested in looking at some proper instructions and explanation.</p>
</section><p>Running <a class="reference external" href="https://joinmastodon.org/">Mastodon</a> services inside of a <a class="reference external" href="https://systemd.io/PORTABLE_SERVICES">systemd portable service</a>.</p>
2019-07-27T12:00:00-07:00https://froghat.ca/2019/07/tooting-in-a-containerTooting in a container2019-07-07T12:00:00-07:00sqwishy<p>I run an instance of <a class="reference external" href="https://joinmastodon.org/">Mastodon</a> on a Fedora host.
Building Mastodon is a bit of a pain, so I’ve made it worse by trying to build & run it
under an <a class="reference external" href="https://alpinelinux.org/">Alpine Linux</a> container with <a class="reference external" href="https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html">systemd-nspawn</a>.</p>
<p>But, before I poorly document what I did to get it to work,
I have a rant about what I had on my mind while I was doing this to myself.</p>
<section id="rant">
<h2>Rant<a class="self-link" title="link to this section" href="#rant"></a></h2>
<p>A decade ago, I had a different attitude toward my machines than I do now.
I was comfortable installing whatever packages or programs
to build some one-off thing
or run a service that I would use for a month or so
until something broke or I got bored.
None of it really mattered for long.</p>
<p>For the most part, I didn’t deal with the sort of problems that you solve
if you want to compile or distribute anything for someone else to run.
When writing software for myself, or for a single configuration,
not a lot of thought goes into organization around source, build, runtime,
configuration, and data.
It all kinda blends into one thing–maybe more so with interpreted languages.
In my personal experience, for what that’s worth, this is a common attitude.
Especially in web development.<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>I think some computer games designed for Windows are a little similar.
One in particular had a Linux server that required write permission on, and
ownership of, every file that made up its program install, and required
the working directory at runtime to be the installation directory. Some of its runtime
state would be written under the install directory. As a result, running the
program simultaneously from a single install with multiple configurations could
only be accomplished with <code>overlayfs</code>.</p>
</aside>
</aside>
</aside>
<p>For a time, it was (and may still be?) quite popular to include
minified/bundled JavaScript or compiled/transpiled TypeScript/CoffeeScript
with the source code in the <abbr title="Version Control System">VCS</abbr>
for software projects in those languages.
Amazon scans public git repositories on GitHub for AWS secrets
because people version control that information alongside source code.</p>
<p>Treating your sources, build artefacts, and deployment secrets as separate
things with different objectives in space & time
is more complicated (more work) compared to treating them all the same.
I’ve worked with people who, for that reason,
don’t believe that they should think about these things.<a class="footnote-reference superscript" href="#footnote-2" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-2" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-2">2</a><span class="fn-bracket">]</span></span>
<p>At a new job, I found we had a website in Node.js that served static files
(among other things) over HTTP. They also used git to push updates with the
sources. I discovered you could <code>git clone https://...</code> the website to copy
the entire thing and its history without any authentication.</p>
</aside>
</aside>
</aside>
<p>The culture improvised and came up with their own designs and
I remember people struggling to use “task runners”, like Bower or Grunt,
as glorified shell scripts with flaky parallelism & incremental rebuilds thrown on after-the-fact.
While build systems, even those as simple as <a class="reference external" href="https://ninja-build.org">ninja</a>, fit the purpose well.
And I’d be disregarded when advocating for these programs on the basis that they’re “antiquated”;
as well as when advocating for Go for the opposite reason.</p>
<p>I can’t shake the idea that Docker is a similar thing
in that it allowed developers to go from
running their Node.js services as root in <code>screen</code> without supervision
to running them as root in Docker supervised by Kubernetes
or something cloud like that.
And maybe that’s a good thing.
Maybe <code>screen</code> <em>was</em> the best they could ever do.
And Docker is preferable to that
and cool enough that people are motivated to use it.</p>
<aside class="aside">
<p>At some point, did we miss the part where sometimes developers and operators have
different relationships to software?
Or that maybe a program shouldn’t have the same relationship to its binaries & its state/data?
I don’t know. Maybe I’m just afraid of change.</p>
</aside>
<p>But I wonder if the momentum of modern software development
has been set by people who, like me,
were oblivious to software life-cycle doctrines in other ecosystems.
Like how init systems, package management, and build systems were used to solve
problems.</p>
<p>As someone who doesn’t need their containers to auto-scale,
running a service with Docker has yet to be
easier than running a service by installing a package.
Techniques to decouple applications from their hosts,
with things like static linking or containers,
seem like the responsibility of system administration.
Not something that developers should generally impose.</p>
<p>Also, I don’t hate containers.
I think it’s cool that one can copy a container to another system and run it there.
(Sort of like building packages on one machine and installing them on another.)
I believe namespaces and isolation are useful and can be used to solve problems.
I like that I can make an OpenSUSE container on my Fedora host to build
RPMs for an OpenSUSE system without a chroot or a VM or whatever.
Because, man, do I hate VMs.</p>
</section><section id="the-container">
<h2>The Container<a class="self-link" title="link to this section" href="#the-container"></a></h2>
<aside class="aside">
<p>This was done when Mastodon was at version 2.9.2.</p>
</aside>
<p><a class="reference external" href="https://docs.joinmastodon.org/administration/installation/">Mastodon’s installation instructions</a> tell you to install Node.js and npm and
Yarn and stuff in order to build and run its three services, mastodon-sidekiq,
mastodon-web, mastodon-streaming.</p>
<p>It seems odd to me, but it looks like they have you build a specific version of Ruby
specially for Mastodon, instead of using the system Ruby.<a class="footnote-reference superscript" href="#footnote-3" id="footnote-reference-3" role="doc-noteref"><span class="fn-bracket">[</span>3<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-3" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-3">3</a><span class="fn-bracket">]</span></span>
<p>You’re linking this special Ruby against your system libraries though.
So when you upgrade system packages and they are no longer ABI compatible, Mastodon breaks.
So I’m not entirely sure what the point of that is.</p>
</aside>
</aside>
</aside>
<p>Alternatively,
they provide some instructions on how you can use
docker-compose to run Mastodon and its dependent services, Redis & PostgreSQL.
Now, I want to run those two services on my host.
But I can’t even figure out
how to get Docker to work with my init system properly,
so I definitely don’t know how to convey dependencies
between docker-compose containers and services running on my host.
And figuring that out isn’t as interesting as all this.</p>
<aside class="aside">
<p>These steps I did on a more powerful machine than the server
that Mastodon will eventually run on.</p>
</aside>
<p>First, I grabbed and unpacked the desired Alpine Linux rootfs
(see their <a class="reference external" href="https://alpinelinux.org/downloads/">downloads page</a>) with <code>machinectl</code>. This unpacked
the container’s rootfs at <code>/var/lib/machines/mrtooty</code>.</p>
<pre class="code shell full-width literal-block"><code>machinectl<span class="w"> </span>pull-tar<span class="w"> </span>--verify<span class="o">=</span>checksum<span class="w"> </span><span class="se">\
</span><span class="w"> </span>http://dl-cdn.alpinelinux.org/alpine/v3.10/releases/x86_64/alpine-minirootfs-3.10.0-x86_64.tar.gz<span class="w"> </span><span class="se">\
</span><span class="w"> </span>mrtooty</code></pre>
<p>We could run <code>systemd-nspawn -M mrtooty</code> to get
a shell, but if we do, we’ll get some complaints about our tty and job control won’t
work. I don’t know why. Thankfully, <a class="reference external" href="https://github.com/systemd/systemd/issues/1431#issuecomment-197586093">this helpful comment on github</a> tells us what
to do about it. So we can use the following to run the <code>ash</code> shell in our container.</p>
<pre class="code shell full-width literal-block"><code>systemd-nspawn<span class="w"> </span>-M<span class="w"> </span>mrtooty<span class="w"> </span>--<span class="w"> </span>/sbin/getty<span class="w"> </span>-nl<span class="w"> </span>/bin/ash<span class="w"> </span><span class="m">0</span><span class="w"> </span>/dev/console</code></pre>
<aside class="aside">
<p>I prefer the <code>fish</code> shell to <code>ash</code>, so at this point I installed fish with <code>apk add
fish</code> and started using fish. But you can do whatever you want.</p>
</aside>
<p>Some of the instructions for Mastodon are specific to Ubuntu. But basically it tells you
to install Node.js, npm, Yarn, and a bunch of packages with development headers so you
can build Mastodon or Ruby or something. This is what I ended up installing in order to
build Ruby and run things.</p>
<pre class="code shell full-width literal-block"><code>apk<span class="w"> </span>add<span class="w"> </span>imagemagick<span class="w"> </span>ffmpeg<span class="w"> </span>libpq<span class="w"> </span>postgresql-dev<span class="w"> </span>libxml2<span class="w"> </span><span class="se">\
</span><span class="w"> </span>libxslt<span class="w"> </span>file<span class="w"> </span>git<span class="w"> </span>g++<span class="w"> </span>protobuf<span class="w"> </span>protobuf-dev<span class="w"> </span>pkgconf<span class="w"> </span>nodejs<span class="w"> </span><span class="se">\
</span><span class="w"> </span>npm<span class="w"> </span>gcc<span class="w"> </span>autoconf<span class="w"> </span>bison<span class="w"> </span>yaml-dev<span class="w"> </span>readline-dev<span class="w"> </span>zlib-dev<span class="w"> </span><span class="se">\
</span><span class="w"> </span>ncurses-dev<span class="w"> </span>libffi<span class="w"> </span>gdbm<span class="w"> </span>gdbm-dev<span class="w"> </span>yarn<span class="w"> </span>libidn-dev<span class="w"> </span>icu-dev<span class="w"> </span><span class="se">\
</span><span class="w"> </span>openssl-dev<span class="w"> </span>bash<span class="w"> </span>make<span class="w"> </span>linux-headers<span class="w"> </span>gcompat<span class="w"> </span>su-exec</code></pre>
<p>Two important additions above are:</p>
<ul class="simple">
<li><p><code>gcompat</code>, which provides <code>/lib/ld-linux-x86-64.so.2</code>. Otherwise the
<code>mastodon-streaming</code> service will fail to start because the library is required by
<code>node_modules/@clusterws/cws/dist/cws_linux_64.node</code> under the Mastodon sources.</p></li>
<li><p><code>su-exec</code> which I use in the systemd service file to switch users.</p></li>
</ul>
<p>At this point, I followed their instructions pretty closely.
I made a user for the Mastodon services and started bash.</p>
<pre class="code shell full-width literal-block"><code>apk<span class="w"> </span>add<span class="w"> </span>shadow<span class="w">
</span>useradd<span class="w"> </span>mastodon<span class="w"> </span>--system<span class="w"> </span>--create-home<span class="w">
</span>su<span class="w"> </span>-s<span class="w"> </span>/bin/bash<span class="w"> </span>-<span class="w"> </span>mastodon</code></pre>
<p>The rest of the instructions for <a class="reference external" href="https://docs.joinmastodon.org/admin/install/#installing-ruby">installing Ruby</a> should
pretty much work.
Except I didn’t build with jemalloc because it’s
not available in Alpine Linux; it’s not minimalist enough for musl or something.</p>
<p>After that, the instructions go on to talk about PostgreSQL which I’ll skip since I’m not covering that here.</p>
<p>The
<a class="reference external" href="https://docs.joinmastodon.org/administration/installation/#setting-up-mastodon">“Setting up Mastodon”</a>
section, which checks out the Mastodon source code into <code>~/live</code>
(still as the <code>mastodon</code> user)
and runs <code>bundle</code> and <code>yarn</code>, is important and should <em>just work</em>.</p>
<p>We can quit out of our shell to stop the container. At this point, I’m
moving the container over to the host that it will run on.</p>
</section><section id="the-toot">
<h2>The Toot<a class="self-link" title="link to this section" href="#the-toot"></a></h2>
<p>This is where things get really stupid. But, by the time I had realized my mistake, so
many compromises were made that it was too late to turn back.</p>
<p>The trick is to run the three mastodon services
(mastodon-sidekiq, mastodon-web, mastodon-streaming)
on a single rootfs.</p>
<p>If you try and start an nspawn-machine that is already running, it will tell you that
the rootfs is busy. And apparently this is <a class="reference external" href="https://lists.freedesktop.org/archives/systemd-devel/2018-September/041327.html">a good thing for a good reason</a>.</p>
<p>We can scoot by this
since Mastodon doesn’t need to modify the rootfs while our services are running.
It just needs to modify its home directory to some degree,
at least for user uploaded attachments.</p>
<p>So here I’ve taken <code>/home/mastodon</code> from the container out of the rootfs directory tree
and put it at <code>/home/mastodon</code> on the host. Now I run <code>systemd-nspawn</code> with a few extra
options which are <a class="reference external" href="https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html">documented online</a>:</p>
<ul class="simple">
<li><p><code>--register no</code>; Keeps the container from registering with machinectl. The
machinectl commands won’t really work to control these services anyway since they
require the container to run systemd & dbus and ours doesn’t.</p></li>
<li><p><code>--bind /var/run/postgresql</code>; I connect to PostgreSQL over a UNIX domain socket, so
this makes that work. I’m pretty sure the UID in the container must match a
UID on the host that is authenticated with PostgreSQL for this to work. (Install
<code>postgresql-client</code> and run <code>psql --host /var/run/postgresql -d mastodon_production</code>
in the container to test it out.)</p></li>
<li><p><code>--bind /home/mastodon</code>; This will mount our install of Mastodon–readable and
writable.</p></li>
<li><p><code>--keep-unit</code>; Prevents systemd from creating a new scope for the container. Without
this, only the first service will run and subsequent services will fail to create a
mrtooty scope. I don’t really know what scopes are all about. So hopefully this option
isn’t bad.</p></li>
<li><p><code>--read-only</code>; Sets up the rootfs read only to the containers so that we can start
three services in their own containers using the same rootfs.</p></li>
</ul>
<p>My service files follow. But I’m pretty sure they don’t work right. Like I bet the pid
for each service is for <code>systemd-nspawn</code> so <code>ExecReload</code> sends <code>SIGUSR1</code> to nspawn and
just takes it out.</p>
<section id="mastodon-sidekiq-service">
<h3>mastodon-sidekiq.service<a class="self-link" title="link to this section" href="#mastodon-sidekiq-service"></a></h3>
<pre class="code ini full-width literal-block"><code><span class="k">[Unit]</span><span class="w">
</span><span class="na">After</span><span class="o">=</span><span class="s">network.target</span><span class="w">
</span><span class="k">[Service]</span><span class="w">
</span><span class="na">Type</span><span class="o">=</span><span class="s">notify</span><span class="w">
</span><span class="na">ExecStart</span><span class="o">=</span><span class="s">systemd-nspawn -M mrtooty </span>\<span class="w">
</span><span class="s">--register no </span>\<span class="w">
</span><span class="s">--bind /var/run/postgresql </span>\<span class="w">
</span><span class="s">--bind /home/mastodon </span>\<span class="w">
</span><span class="s">--keep-unit </span>\<span class="w">
</span><span class="s">--read-only </span>\<span class="w">
</span><span class="s">-- /sbin/su-exec mastodon </span>\<span class="w">
</span><span class="s">/usr/bin/env DB_POOL=5 RAILS_ENV=production </span>\<span class="w">
</span><span class="s">bash -c 'cd /home/mastodon/live && /home/mastodon/.rbenv/shims/bundle exec sidekiq -c 5 -q default -q mailers -q pull -q push'</span><span class="w">
</span><span class="na">TimeoutSec</span><span class="o">=</span><span class="s">45</span><span class="w">
</span><span class="na">Restart</span><span class="o">=</span><span class="s">always</span><span class="w">
</span><span class="k">[Install]</span><span class="w">
</span><span class="na">WantedBy</span><span class="o">=</span><span class="s">multi-user.target</span><span class="w">
</span><span class="k">[Unit]</span><span class="w">
</span><span class="na">After</span><span class="o">=</span><span class="s">network.target</span></code></pre>
</section><section id="mastodon-streaming-service">
<h3>mastodon-streaming.service<a class="self-link" title="link to this section" href="#mastodon-streaming-service"></a></h3>
<pre class="code ini full-width literal-block"><code><span class="k">[Service]</span><span class="w">
</span><span class="na">Type</span><span class="o">=</span><span class="s">notify</span><span class="w">
</span><span class="na">ExecStart</span><span class="o">=</span><span class="s">systemd-nspawn -M mrtooty </span>\<span class="w">
</span><span class="s">--register no </span>\<span class="w">
</span><span class="s">--bind /var/run/postgresql </span>\<span class="w">
</span><span class="s">--bind /home/mastodon </span>\<span class="w">
</span><span class="s">--keep-unit </span>\<span class="w">
</span><span class="s">--read-only </span>\<span class="w">
</span><span class="s">-- /sbin/su-exec mastodon </span>\<span class="w">
</span><span class="s">/usr/bin/env PORT=4000 NODE_ENV=production </span>\<span class="w">
</span><span class="s">bash -c 'cd /home/mastodon/live && /usr/bin/npm run start'</span><span class="w">
</span><span class="na">TimeoutSec</span><span class="o">=</span><span class="s">45</span><span class="w">
</span><span class="na">Restart</span><span class="o">=</span><span class="s">always</span><span class="w">
</span><span class="na">RestartSec</span><span class="o">=</span><span class="s">5</span><span class="w">
</span><span class="k">[Install]</span><span class="w">
</span><span class="na">WantedBy</span><span class="o">=</span><span class="s">multi-user.target</span></code></pre>
</section><section id="mastodon-web-service">
<h3>mastodon-web.service<a class="self-link" title="link to this section" href="#mastodon-web-service"></a></h3>
<pre class="code ini full-width literal-block"><code><span class="k">[Unit]</span><span class="w">
</span><span class="na">After</span><span class="o">=</span><span class="s">network.target</span><span class="w">
</span><span class="k">[Service]</span><span class="w">
</span><span class="na">Type</span><span class="o">=</span><span class="s">notify</span><span class="w">
</span><span class="na">ExecStart</span><span class="o">=</span><span class="s">systemd-nspawn -M mrtooty </span>\<span class="w">
</span><span class="s">--register no </span>\<span class="w">
</span><span class="s">--bind /var/run/postgresql </span>\<span class="w">
</span><span class="s">--bind /home/mastodon </span>\<span class="w">
</span><span class="s">--keep-unit </span>\<span class="w">
</span><span class="s">--read-only </span>\<span class="w">
</span><span class="s">-- /sbin/su-exec mastodon </span>\<span class="w">
</span><span class="s">/usr/bin/env PORT=3000 RAILS_ENV=production </span>\<span class="w">
</span><span class="s">bash -c 'cd /home/mastodon/live && /home/mastodon/.rbenv/shims/bundle exec puma -C config/puma.rb'</span><span class="w">
</span><span class="na">ExecReload</span><span class="o">=</span><span class="s">/bin/kill -SIGUSR1 $MAINPID</span><span class="w">
</span><span class="na">TimeoutSec</span><span class="o">=</span><span class="s">45</span><span class="w">
</span><span class="na">Restart</span><span class="o">=</span><span class="s">always</span><span class="w">
</span><span class="na">RestartSec</span><span class="o">=</span><span class="s">5</span><span class="w">
</span><span class="k">[Install]</span><span class="w">
</span><span class="na">WantedBy</span><span class="o">=</span><span class="s">multi-user.target</span></code></pre>
<p>Anyway, don’t do this.
Like most things I do in life, I realized I was probably making a mistake but it was too late to
turn back.
This is a dumb hobby.</p>
</section></section><p>Running <a class="reference external" href="https://joinmastodon.org/">Mastodon</a> services inside of an <a class="reference external" href="https://alpinelinux.org/">Alpine Linux</a> <a class="reference external" href="https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html">systemd-nspawn</a> container.</p>
2019-07-07T12:00:00-07:00https://froghat.ca/2019/12/how-to-not-update-dependencies-in-rustHow to not update dependencies in Rust2019-12-30T12:00:00-08:00sqwishy<figure class="no-upscale figure">
<a class="reference external image-reference" href="rust-the-programming-language-the-video-game.png"><img alt="A Discord newcommer looking for a partner to play rust with them. And a reply explaining that the Discord server they're in is about Rust the programming language, not the video game." src="rust-the-programming-language-the-video-game.png" /></a>
<figcaption>How to get out of playing <em>Rust: The Programming Language: The Game</em> with someone
without saying to them that you don’t want to play with them.</figcaption>
</figure>
<p><a class="reference external" href="https://www.rust-lang.org/">Rust</a>, uses a program called Cargo to do things like run tests, invoke a compiler, and
manage dependencies.</p>
<p>A program or library in Rust is referred to by Cargo as a “crate”. Cargo can be
configured to do things for your crate if you give it a TOML document called Cargo.toml.
That file specifies your crate’s dependencies. Cargo will fetch the dependencies
over the information superhighway in order to build your crate.
It will also fetch your dependencies’ dependencies.</p>
<p>Suppose your program depends on A version 0.1 and B v0.1.
Cargo will download and use A v0.1 and B v0.1 so it can build your program. Hooray!</p>
<p>Now, it has been a while since you picked those versions and you’ve noticed that B v0.2 was
released at some point. So just adjust your dependencies in the Cargo.toml file.</p>
<pre class="code diff literal-block"><code><span class="gd">-B = "0.1"</span><span class="w">
</span><span class="gi">+B = "0.2"</span></code></pre>
<p>But, let’s say A’s API uses B’s API somehow. For example, A has a function
<code>a::do_a_stuff(_: b::B)</code> that takes an object from B as an argument.
Since using B v0.2, you can no longer build invocations of that function in your crate
and the compiler will tell you that it expected you to pass a <code>b::B</code> but instead you
passed a <code>b::B</code>.
Previously, this was a profoundly confusing experience.
Recently, the error message was improved such that it may cause less confusion than before.
It reads:</p>
<pre class="code full-width literal-block"><code>expected struct `b::B`, found a different struct `b::B`
|
= note: perhaps two different versions of crate `b` are being used?</code></pre>
<p>At this point, you may displace your frustrations onto your computer by exclaiming:
“I don’t know, computer. Why are you asking me? You resolved the dependencies.
<em>You</em> did the stupid thing that you’re asking <em>me</em> about.”</p>
<p>It is the case, in this example, that B is a dependency of both our crate and of A.
However, A still depends on B v0.1 while we started using B v0.2.</p>
<p>The sane thing to do would be to use the same versions everywhere,
but we don’t live in that world anymore because too many people on hacker news
complained about setuptools in Python.
Instead, Cargo will install both versions of B and use the appropriate one when
building each crate.</p>
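<p>Why can’t Cargo just pick one B? By default, a requirement like <code>"0.1"</code> is a caret requirement, and for 0.x versions the minor number is the breaking one, so no release satisfies both <code>"0.1"</code> and <code>"0.2"</code>. Here is a rough Python sketch of that rule. The name <code>caret_compatible</code> is made up for illustration, it ignores pre-release tags, and it assumes the candidate version has all three parts. It is not how Cargo itself is implemented.</p>

```python
def caret_compatible(requirement, version):
    """Return True if `version` satisfies a Cargo-style caret requirement.

    Assumption: `requirement` is 1-3 dot-separated integers and `version`
    is exactly three; pre-release and build metadata are not handled.
    """
    spec = [int(part) for part in requirement.split(".")]
    candidate = tuple(int(part) for part in version.split("."))
    lower = tuple(spec + [0] * (3 - len(spec)))

    # The component that must not change is the leftmost non-zero one,
    # or the last component given if the requirement is all zeros.
    pivot = next((i for i, part in enumerate(spec) if part != 0), len(spec) - 1)

    return candidate >= lower and candidate[: pivot + 1] == lower[: pivot + 1]


# A asked for B "0.1"; we asked for B "0.2".
a_accepts_new_b = caret_compatible("0.1", "0.2.0")
we_accept_old_b = caret_compatible("0.2", "0.1.0")
print(a_accepts_new_b, we_accept_old_b)
```

<p>Both checks come out false, so the resolver has no single B to offer and falls back to building two.</p>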
<p>A lot of the time, this works out okay.
You never notice that A required B and the two Bs never know about each other and the
story ends with them living out their lives in blissful ignorance.<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>And hopefully the dated packages still receive bugfixes.</p>
</aside>
</aside>
</aside>
<p>But it’s not uncommon to use a library that accepts types,
like UUIDs, dates and times, URIs, or futures, from other libraries that both you and
they then depend on.
In these cases, using different versions of the same crate seems to not work.</p>
<p>Cargo records its dependency resolutions in a file called Cargo.lock,
located adjacent to your Cargo.toml file.</p>
<p>To demonstrate, I made a new Cargo.toml file with a single dependency, <code>rand = "0.6.5"</code>,
a crate used for random number generation.</p>
<pre class="code toml literal-block"><code><span class="k">[dependencies]</span><span class="w">
</span><span class="n">rand</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"0.6.5"</span></code></pre>
<p>After running <code>cargo build</code> to resolve the dependencies, we should have a Cargo.lock
file.
Each TOML section in that file looks like a package it downloaded. We can even see that
it downloaded two different versions of rand_core for some reason.</p>
<pre class="code toml full-width literal-block"><code><span class="k">[[package]]</span><span class="w">
</span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"rand_core"</span><span class="w">
</span><span class="n">version</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"0.3.1"</span><span class="w">
</span><span class="n">source</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"registry+https://github.com/rust-lang/crates.io-index"</span><span class="w">
</span><span class="n">checksum</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"7a6fdeb83b075e8266dcc8762c22776f6877a63111121f5f8c7411e5be7eed4b"</span><span class="w">
</span><span class="n">dependencies</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="s">"rand_core 0.4.2"</span><span class="p">,</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="k">[[package]]</span><span class="w">
</span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"rand_core"</span><span class="w">
</span><span class="n">version</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"0.4.2"</span><span class="w">
</span><span class="n">source</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"registry+https://github.com/rust-lang/crates.io-index"</span><span class="w">
</span><span class="n">checksum</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"9c33a3c44ca05fa6f1807d8e6743f3824e8509beca625669633be0acbdf509dc"</span></code></pre>
<p>Perusing the rest of the file, you should see similar sections for other crates. They
may have an item named “dependencies” and, at some point, you should notice that
“rand_core 0.3.1” is listed as a dependency somewhere. Something like this:</p>
<pre class="code toml literal-block"><code><span class="k">[[package]]</span><span class="w">
</span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"rand_xorshift"</span><span class="w">
</span><span class="n">version</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"0.1.1"</span><span class="w">
</span><span class="n">dependencies</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="s">"rand_core 0.3.1"</span><span class="p">,</span><span class="w">
</span><span class="p">]</span></code></pre>
<p>Back to our example. By reading/grep-ing our Cargo.lock file, we can figure out what
versions of each library are being pulled in, and who requires them. In our case, we
might see something like</p>
<aside class="aside">
<p>Sometimes, entries in the dependencies list will not include a version. My guess is
that this is done when the package referred to only appears once in the Cargo.lock
file, so the version is not required to disambiguate the dependency.</p>
</aside>
<pre class="code toml literal-block"><code><span class="k">[[package]]</span><span class="w">
</span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"A"</span><span class="w">
</span><span class="n">version</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"0.1.0"</span><span class="w">
</span><span class="n">dependencies</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="s">"B 0.1.0"</span><span class="p">,</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="k">[[package]]</span><span class="w">
</span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"B"</span><span class="w">
</span><span class="n">version</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"0.1.0"</span><span class="w">
</span><span class="k">[[package]]</span><span class="w">
</span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"B"</span><span class="w">
</span><span class="n">version</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"0.2.0"</span><span class="w">
</span><span class="k">[[package]]</span><span class="w">
</span><span class="n">name</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"MyExampleCrate"</span><span class="w">
</span><span class="n">version</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s">"0.2.0"</span><span class="w">
</span><span class="n">dependencies</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="s">"A"</span><span class="p">,</span><span class="w">
</span><span class="s">"B 0.2.0"</span><span class="p">,</span><span class="w">
</span><span class="p">]</span></code></pre>
<p>From this, we can confirm that the B package is used twice with two different versions.
And that A depends on a different version of B than what we (MyExampleCrate) depend on.</p>
<aside class="aside">
<p><em>bonus meme:</em> Use <a class="reference external" href="https://github.com/sfackler/cargo-tree">cargo-tree</a> instead of wading through the Cargo.lock file yourself.</p>
</aside>
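<p>If you do want to wade through Cargo.lock yourself, the check can be sketched in a few lines of Python. This is only a sketch: <code>duplicated_crates</code> and <code>EXAMPLE_LOCK</code> are hypothetical names, and the pattern assumes every <code>[[package]]</code> table starts with <code>name</code> followed by <code>version</code>, as in the excerpts shown here.</p>

```python
import re
from collections import defaultdict


def duplicated_crates(lock_text):
    """Return {crate name: sorted versions} for crates locked at more than one version."""
    versions = defaultdict(set)
    # Assumption: `name` and `version` are the first two keys of each table.
    pattern = r'\[\[package\]\]\s*name = "([^"]+)"\s*version = "([^"]+)"'
    for name, version in re.findall(pattern, lock_text):
        versions[name].add(version)
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


# A trimmed-down stand-in for the lock file from the example.
EXAMPLE_LOCK = '''
[[package]]
name = "A"
version = "0.1.0"
dependencies = [
 "B 0.1.0",
]

[[package]]
name = "B"
version = "0.1.0"

[[package]]
name = "B"
version = "0.2.0"
'''

duplicates = duplicated_crates(EXAMPLE_LOCK)
print(duplicates)
```

<p>On a real project you would pass it the contents of your own Cargo.lock instead of the inline string.</p>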
<p>To solve this, we could use a version of A that uses the version of B that we would like
to use.
But this isn’t possible because the author of A hasn’t released one.
Instead, we’ll use the version of B we were using previously:</p>
<pre class="code diff literal-block"><code><span class="gd">-B = "0.2"</span><span class="w">
</span><span class="gi">+B = "0.1"</span></code></pre>
<p>In conjunction with the patch from earlier, where we switched from using B v0.1 to B
v0.2, the resulting diff is the following:</p>
<!-- Did you really just check the DOM to see if there was an empty <code> element here?
What the hell is wrong with you? -->
<pre class="code diff literal-block"><code></code></pre>
<p>And that’s how you do not upgrade the dependencies for your Rust crate at two thirty in
the morning on a Monday when you should be in bed trying to fall asleep
but instead wondering if quitting your job was worth the happiness it afforded you
and how you can even compare those things and if things have any value
other than how they affect your future or any potential you might ever have
which is probably the most important thing because it is the thing that you will be later
and there’s increasingly less of it so there’s less of you
and it’s sooner than you think
and it’s affected most the earlier that things happen in it
and the most meaningful way to change it is by doing meaningful things in the present
but in spite of all of that you’re always just fucking with dependencies in Rust.</p>
<p>An instructional on how to not update the dependencies for your program or
library written in Rust.</p>
2019-12-30T12:00:00-08:00https://froghat.ca/2019/05/scm-rightsSCM_RIGHTS2019-05-21T12:00:00-07:00sqwishy<p><code>SCM_RIGHTS</code> is a feature that allows file descriptors to be
duplicated across UNIX domain sockets.<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a> This allows you to pass access to resources
to other processes.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>Cloudflare’s blag <a class="reference external" href="https://blog.cloudflare.com/know-your-scm_rights/">has a nice post explaining it a bit better</a> and how they used it to
support TLS 1.3.</p>
</aside>
</aside>
</aside>
<p>I wanted to try using this to do some sort of process reloading without dropping
connections and to get experience with <a class="reference external" href="https://www.rust-lang.org/">Rust</a> & <a class="reference external" href="https://github.com/tokio-rs/mio">mio</a>, a library for selecting
on file descriptors (using <code>epoll</code> in Linux).</p>
<p>The source code is available at <a class="reference external" href="https://github.com/sqwishy/scmrights-thingy">github.com/sqwishy/scmrights-thingy</a>.</p>
<section id="flow">
<h2>Flow<a class="self-link" title="link to this section" href="#flow"></a></h2>
<p>My program simultaneously listens on a UNIX socket and connects over TCP to a “chat”
server (<code>nc -l --chat</code>). Data from the TCP stream is simply copied to stdout. This is
the connection we want to hand off and keep open when restarting.</p>
<p>The UNIX socket is used to convey some application-specific control messages (since we
might want to extend this protocol to convey other sorts of information in the future)
and to pass the file descriptors. The exchange is roughly:</p>
<ol class="arabic simple">
<li><p>Program <em>A</em> accepts a UNIX connection from program <em>B</em>, which is just starting up.</p></li>
<li><p><em>B</em> sends a message requesting the file descriptors. <em>A</em> reads the message.</p></li>
<li><p><em>A</em> calls <code>sendmsg</code> to send an <code>SCM_RIGHTS</code> UNIX domain socket control message. <em>B</em>
reads the message with <code>recvmsg</code>.</p></li>
<li><p><em>B</em> sends a message to <em>A</em> telling it to quit. <em>A</em> reads the message and quits.</p></li>
<li><p>After <em>A</em> quits, <em>B</em>’s UNIX connection will be closed so <em>B</em> figures that the
exchange went well. It starts up normally but uses the file descriptor acquired to
initialize the chat client.</p></li>
</ol>
<p>Since we don’t want this little dance to be happening more than once at any given
moment, only one connection is accepted from the UNIX socket at a time.</p>
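<p>The descriptor-passing step in the exchange above can be sketched with nothing but Python’s standard library (3.9 or later, on a UNIX-like system). This is not the Rust program from the repository: it runs both ends in one process, uses a pipe as a stand-in for the chat connection, and the helper name <code>hand_off</code> is made up for illustration.</p>

```python
import os
import socket


def hand_off(fd_to_send):
    """Pass one descriptor across a UNIX socket pair; return (message, received fd)."""
    side_a, side_b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with side_a, side_b:
        # "A": sendmsg() carrying an SCM_RIGHTS control message.
        socket.send_fds(side_a, [b"fds"], [fd_to_send])
        # "B": recvmsg(), then unpack the descriptors from the ancillary data.
        message, fds, _flags, _address = socket.recv_fds(side_b, 1024, 1)
    return message, fds[0]


# The pipe's read end plays the role of the connection we want to keep open.
read_end, write_end = os.pipe()
message, received = hand_off(read_end)

os.close(read_end)  # "A" quits; its copy of the descriptor goes away.
os.write(write_end, b"still connected")
data = os.read(received, 1024)  # "B" keeps reading through its duplicate.
print(message, data)

os.close(received)
os.close(write_end)
```

<p>The received descriptor is a new number that refers to the same open file, which is why closing the original does not break anything.</p>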
<!-- Also, this is a bit racy since the chat connection can drop/be rebuilt on *B* between
when it is sent to *A* and when it quits and stops reading. -->
</section><section id="implementation">
<h2>Implementation<a class="self-link" title="link to this section" href="#implementation"></a></h2>
<p>I made an example and put it on asciinema.</p>
<figure class="figure">
<a class="reference external image-reference" href="https://asciinema.org/a/KTE6Atz83WJrHegnW11vBpxzJ"><img alt="A thumbnail and link to a terminal session demonstrating a connection being handed off between processes." src="https://asciinema.org/a/KTE6Atz83WJrHegnW11vBpxzJ.svg" /></a>
<figcaption>In this example we start a chat server, two clients, and show the <code>scmrights-thingy</code>
program restarting without the connection dropping. Finally, a sequence of numbers
is sent quickly and the exchange happens without information being lost–mainly
because it’s not buffered or anything by the example program.</figcaption>
</figure>
</section><section id="why">
<h2>Why?<a class="self-link" title="link to this section" href="#why"></a></h2>
<p>My scenario is pretty naive. In more realistic applications, I’d expect that more
than just a handle to a system resource would need to be shared, like buffers or
state about the protocol.
But these could be serialized and sent across the UNIX socket as a normal packet.</p>
<aside class="aside">
<p><em>Bonus meme:</em> systemd can do this for you using <code>sd_pid_notify_with_fds</code>. See
the documentation on <a class="reference external" href="https://www.freedesktop.org/software/systemd/man/sd_notify.html">sd_notify</a>, which also suggests sharing application state
using <a class="reference external" href="http://man7.org/linux/man-pages/man2/memfd_create.2.html">memfd_create</a>.</p>
</aside>
<p>So you might use this as an approach to “hot code reloading”. Maybe you want to run a
new process to run a program under a new configuration, upgrade it to run a new version,
switch between debug & release binaries, or to show off at a dinner party.</p>
</section><p>Duplicating file descriptors between processes across a UNIX socket.</p>
2019-05-21T12:00:00-07:00https://froghat.ca/2019/05/tufteTufte2019-05-06T12:00:00-07:00sqwishy<section id="introduction">
<h2>Introduction<a class="self-link" title="link to this section" href="#introduction"></a></h2>
<p>This is an example of applying <a class="reference external" href="https://edwardtufte.github.io/tufte-css/">tufte-css</a> to <abbr title="reStructuredText">reST</abbr> markup.</p>
<p>I don’t have a blog because I never write anything. But maybe I should. I’d like to
believe that if I made a blog I would write neat stuff in it. But I probably won’t.</p>
<p>I use <a class="reference external" href="http://www.sphinx-doc.org">sphinx</a> currently and could write stuff in there but I don’t really like how the
site looks so I wouldn’t feel good about anything I did there.</p>
<aside class="aside">
<p>How do normal people write things without starting every paragraph with “I”? I am so
bad at this.</p>
</aside>
<p>I was impressed by <a class="reference external" href="https://lawler.io/">lawler.io</a> but found it didn’t work well with reST. So I’ve tried
to steal things done there and from tufte-css and hack this together.</p>
<p>A number of things are supported in reST already. Like sections, footnotes<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a>, blockquotes,
epigraphs, figures. That’s one of the reasons I like reST. For this website, however,
I’ve modified how it writes HTML to use modern elements like <code>&lt;section&gt;</code> and <code>&lt;figure&gt;</code>.
This is in part to simplify creating a stylesheet using tufte-css.</p>
<aside class="footnote-list superscript">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>This is a really long footnote that exists to make sure that it fits within the
normal column of text and doesn’t overflow off to the margin on the side.</p>
</aside>
</aside>
</section><section id="epigraphs">
<h2>Epigraphs<a class="self-link" title="link to this section" href="#epigraphs"></a></h2>
<blockquote class="epigraph">
<p>Live long and prosper.</p>
<footer class="attribution">—Gandalf, <cite>The Lion, The Witch and The Wardrobe</cite></footer>
</blockquote>
<blockquote class="epigraph">
<p>Use the force, Frodo.</p>
<footer class="attribution">—Mal Reynolds, <cite>Red Dwarf</cite></footer>
</blockquote>
<p>The following is just a regular quote.</p>
<blockquote>
<p>Many were increasingly of the opinion that they’d all made a big mistake in coming
down from the trees in the first place. And some said that even the trees had been a
bad move, and that no one should ever have left the oceans.</p>
<footer class="attribution">—<cite>The Hitchhiker’s Guide to the Galaxy</cite></footer>
</blockquote>
</section><section id="sidenotes">
<h2>Sidenotes<a class="self-link" title="link to this section" href="#sidenotes"></a></h2>
<aside class="aside">
<p>This is a margin-note. It is applied to a block instead of an inline span like with
tufte-css. Creating a <code>node.inline</code> with the <code>marginnote</code> class shouldn’t be
impossible but nesting roles in reST doesn’t actually work. It’s a super bummer.</p>
</aside>
<p>Putting content on the side is fun because reading is boring so making page layouts
unpredictable can be exciting. You know; mix things up a bit. Maybe put half of the
next page at a 90 degree angle like it’s <cite>House of Leaves</cite> or something.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-2" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-2">2</a><span class="fn-bracket">]</span></span>
<p>This is a side-note – like a footnote but on the side. Since this <code>.aside</code> is
after the paragraph it’s used in, it doesn’t line up right with where the footnote
reference is in the preceding text.</p>
</aside>
</aside>
</aside>
<p>So in tufte-css, marginnotes are inline content on the margin and sidenotes are
footnotes on the margin. In reST we can apply classes to blocks fairly easily. Doing an
inline marginnote would be awkward because I think you’d have to use a role and you
can’t nest those in reST and they’re used all over for <code>code</code>, <em>emphasis</em>,
<cite>citations</cite>, <strong>bold</strong>. (This is where I’m referencing<a class="footnote-reference superscript" href="#footnote-2" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a> a side note to
illustrate.) The nice thing about it being inline is that it’s easy to float right and
have the reference and the sidenote both horizontally aligned. And, probably more
importantly, if the footnote isn’t floated (as is the case on narrow displays) it shows
up before it’s referenced and looks silly.</p>
<p>The sidenote/<code>.aside</code> footnote is a block element sibling to the paragraph containing
its reference. The footnote is placed before the paragraph so that when it floats to the
side, its start appears level with the start of the paragraph. Which looks <em>okay</em>.<a class="footnote-reference superscript" href="#footnote-3" id="footnote-reference-3" role="doc-noteref"><span class="fn-bracket">[</span>3<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-3" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-3">3</a><span class="fn-bracket">]</span></span>
<p>Otherwise it shows up like this. Which maybe you want. But could be confusing.</p>
</aside>
</aside>
</aside>
</section><section id="figures">
<h2>Figures<a class="self-link" title="link to this section" href="#figures"></a></h2>
<figure class="figure">
<a class="reference external image-reference" href="subcount-over-time-probably.png"><img alt="A line chart comparing gifted to non-gifted twitch.tv subscriptions for a channel." src="subcount-over-time-probably.png" /></a>
<figcaption>This is a caption for the adjacent figure. It shows some probably mostly accurate
numbers for subscriptions per hour to all channels on a popular video streaming
website over three days. This caption is in the margin unless your screen size is
tiny.</figcaption>
</figure>
<p>Figures also have legends or something in reST but I can’t be bothered to figure that
out right now.</p>
<p>Using <code>.. class:: full-width</code> will make a figure look big. As shown below.</p>
<figure class="full-width figure">
<a class="reference external image-reference" href="full-width-figure.png"><img alt="A chart showing twitch.tv channel viewership over time for several channels that occasionally host or receive hosts from other channels." src="full-width-figure.png" /></a>
<figcaption>This figure has a caption that doesn’t fly off the side of the page. It does not take
the full width of the page at normal (not tiny) screen sizes.</figcaption>
</figure>
</section><section id="random-stuff">
<h2>Random Stuff<a class="self-link" title="link to this section" href="#random-stuff"></a></h2>
<p>This is a code block that pelican should make look cute with highlighting.</p>
<p>The stylesheet I used has really poor contrast. And this first code block has
<code>.full-width</code> because the lines are long.</p>
<pre class="code rust full-width literal-block"><code><span class="k">pub</span><span class="w"> </span><span class="k">fn</span> <span class="nf">duration_to_f64</span><span class="o"><</span><span class="n">D</span>: <span class="nc">Borrow</span><span class="o"><</span><span class="n">time</span>::<span class="n">Duration</span><span class="o">>></span><span class="p">(</span><span class="n">d</span>: <span class="nc">D</span><span class="p">)</span><span class="w"> </span>-> <span class="kt">f64</span> <span class="p">{</span><span class="w">
</span><span class="k">const</span><span class="w"> </span><span class="n">NANOS_PER_SEC</span>: <span class="kt">u32</span> <span class="o">=</span><span class="w"> </span><span class="mi">1_000_000_000</span><span class="p">;</span><span class="w">
</span><span class="p">(</span><span class="n">d</span><span class="p">.</span><span class="n">borrow</span><span class="p">().</span><span class="n">as_secs</span><span class="p">()</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">f64</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="p">(</span><span class="n">d</span><span class="p">.</span><span class="n">borrow</span><span class="p">().</span><span class="n">subsec_nanos</span><span class="p">()</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">f64</span><span class="p">)</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="p">(</span><span class="n">NANOS_PER_SEC</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="kt">f64</span><span class="p">)</span><span class="w">
</span><span class="p">}</span></code></pre>
<p>And if your lines aren’t inappropriately wide …</p>
<aside class="aside">
<p>… you can write something tasteless like this in the margins.</p>
</aside>
<pre class="code sql literal-block"><code><span class="k">WITH</span><span class="w"> </span><span class="k">RECURSIVE</span><span class="w"> </span><span class="n">renames</span><span class="p">(</span><span class="n">name</span><span class="p">)</span><span class="w"> </span><span class="k">AS</span><span class="w"> </span><span class="p">(</span><span class="w">
</span><span class="k">VALUES</span><span class="w"> </span><span class="p">(</span><span class="err">$</span><span class="mi">1</span><span class="p">)</span><span class="w">
</span><span class="k">UNION</span><span class="w"> </span><span class="k">ALL</span><span class="w">
</span><span class="k">SELECT</span><span class="w"> </span><span class="n">change</span><span class="p">.</span><span class="k">new</span><span class="w">
</span><span class="k">FROM</span><span class="w"> </span><span class="n">renames</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">channel_name_changes</span><span class="w"> </span><span class="n">change</span><span class="w">
</span><span class="k">WHERE</span><span class="w"> </span><span class="n">change</span><span class="p">.</span><span class="k">old</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">renames</span><span class="p">.</span><span class="n">name</span><span class="p">)</span><span class="w">
</span><span class="k">SELECT</span><span class="w"> </span><span class="n">name</span><span class="w">
</span><span class="k">FROM</span><span class="w"> </span><span class="n">renames</span><span class="p">;</span></code></pre>
<p>reST also has horizontal rules by doing <code>----</code> so here is one now.</p>
<hr class="docutils" />
<p>Wow. Wasn’t that amazing? Just incredible. What do bullet lists and enumerated
lists look like?</p>
<ul class="simple">
<li><p>8 tablespoons butter, softened (plus butter for the pan)</p></li>
<li><p>1 & 1/2 cups all-purpose flour</p></li>
<li><p>1/2 cup whole wheat flour</p></li>
<li><p>1 teaspoon salt</p></li>
<li><p>1 & 1/2 teaspoons baking powder</p></li>
<li><p>3/4 cup sugar</p></li>
<li><p>2 eggs</p></li>
<li><p>3 bananas very ripe and mashed with a fork until smooth</p></li>
<li><p>1 teaspoon vanilla extract</p></li>
<li><p>1/2 cup chopped walnuts or pecans</p></li>
<li><p>1/2 cup shredded coconut</p></li>
</ul>
<ol class="arabic simple">
<li><p>Heat the oven to 350°F. Grease a 9x5-inch loaf pan with butter.</p></li>
<li><p>Mix together the dry ingredients. With a hand mixer, a whisk, or in the food
processor, cream the butter and beat in the eggs and bananas. Stir this mixture into
the dry ingredients, just enough to combine (it’s okay if there are lumps). Gently
stir in the vanilla, nuts, and coconut.</p></li>
<li><p>Pour the batter into the loaf pan and bake for 45 to 60 minutes, until nicely
browned. A toothpick inserted in the center of the bread will come out fairly clean
when done, but because of the bananas this bread will remain moister than most. Do
not overcook. Cool on a rack for 15 minutes before removing from the pan.</p></li>
</ol>
</section><p>Nice style sheets & fonts.</p>
2019-05-06T12:00:00-07:00https://froghat.ca/2019/05/pulseaudioPulseAudio2019-05-11T12:00:00-07:00sqwishy<p>Sometimes, I stream a recording of part of my desktop to strangers on the internet for
some reason.</p>
<p>Often, the video captures just one monitor or one application and so I only want to
stream audio from one or two applications. PulseAudio has a bunch of modules that let
us configure it. Using these modules, we can send audio from selected applications to a
dedicated sink, use a monitor of the sink as an audio source for the stream, and also
listen to the sink so that we still hear the audio being streamed.</p>
<hr class="docutils" />
<p>To start, load the <a class="reference external" href="https://www.freedesktop.org/wiki/Software/PulseAudio/Documentation/User/Modules/#index3h3">null-sink module</a>, creating a new sink to send sound to.</p>
<pre class="code bash literal-block"><code>pactl<span class="w"> </span>load-module<span class="w"> </span>module-null-sink<span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="nv">sink_name</span><span class="o">=</span>strim<span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="nv">sink_properties</span><span class="o">=</span>device.description<span class="o">=</span><span class="s2">"StrimSink"</span></code></pre>
<p>On success, that’ll output the module index for the sink. (This can be used to remove it
with <code>pactl unload-module</code>.)</p>
<p>If we open <code>pavucontrol</code>, we should see our sink under “Output Devices” and a monitor
for it under the “Input Devices” tab, which we’ll use later.
The <code>device.description</code> we gave when we loaded the module is what things tend to show
when referring to our sink. If we didn’t set one, it’ll show up as “Null Output” or
something.</p>
<p>At this point we can configure things to play to or record from this sink.</p>
<aside class="aside">
<figure class="figure">
<a class="reference external image-reference" href="pa-obs.png"><img alt="A screenshot of OBS's settings dialog's audio section where OBS audio devices can be mapped to pulse audio sinks." src="pa-obs.png" /></a>
<figcaption>If we’re using <a class="reference external" href="https://obsproject.com/">OBS Studio</a>, we can configure it to capture audio from our sink.</figcaption>
</figure>
</aside>
<aside class="aside">
<figure class="figure">
<a class="reference external image-reference" href="pa-pavucontrol.png"><img alt="An audio source in pavucontrol reads "ALSA plug-in [factorio]: ALSA Playback on" next to a combobox for the output which reads "StrimSink"." src="pa-pavucontrol.png" /></a>
<figcaption>If we have an already running program connected to PulseAudio, we can change the sink
it outputs to in the “Playback” tab of <code>pavucontrol</code>.</figcaption>
</figure>
</aside>
<p>You can start a program that will output audio to that sink by setting the <code>PULSE_SINK</code>
environment variable:</p>
<pre class="code bash literal-block"><code>env<span class="w"> </span><span class="nv">PULSE_SINK</span><span class="o">=</span>strim<span class="w"> </span>paplay<span class="w"> </span>/usr/share/sounds/freedesktop/stereo/bell.oga</code></pre>
<p>At this point we can’t hear programs playing sound into the sink. But we should at least
see a short blip on the volume monitor for the sink under the “Output Devices” tab of
<code>pavucontrol</code>.</p>
<p>Technically, this is all we need to separate sound from our applications. But usually
when I’m streaming audio I also want to be able to hear it.
PulseAudio was kind enough to make a monitor of our sink (visible under “Input Devices”).
We can use a <a class="reference external" href="https://www.freedesktop.org/wiki/Software/PulseAudio/Documentation/User/Modules/#index57h3">loopback module</a> to listen to it and specify that monitor as the input
source. We could also specify a destination sink but, if we omit that, it should use the
normal sink that applications play sound to by default; which is probably what we want.</p>
<pre class="code bash literal-block"><code>pactl<span class="w"> </span>load-module<span class="w"> </span>module-loopback<span class="w"> </span><span class="nv">source</span><span class="o">=</span>strim.monitor<span class="w"> </span><span class="nv">latency_msec</span><span class="o">=</span><span class="m">1</span></code></pre>
<p>On success, we should be able to hear things that play into our null sink–like that
<code>paplay</code> command mentioned earlier. This loopback module has its own volume controls and
can be muted; it shows up under the “Recording” tab in <code>pavucontrol</code>.</p>
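<p>If you set this up often, the two <code>pactl</code> calls above can be wrapped in a small
script. The following is only a sketch: the function names are made up, the sink name and
description are the ones used in this post, and it assumes <code>pactl load-module</code> prints
the module index on success for both modules. A description containing spaces would need
extra quoting that this sketch does not handle.</p>

```python
import subprocess


def strim_commands(sink="strim", description="StrimSink", latency_msec=1):
    """Build the two pactl invocations used in this post as argv lists."""
    return [
        # A null sink for the selected applications to play into.
        ["pactl", "load-module", "module-null-sink",
         f"sink_name={sink}",
         f"sink_properties=device.description={description}"],
        # A loopback from the sink's monitor to the default output sink.
        ["pactl", "load-module", "module-loopback",
         f"source={sink}.monitor",
         f"latency_msec={latency_msec}"],
    ]


def load(commands, run=subprocess.run):
    """Run each command and collect the module index that pactl prints."""
    indexes = []
    for argv in commands:
        done = run(argv, check=True, capture_output=True, text=True)
        indexes.append(done.stdout.strip())
    return indexes


def unload(indexes, run=subprocess.run):
    """Remove the modules again, most recently loaded first."""
    for index in reversed(indexes):
        run(["pactl", "unload-module", index], check=True)


if __name__ == "__main__":
    # Dry run: only show what would be executed.
    for argv in strim_commands():
        print(" ".join(argv))
```

<p>Keeping the indexes returned by <code>load</code> around makes it easy to tear everything down
with <code>unload</code> once the stream is over.</p>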
<p>Broadcasting or recording audio from only select applications in a Linux
desktop.</p>
2019-05-11T12:00:00-07:00https://froghat.ca/2019/06/ngx-http-map-modulengx_http_map_module2019-06-22T12:00:00-07:00sqwishy<section id="overview">
<h2>Overview<a class="self-link" title="link to this section" href="#overview"></a></h2>
<p>I like to serve files.
Sometimes, I don’t want their URLs to be as guessable as the file names.
We can use a <a class="reference external" href="https://nginx.org/en/docs/http/ngx_http_map_module.html">map in nginx</a> to serve files under paths that differ from their filenames
and the <a class="reference external" href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition">Content-Disposition</a> HTTP header
to include their original filename in the response.</p>
<p>This is roughly the approach I use:</p>
<ol class="arabic simple">
<li><p>We see a request for <code>/top-secret</code>.</p></li>
<li><p>We read a mapping and find <code>/top-secret "cat-pictures.tar.gz";</code>.</p></li>
<li><p>We add the header <code>Content-Disposition: inline; filename="cat-pictures.tar.gz"</code>.</p></li>
<li><p>We serve the file named “cat-pictures.tar.gz” from disk.</p></li>
</ol>
<p>This means I can visit <code>/top-secret</code>
and, if my browser asks me to save the file because it can’t display it
or because I told it to,
it can use “cat-pictures.tar.gz” as the default filename instead of “top-secret”.</p>
<aside class="aside">
<p>This is mentioned in the nginx docs;
but keep in mind that, when using strings
(as opposed to regular expressions) for source values
in each association of the map,
nginx will ignore case when matching.
So <code>/top-secret</code> and <code>/tOp-SeCrEt</code> both work the same.</p>
</aside>
<p>A downside to this is that updating the mapping requires an nginx reload.
This is a bit inconvenient for me, as it requires sudo,
so I’ve just thrown
<code>%wheel ALL=NOPASSWD: /usr/sbin/nginx</code> into sudoers.
This is probably bad hygiene.
I think sudoers can discriminate on program arguments somehow
so you could only allow <code>nginx -t</code> and <code>nginx -s reload</code> or something
and that might be better.</p>
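<p>As far as I understand the sudoers manual, when a command in a rule is written with
arguments, the arguments given by the user have to match them. So a narrower rule might
look something like the sketch below. I haven’t battle-tested it; edit the file with
<code>visudo</code> so that a syntax error doesn’t lock you out.</p>

```
%wheel ALL=NOPASSWD: /usr/sbin/nginx -t, /usr/sbin/nginx -s reload
```

<p>With a rule like that, <code>sudo nginx -s stop</code> or a bare <code>sudo nginx</code> should still ask
for a password.</p>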
</section><section id="configuring-nginx">
<h2>Configuring nginx<a class="self-link" title="link to this section" href="#configuring-nginx"></a></h2>
<p>A very simple nginx configuration I use for this
looks roughly like the following:</p>
<pre class="code nginx full-width literal-block"><code><span class="k">map</span><span class="w"> </span><span class="nv">$uri</span><span class="w"> </span><span class="nv">$filename</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="kn">include</span><span class="w"> </span><span class="s">/srv/files.froghat.ca/map</span><span class="p">;</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="k">server</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="kn">listen</span><span class="w"> </span><span class="mi">443</span><span class="w"> </span><span class="s">ssl</span><span class="w"> </span><span class="s">http2</span><span class="p">;</span><span class="w">
</span><span class="kn">listen</span><span class="w"> </span><span class="s">[::]:443</span><span class="w"> </span><span class="s">ssl</span><span class="w"> </span><span class="s">http2</span><span class="p">;</span><span class="w">
</span><span class="kn">server_name</span><span class="w"> </span><span class="s">files.froghat.ca</span><span class="p">;</span><span class="w">
</span><span class="kn">root</span><span class="w"> </span><span class="s">/srv/files.froghat.ca/files</span><span class="p">;</span><span class="w">
</span><span class="kn">location</span><span class="w"> </span><span class="s">/</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="c1"># Ignore a trailing slash...
</span><span class="w"> </span><span class="kn">rewrite</span><span class="w"> </span><span class="s">^/(.*)/</span>$<span class="w"> </span><span class="s">/</span><span class="nv">$1</span><span class="w"> </span><span class="s">last</span><span class="p">;</span><span class="w">
</span><span class="c1"># I don't remember why we need this...
</span><span class="w"> </span><span class="kn">set</span><span class="w"> </span><span class="nv">$filename_</span><span class="w"> </span><span class="nv">$filename</span><span class="p">;</span><span class="w">
</span><span class="kn">add_header</span><span class="w"> </span><span class="s">Content-Disposition</span><span class="w"> </span><span class="s">'inline</span><span class="p">;</span><span class="w"> </span><span class="kn">filename="$filename_"'</span><span class="p">;</span><span class="w">
</span><span class="kn">try_files</span><span class="w"> </span><span class="s">/</span><span class="nv">$filename</span><span class="w"> </span><span class="p">=</span><span class="mi">404</span><span class="p">;</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span></code></pre>
<p>The <code>set $filename_ $filename;</code> line should stand out to you as
a clear indication that I have no idea what I’m doing
and that there are much better blogs for you to be reading.</p>
<p>I wrote that a long time ago and,
from what I remember,
it was necessary so that <code>add_header</code>
would properly interpolate the filename into the string or something.
I checked recently
and nothing bad seems to happen without that <code>set</code> line,
so either it was fixed at some point
or I’m just making stuff up.
You can probably omit the line and write <code>filename="$filename"</code> instead.</p>
<p>In addition to the nginx configuration above,
there is a directory at
<code>/srv/files.froghat.ca/files</code>
where I keep the files to be served
under their original unlisted filenames.
The file containing the mapping is at <code>/srv/files.froghat.ca/map</code>.</p>
<pre class="code shell full-width literal-block"><code>$<span class="w"> </span>cat<span class="w"> </span>/srv/files.froghat.ca/files/bar<span class="w">
</span>it<span class="w"> </span>works<span class="w">
</span>$<span class="w"> </span>head<span class="w"> </span>-n1<span class="w"> </span>/srv/files.froghat.ca/map<span class="w">
</span>/foo<span class="w"> </span><span class="s2">"bar"</span><span class="p">;</span><span class="w">
</span>$<span class="w"> </span>curl<span class="w"> </span>-i<span class="w"> </span>https://files.froghat.ca/foo<span class="w">
</span>HTTP/2<span class="w"> </span><span class="m">200</span><span class="w">
</span>...<span class="w">
</span>content-type:<span class="w"> </span>application/octet-stream<span class="w">
</span>content-length:<span class="w"> </span><span class="m">9</span><span class="w">
</span>content-disposition:<span class="w"> </span>inline<span class="p">;</span><span class="w"> </span><span class="nv">filename</span><span class="o">=</span><span class="s2">"bar"</span><span class="w">
</span>it<span class="w"> </span>works</code></pre>
<p>In the response, we get the appropriate content headers,
including the desired Content-Disposition.</p>
<p>Again, keep in mind that, every time the mapping file is updated,
nginx needs to be reloaded.</p>
</section><section id="building-the-map">
<h2>Building the map<a class="self-link" title="link to this section" href="#building-the-map"></a></h2>
<p>This is entirely subject to your desires and use case.
If you want the URL to be known
when the filename or content is known,
you could use a hash on one of those things.
Otherwise, you might salt the input to the hash function.</p>
<aside class="aside">
<p>I won’t tell you exactly what I do
because I’m dumb and you shouldn’t copy me.</p>
</aside>
<p>I imagine using a salt and hashing the filename
makes the most sense generally.
This assumes that if the content of a file changes, but not the filename,
then you want the associated URL to remain the same.
If, for some reason down the line,
you want to reproduce a path association,
the filename will probably be more stable than the content.
So you’ll have a better chance to get the same paths you had earlier
if your URLs are filename-dependent rather than content-dependent.</p>
<p>But I mean it’s up to you and what makes you happy.</p>
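<p>A salted, filename-based association could be sketched like this. It is not what I
actually run. The names <code>url_path</code> and <code>map_line</code> are made up, the salt is a
placeholder, HMAC-SHA-256 is just one reasonable way to mix a secret into a hash, and
base32 is used because it survives nginx matching map keys case-insensitively. It also
assumes filenames contain no double quotes.</p>

```python
import base64
import hashlib
import hmac


def url_path(filename: str, salt: bytes, length: int = 10) -> str:
    """Derive a reproducible but hard-to-guess path from a filename and a salt."""
    digest = hmac.new(salt, filename.encode("utf-8"), hashlib.sha256).digest()
    # 10 bytes encode to exactly 16 base32 characters, so there is no padding.
    token = base64.b32encode(digest[:length]).decode("ascii")
    return "/" + token.lower()


def map_line(filename: str, salt: bytes) -> str:
    """Format one association the way the included nginx map file expects."""
    return f'{url_path(filename, salt)} "{filename}";'


# The salt here is a placeholder; a real one would be kept secret.
print(map_line("cat-pictures.tar.gz", b"not-a-real-salt"))
```

<p>Running it again with the same salt and filename gives the same line, which is the
property that makes the association reproducible later on.</p>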
<p>Now, base64 encoded bytes can be URL safe
and not that much longer than the bytes they encode.
But I don’t find them especially easy
for humans to remember or communicate on their own.
So when that’s something I care about,
I’ll often use <a class="reference external" href="https://github.com/singpolyma/mnemonicode">mnemonicode</a>.</p>
<blockquote>
<p>The encoding converts 32 bits of data into 3 words from a vocabulary of 1626 words.
The words have been chosen to be easy to understand over the phone and recognizable
internationally as much as possible.</p>
<footer class="attribution">—<a class="reference external" href="http://web.archive.org/web/20101031205747/http://www.tothink.com/mnemonic/">http://web.archive.org/web/20101031205747/http://www.tothink.com/mnemonic/</a></footer>
</blockquote>
<p>For example:</p>
<pre class="code shell literal-block"><code>$<span class="w"> </span><span class="nb">echo</span><span class="w"> </span>-n<span class="w"> </span>derp<span class="w"> </span><span class="p">|</span><span class="w"> </span>mnencode<span class="w">
</span>Wordlist<span class="w"> </span>ver<span class="w"> </span><span class="m">0</span>.7<span class="w">
</span>rhino<span class="w"> </span>friday<span class="w"> </span>texas</code></pre>
<p>When I use this, I use sed to capitalize words and remove spaces.</p>
<aside class="aside">
<p>But you could hyphenate instead if you want.</p>
</aside>
<pre class="code shell literal-block"><code>$<span class="w"> </span><span class="nb">echo</span><span class="w"> </span>-n<span class="w"> </span>derp<span class="w"> </span><span class="p">|</span><span class="se">\
</span><span class="w"> </span>mnencode<span class="w"> </span><span class="m">2</span>>/dev/null<span class="w"> </span><span class="p">|</span><span class="se">\
</span><span class="w"> </span>sed<span class="w"> </span><span class="s1">'s:\S*:\u\0:g'</span><span class="w"> </span><span class="p">|</span><span class="se">\
</span><span class="w"> </span>sed<span class="w"> </span><span class="s1">'s:[ \.]::g'</span><span class="w">
</span>RhinoFridayTexas</code></pre>
<p>Again, nginx will match on this in a case-insensitive manner.
This is handy because that’s one thing fewer
that can go wrong if the URL is being
communicated verbally or whatever.</p>
<p>Finally,
hash digests can end up being pretty long when encoded with <code>mnencode</code>.
SHA-256 has a digest size of 256 bits, which is 24 human words
(as opposed to computer memory words).
I usually truncate them, but I’m not sure that is generally an acceptable thing to do.
So you maybe shouldn’t do that.</p>
<p><a class="reference external" href="https://en.wikipedia.org/wiki/SHA-3">SHA-3</a> has a thing called SHAKE that is apparently designed
for variable length outputs.</p>
<pre class="code shell full-width literal-block"><code>$<span class="w"> </span>python3<span class="w"> </span>-c<span class="w"> </span><span class="s1">'import sys, hashlib
hash = hashlib.shake_128(b"derp")
sys.stdout.buffer.write(hash.digest(4))'</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>mnencode<span class="w">
</span>Wordlist<span class="w"> </span>ver<span class="w"> </span><span class="m">0</span>.7<span class="w">
</span>belgium<span class="w"> </span>bombay<span class="w"> </span>puzzle</code></pre>
<p>You could also come up with your own encoding
that uses a larger vocabulary or something.
Three words from a vocabulary of 6,981,463,658,332
is enough to cover 128 bits. That seems doable.</p>
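<p>The arithmetic behind those word counts is easy to check. The helpers below are made
up for illustration: one finds the smallest vocabulary that fits a number of bits into a
number of words, the other counts words when every 32 bits become 3 words, as in the
mnemonicode description quoted earlier.</p>

```python
def min_vocabulary(bits: int, words: int) -> int:
    """Smallest vocabulary size v such that v ** words >= 2 ** bits."""
    target = 1 << bits
    lo, hi = 1, target
    # Binary search over integers, so there are no floating point rounding issues.
    while lo < hi:
        mid = (lo + hi) // 2
        if mid ** words >= target:
            hi = mid
        else:
            lo = mid + 1
    return lo


def words_needed(bits: int, group_bits: int = 32, group_words: int = 3) -> int:
    """Words needed when every group_bits of input becomes group_words words."""
    groups = -(-bits // group_bits)  # ceiling division
    return groups * group_words


print(min_vocabulary(32, 3))   # mnemonicode's case: 32 bits in 3 words
print(words_needed(256))       # a full SHA-256 digest
print(min_vocabulary(128, 3))  # the hypothetical 128-bits-in-3-words encoding
```

<p>The first number should agree with mnemonicode’s 1626-word list and the second with
the 24 words mentioned above.</p>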
</section><p>A fun way to serve files with nginx under vaguely mnemonic URLs.</p>
2019-06-22T12:00:00-07:00https://froghat.ca/2020/11/untitled-guest-postUntitled Guest Post2020-11-20T12:00:00-08:00John Field<aside class="aside">
<p>This post was written by John Field for me to put on my website.</p>
<p>I like his videos on <a class="reference external" href="https://www.youtube.com/c/johnfieldshow">his channel on YouTube</a>.</p>
<p>Watch his video on death; <a class="reference external" href="https://www.youtube.com/watch?v=3-kBvirpvrE">I Miss My Dumb Dead Cat</a>.</p>
</aside>
<div class="iframe-jail"><div class="iframe-jail-cell"><iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/xKowp5s1_No" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></div></div><p>The above video, where I quickly explain why the USA Network logo is perverted, is the most popular video I ever made. It’s less than a minute long and probably took me less than 20 minutes to shoot from set-up to wrap-up.</p>
<p>The whole idea was an absolute after-thought; at the time, I was making a community access TV-show, the <a class="reference external" href="https://www.youtube.com/channel/UC9Nq1knuHMan9WCk035QFgw?view_as=subscriber">John Field Show</a>. We modeled the show loosely around Sesame Street (“Sesame Street for adults” was a big goal that we worked around) specifically the show’s structure of being a variety show that was built more around informing the audience than making them laugh, so for us that meant more surreal or goofy segments that involved interviews, or filming weird situations outdoors and experimental shorts.</p>
<p>The way we would film, when we had to do studio segments, was we’d book two hours at the local community access studios (that was the limit) and set up a three-point lighting kit and try to build a couple of segments off of a couple of loose guidelines. Sometimes that meant just showing up with a bunch of props, throwing them on the floor and seeing what ideas pop out to us. Sometimes that meant more pre-determined stuff, like guest interviews or a monologue. Sometimes we would think up an idea the day of and quickly try to shoot it. We tried to avoid traditional sketches, I think just because I think…I dunno…I guess part of it was that traditional sketches were overplayed online.</p>
<p>But last year, we were going to do the shoot, and I was watching the USA network and, while looking at the logo, I was thinking as a kid how a friend explained to me that the logo looked like a dolphin sucking a man off, and I thought it would be a fun segment to shoot for the show. So, the next morning, before the shoot, I ran off to the local Office Max, got it printed on blueprint paper, and came to the studios and we shot the darn thing.</p>
<p>It’s a short video, and there’s not a lot to say about it. I think the only thing of real design to it is that I figured I’d post it on social media, so most people who’d watch it probably wouldn’t know who I am and so I would at least introduce myself. I think in videos online, it’s really important to provide context quickly. It’s something I learned from doing stand-up - people don’t know you, so you have to let them know who you are and why it is important to listen, otherwise people don’t really connect. If you’re doing a magic trick, you first have to tell the audience that you’re going to make a coin disappear, otherwise they wouldn’t know what to look for when you do the trick.</p>
<p>Similarly, I ended with telling people to get back to “the important thing you were doing” just as a nod to the fact that I’m probably a part of that never ending swath of content we all consume in a day. The whole arc of the video is I introduce myself, I say my stupid thing, and then I let you live your life.</p>
<p>So, I shot it, decided I enjoyed it enough to make it its own clip and posted it online and within minutes it got thousands of views. Before I even had a chance to post it anywhere, someone posted it on Reddit and from there it got to the top of r/videos…and then it spread to a bunch of other social media after that. I think the weirdest one was eBaum’s World - I think of the 300,000 views, 200,000 came from that. Some insane number. There was a while where that link from eBaum’s World accounted for 50% of my total views on my entire channel, which is so weird because I had no clue people were using the website after 2005.</p>
<p>Anyways, because of the video I ended up becoming a YouTube partner and getting paid for my videos. Not a lot, but still it’s nice to be paid for your efforts. I think it’s not the best thing I ever made, but it opened up opportunities for me and let people see the other work that I am a lot more proud of. I hope, at the very least, it changed how people look at the USA Logo.</p>
<p>A guest post by <a class="reference external" href="https://www.youtube.com/c/johnfieldshow">John Field</a> about a video he produced.</p>
2020-11-20T12:00:00-08:00https://froghat.ca/2020/12/pizza-part-iiPizza: Part ii2020-12-08T12:00:00-08:00sqwishy<p>In a <a class="reference external" href="/2020/06/pizza">previous post</a>, I wrote some instructions on how I cook pizza in my conventional electric oven. Those instructions were over-complicated. More concise steps are below.</p>
<p>The actual recipe (for three 300g-ish doughs) is the same and I’ll copy it here
for convenience.</p>
<aside class="aside">
<p>This is a fairly low hydration dough. I think it’s great but you might not, depending on your tastes, the flour you use, and how you bake the dough.</p>
</aside>
<table>
<tbody>
<tr><td><p>Water</p></td>
<td><p>200g</p></td>
</tr>
<tr><td><p>Salt</p></td>
<td><p>14g</p></td>
</tr>
<tr><td><p>Levain starter</p></td>
<td><p>250g</p></td>
</tr>
<tr><td><p>White flour</p></td>
<td><p>400g</p></td>
</tr>
</tbody>
</table>
<p>At some point before baking, probably while preheating the stone, shape your dough ball into pizza. I do this by stretching the dough a bit and letting it rest and then repeating that until I’m happy with the shape.
<em>Don’t let the dough sit out for too long or it will dry out and get crusty and kinda weird.</em></p>
<ol class="arabic">
<li><p>Put your pizza stone in the oven near the broiler.</p>
<p>I place mine on the 2nd highest rack; about 11cm
(4.3”) under the broiler.</p>
</li>
<li><p>Try to get the stone as hot as possible without burning your house down.</p>
<p>I use my broiler on its highest setting (260°C) and it takes about twenty-five or thirty minutes.</p>
<p>My broiler will shut off sporadically, as the stone gets above 350°C
(measured with an infrared thermometer) but it will get hotter with patience.
<em>You just have to believe in yourself.</em></p>
</li>
<li><p>Once the stone is hot enough, turn <em>off</em> the broiler and slide the pizza on the stone.</p>
<p>It is important that the pizza is <em>centered</em> under the broiler.</p>
</li>
<li><p>With the broiler off, wait two minutes. Turn the broiler on again.</p></li>
<li><p>With the broiler on, wait two and a half or three minutes; or until the pizza looks done.</p>
<p>The exact time depends a bit on how burnt you want it and how long it takes
for the broiler to do its thing. Just watch the pizza until you’re happy.</p>
</li>
</ol>
<aside class="aside">
<figure class="figure">
<a class="reference external image-reference" href="20201203_0014.jpg"><img alt="burnt pizza with burnt pesto, mozzarella, pickled banana peppers, and olive oil" src="20201203_0014.jpg" /></a>
<figcaption>burnt pizza with burnt pesto, mozzarella, pickled banana peppers, and olive oil</figcaption>
</figure>
</aside>
<figure class="figure">
<a class="reference external image-reference" href="20201203_0022.jpg"><img alt="side-profile of a couple pizza slices" src="20201203_0022.jpg" /></a>
<figcaption>side-profile of a couple pizza slices</figcaption>
</figure>
<ol class="arabic simple" start="6">
<li><p>Take the pizza out. Admire it. Then overeat to cope with the emotional despair that comes from knowing that nobody looks at you with the desire & lust that you have when you look at your pizza.</p></li>
</ol>
<p>Condensed follow-up on my pizza incinerating regimen.</p>
2020-12-08T12:00:00-08:00https://froghat.ca/2020/05/dont-woofDon’t Woof2020-05-25T12:00:00-07:00sqwishy<p>Some time last December I discovered a database called <a class="reference external" href="https://www.datomic.com/">Datomic</a>. Datomic has an
interesting <a class="reference external" href="https://docs.datomic.com/cloud/whatis/data-model.html">data model</a> where each value exists in an <em>entity</em>, <em>attribute</em>, <em>value</em>,
<em>transaction</em>, <em>operation</em> five-tuple known as a Datom.</p>
<p>The <em>transaction</em> and <em>operation</em> allow Datomic to record a history of database changes.
While this is cool, it won’t be what I’ll be talking about here.</p>
<p>The <em>entity</em>, <em>attribute</em>, <em>value</em> structure allows describing objects by grouping
multiple attributions to the same entity. Through this, a single entity can receive many
attributions, even those belonging to different types or object categories.<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>For instance, the same entity could have values for :animal/name and :pet/name.
Allowing that entity to <em>mean</em> different things in different places.</p>
</aside>
</aside>
</aside>
<p>This is a (somewhat embellished) example of <em>entity</em>, <em>attribute</em>, <em>value</em> entries from
the Datomic documentation:</p>
<blockquote>
<table>
<thead>
<tr><th class="head"><p>E</p></th>
<th class="head"><p>A</p></th>
<th class="head"><p>V</p></th>
</tr>
</thead>
<tbody>
<tr><td><p>7</p></td>
<td><p>:inv/id</p></td>
<td><p>“SKU-1234”</p></td>
</tr>
<tr><td><p>7</p></td>
<td><p>:inv/color</p></td>
<td><p>:green</p></td>
</tr>
<tr><td><p>8</p></td>
<td><p>:inv/id</p></td>
<td><p>“SKU-5678”</p></td>
</tr>
<tr><td><p>8</p></td>
<td><p>:inv/watts</p></td>
<td><p>60000</p></td>
</tr>
<tr><td><p>8</p></td>
<td><p>:doc/url</p></td>
<td><p>“http:…”</p></td>
</tr>
</tbody>
</table>
<footer class="attribution">—<a class="reference external" href="https://docs.datomic.com/cloud/whatis/data-model.html#universal">Datomic Data Model – Universal Schema</a></footer>
</blockquote>
<p>Datomic has a few important indexes around the <em>entity</em>, <em>attribute</em>, <em>value</em> structure.
Again, quoting their documentation:</p>
<blockquote>
<p>Datomic’s indexes automatically support multiple styles of data access:
row-oriented, column-oriented, document-oriented, K/V, and graph:</p>
<ul class="simple">
<li><p>The E-leading index EAVT supports efficient queries for details about particular
entities, analogous to a traditional relational database.</p></li>
<li><p>The A-leading index AEVT supports efficient queries for a single attribute across
all entities, analogous to a column store.</p></li>
<li><p>The V-leading index VAET supports efficient queries for references between
entities, analogous to a graph database.</p></li>
<li><p>The combination of EAVT and VAET supports arbitrary nested navigation among
entities. This is like a document database but vastly more flexible. You can
navigate in any direction at any time, instead of being limited to the containment
hierarchy you selected when storing the document.</p></li>
</ul>
<footer class="attribution">—<a class="reference external" href="https://docs.datomic.com/cloud/whatis/data-model.html#indexes">Datomic Data Model – Indexes</a></footer>
</blockquote>
<p>For some reason those paragraphs do not mention the AVET index, but that’s also one of
the four important indexes that make things work. To review:</p>
<blockquote>
<table>
<thead>
<tr><th class="head"><p>Index</p></th>
<th class="head"><p>Sort Order</p></th>
</tr>
</thead>
<tbody>
<tr><td><p>EAVT</p></td>
<td><p>entity / attribute / value / tx</p></td>
</tr>
<tr><td><p>AEVT</p></td>
<td><p>attribute / entity / value / tx</p></td>
</tr>
<tr><td><p>AVET</p></td>
<td><p>attribute / value / entity / tx</p></td>
</tr>
<tr><td><p>VAET</p></td>
<td><p>value / attribute / entity / tx</p></td>
</tr>
</tbody>
</table>
<footer class="attribution">—<a class="reference external" href="https://docs.datomic.com/cloud/query/raw-index-access.html#indexes">Raw Index Access – Datomic Indexes</a></footer>
</blockquote>
<p>So, Datomic lets you throw a bunch of <em>entity</em>, <em>attribute</em>, <em>value</em> triples
into a big soup and index them allowing efficient lookup of stuff like attributes on a
given entity and entities with a given attribute.</p>
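<p>To make the “big soup plus indexes” idea a bit more concrete, here is a toy Python sketch.
It is not how Datomic stores anything: real indexes are sorted and include the transaction,
and the names <code>eav</code> and <code>aev</code> are just made up here. It only shows the same
facts grouped by a different leading element.</p>

```python
from collections import defaultdict

# (entity, attribute, value) facts, borrowed from the table above.
datoms = [
    (7, ":inv/id", "SKU-1234"),
    (7, ":inv/color", ":green"),
    (8, ":inv/id", "SKU-5678"),
    (8, ":inv/watts", 60000),
    (8, ":doc/url", "http:…"),
]

# Two toy indexes over the same facts.
# Simplification: one value per (entity, attribute) pair.
eav = defaultdict(dict)  # entity -> {attribute: value}
aev = defaultdict(dict)  # attribute -> {entity: value}
for e, a, v in datoms:
    eav[e][a] = v
    aev[a][e] = v

# "Row" style access: everything known about entity 8.
print(eav[8])
# "Column" style access: every entity that has an :inv/id.
print(aev[":inv/id"])
```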
<p>What I think is very cool about the database and the <em>entity</em>, <em>attribute</em>, <em>value</em>
model are the implications for querying. Datomic uses “<a class="reference external" href="https://docs.datomic.com/cloud/whatis/data-model.html#datalog">an extended form of
Datalog</a>” to form queries. And what you get from this is a query language that makes a
lot of use of pattern matching.</p>
<section id="pattern-matching">
<h2>Pattern Matching<a class="self-link" title="link to this section" href="#pattern-matching"></a></h2>
<p>I’d roughly summarize pattern matching as being when your assertions resemble your
interrogatives. That is, the thing that you write to ask a question
would look like what you could write to state the fact of the answer if you knew it.</p>
<p>I think algebra is a really accessible reference point for this.
Suppose we solve for <em>x</em> in the following expression <code>3x = 12</code>. The fact of what <em>x</em> is
can be demonstrated by “plugging in” its value and testing that the
expression is true. In this example, the number <em>4</em> is the only substitution for <em>x</em>
(out of all the numbers) where the expression holds.</p>
<p>Applying this to something structured, like our datoms,
we might construct similar expressions to “solve for <em>x</em>” in
an <em>entity</em>, <em>attribute</em>, <em>value</em> tuple such as:</p>
<p><code>x :pet/name "Garfield"</code></p>
<p>Here, we’re asking what values could <em>x</em> be such that this expression matches an entry
in our database. Or, what entities are attributed the <em>:pet/name</em> “Garfield”.</p>
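<p>A minimal Python sketch of “solving for <em>x</em>” against a list of datoms might look like this.
The sample data and the <code>solve</code> function are invented for illustration; anything
starting with <code>?</code> is treated as a variable.</p>

```python
# Made-up sample data: (entity, attribute, value).
datoms = [
    (1, ":pet/name", "Garfield"),
    (1, ":animal/name", "cat"),
    (2, ":pet/name", "Odie"),
    (3, ":pet/name", "Garfield"),
]


def solve(pattern, datoms):
    # Elements starting with "?" are variables; anything else must match exactly.
    results = []
    for datom in datoms:
        binding = {}
        for want, have in zip(pattern, datom):
            if isinstance(want, str) and want.startswith("?"):
                binding[want] = have
            elif want != have:
                break  # a constant disagreed, so this datom is not a match
        else:
            results.append(binding)
    return results


print(solve(("?x", ":pet/name", "Garfield"), datoms))
# [{'?x': 1}, {'?x': 3}]
```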
<p>The documentation for Datomic explains their syntax for this:</p>
<blockquote>
<p>This query limits datoms to <code>:artist/name</code> “The Beatles”, and returns the entity ids
for such results:</p>
<pre class="code clojure literal-block"><code><span class="p">[</span><span class="ss">:find</span><span class="w"> </span><span class="nv">?e</span><span class="w">
</span><span class="ss">:where</span><span class="w"> </span><span class="p">[</span><span class="nv">?e</span><span class="w"> </span><span class="ss">:artist/name</span><span class="w"> </span><span class="s">"The Beatles"</span><span class="p">]]</span></code></pre>
<footer class="attribution">—<a class="reference external" href="https://docs.datomic.com/cloud/query/query-data-reference.html#query-example">Query Reference – Query Example</a></footer>
</blockquote>
<p>We can build up more interesting queries by adding more patterns and relating them
through variables.</p>
<blockquote>
<p>Unification occurs when a variable appears in more than one data pattern. In the
following query, <em>?e</em> appears twice:</p>
<pre class="code clojure literal-block"><code><span class="c1">;;which 42-year-olds like what?</span><span class="w">
</span><span class="p">[</span><span class="ss">:find</span><span class="w"> </span><span class="nv">?e</span><span class="w"> </span><span class="nv">?x</span><span class="w">
</span><span class="ss">:where</span><span class="w"> </span><span class="p">[</span><span class="nv">?e</span><span class="w"> </span><span class="ss">:age</span><span class="w"> </span><span class="mi">42</span><span class="p">]</span><span class="w"> </span><span class="p">[</span><span class="nv">?e</span><span class="w"> </span><span class="ss">:likes</span><span class="w"> </span><span class="nv">?x</span><span class="p">]]</span></code></pre>
<p>Matches for the variable <em>?e</em> must <strong>unify</strong>, i.e. represent the same value in every
clause in order to satisfy the set of clauses. So a matching <em>?e</em> must have both <em>:age</em>
42 and <em>:likes</em> for some <em>?x</em>:</p>
<pre class="code clojure literal-block"><code><span class="p">[[</span><span class="nv">fred</span><span class="w"> </span><span class="nv">pizza</span><span class="p">]</span>,<span class="w"> </span><span class="p">[</span><span class="nv">ethel</span><span class="w"> </span><span class="nv">sushi</span><span class="p">]]</span></code></pre>
<footer class="attribution">—<a class="reference external" href="https://docs.datomic.com/cloud/query/query-executing.html#unification">Executing Queries – Unification</a></footer>
</blockquote>
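<p>Unification can be sketched in a few lines of Python too. This is a naive nested-loop
version under my own assumptions (the <code>match</code> and <code>query</code> helpers and the
extra “lucy” rows are made up), not how Datomic evaluates queries.</p>

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(pattern, datom, binding):
    # Extend `binding` so that `pattern` equals `datom`, or return None.
    binding = dict(binding)
    for want, have in zip(pattern, datom):
        if is_var(want):
            if binding.setdefault(want, have) != have:
                return None  # variable already bound to something else
        elif want != have:
            return None
    return binding


def query(find, where, datoms):
    bindings = [{}]
    for pattern in where:
        extended = []
        for binding in bindings:
            for datom in datoms:
                result = match(pattern, datom, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return [[b[var] for var in find] for b in bindings]


datoms = [
    ("fred", ":age", 42),
    ("fred", ":likes", "pizza"),
    ("ethel", ":age", 42),
    ("ethel", ":likes", "sushi"),
    ("lucy", ":age", 40),
    ("lucy", ":likes", "candy"),
]

print(query(["?e", "?x"],
            [("?e", ":age", 42), ("?e", ":likes", "?x")],
            datoms))
# [['fred', 'pizza'], ['ethel', 'sushi']]
```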
<p>This paragraph is intended to
comfortably guide this section into the following one, giving the appearance of a
cohesive thought structure.</p>
</section><section id="is-the-man-who-is-tall-happy">
<h2>Is the Man Who Is Tall Happy?<a class="self-link" title="link to this section" href="#is-the-man-who-is-tall-happy"></a></h2>
<p><em>Warning: I am not a word scientist. I basically don’t cite anything and I don’t know
what I am talking about and you should not listen to me.</em></p>
<p>I happened to watch a
film titled <a class="reference external" href="https://en.wikipedia.org/wiki/Is_the_Man_Who_Is_Tall_Happy%3F">Is the Man Who Is Tall Happy?</a> where the French filmmaker Michel Gondry
interviews Noam Chomsky about linguistics.</p>
<p>Gondry asks Chomsky to explain a point made in one of Chomsky’s books and
Chomsky replies:</p>
<blockquote>
<p>Take the sentence that you gave me, “The man who is tall is happy”. If you want to
form a question from that, you take the word “is” and put it in the front. So, “Is
the man who is tall happy?” Right? That’s the question.</p>
<p>You don’t take the first occurrence of “is”. You don’t take the closest one to the
front and say, “Is the man who tall is happy?” That’s gibberish.</p>
<p>Why doesn’t the child do the simple thing? Take the first occurrence of “is” and put
it in the front? Computationally, that’s much easier than finding the <em>main</em>
occurrence which requires knowing the phrases and so on.</p>
<p>It’s the same principle in all languages. So why?</p>
</blockquote>
<p>A structure is shown to illustrate this point. It resembles the following:</p>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="themanwhoistallishappy-dm.svg">
<img alt="A very rough sentence diagram in a tree showing the distance each 'is' is to the root of the sentence structure." src="themanwhoistallishappy.svg" />
</picture>
<p>In the film, we trace a path from the beginning of the structure (at the top) down to
each “is”. The point is made that <em>more</em> grammatical structure is traversed to get to
the <span class="hl-yellow">“first” “is”</span> compared to the <span class="hl-purple">“second” “is”</span>. In terms of
the elements of the diagram, more nodes are visited and more lines are drawn in the
<span class="hl-yellow">first</span> path.<a class="footnote-reference superscript" href="#footnote-2" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-2" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-2">2</a><span class="fn-bracket">]</span></span>
<p>In this diagram, the words in the sentence are <em>not</em> forced into a row. Consequently, the nesting is emphasized.</p>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="themanwhoistallishappy2-dm.svg">
<img alt="Same as the diagram above but with the tree relaxed to pronounce depth of sentence structure." src="themanwhoistallishappy2.svg" />
</picture>
</aside>
</aside>
</aside>
<p>Chomsky continues:</p>
<blockquote>
<p>Structurally speaking, <span class="hl-purple">this</span> <em>[pointing to the “is” in “is happy”]</em> is the closest to the
front. Linearly, <span class="hl-yellow">this</span> <em>[pointing to the “is” in “is tall”]</em> is the closest to the front.</p>
<p>Now the question is: Why do you use structural proximity and not linear proximity?</p>
<p>The child picks structural closeness because that’s a property of language
probably genetically determined.</p>
</blockquote>
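<p>The two kinds of “closeness” can be counted with a few lines of Python. The tree below is my own
simplified, hand-written parse (a real syntactician would draw it differently), and the node labels
are only there for readability.</p>

```python
# A hand-written, simplified parse of "the man who is tall is happy".
# Each node is (label, children...); leaves are plain words.
tree = ("S",
        ("NP", "the", "man",
         ("RelClause", "who",
          ("VP", "is", "tall"))),
        ("VP", "is", "happy"))


def leaves(node, depth=0):
    # Yield (word, depth) for each word, left to right.
    if isinstance(node, str):
        yield node, depth
    else:
        for child in node[1:]:
            yield from leaves(child, depth + 1)


# (linear position, structural depth) for each "is".
hits = [(position, depth)
        for position, (word, depth) in enumerate(leaves(tree))
        if word == "is"]
print(hits)  # [(3, 4), (5, 2)]

linearly_closest = min(hits, key=lambda hit: hit[0])
structurally_closest = min(hits, key=lambda hit: hit[1])
print(linearly_closest, structurally_closest)  # (3, 4) (5, 2)
```

<p>The “is” that comes first in the sentence sits deeper in the tree, and the one that moves to the
front to make the question is the shallower one.</p>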
<p>Chomsky’s argument is part of a larger topic that I don’t understand called <a class="reference external" href="https://en.wikipedia.org/wiki/Universal_grammar">universal
grammar</a> or something.</p>
<p>The gist, as far as I can tell, is that people have intuitions about language
that convey an advantage toward learning and mastery of languages that employ
the features that our intuitions rely on.</p>
<p>In contrast to a behaviourist model, this proposition suggests that not all possible languages can be
learned with equal fluency by a particular person given controlled stimuli.</p>
<aside class="aside">
<p><em>Again, I don’t know what I’m talking about. This is not my domain and you shouldn’t
believe anything I’ve written here.</em></p>
</aside>
<p>The influence that genetics has on our intuition may be contentious. But it’s a stronger
claim than what is needed to simply ask: if the utility of natural and
constructed languages is subject to our intuitions and dispositions, might the
effect extend to the languages we use in computing?</p>
<p>Broadly, why is language the way it is and not another way?</p>
<p>Does the answer to this question have implications for how we can design utility in
either information languages or query languages in software?</p>
<p>How can we know or measure the effect?<a class="footnote-reference superscript" href="#footnote-3" id="footnote-reference-3" role="doc-noteref"><span class="fn-bracket">[</span>3<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-3" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-3">3</a><span class="fn-bracket">]</span></span>
<p>I’m probably not going to wake up
tomorrow and read on the news that the <em>real</em> reason for the collapse of the Roman
Empire was that nobody could remember how to use a window function in their SQL query
and their bureaucracy crumbled under its own weight.</p>
<p>Or maybe I will. I guess weirder stuff has been news.</p>
</aside>
</aside>
</aside>
<p>To me, pattern matching has an appeal that I can’t really explain. Maybe it has to
do with how I happened to be taught algebra or symbolic logic or something
and it’s not shared with, or generally applicable to, other folks.
On the other hand, maybe it has to do with the way my brain, and brains generally, deal with language.
I think that would be interesting if that were the case.</p>
<p>My guess, in my capacity as
someone who doesn’t know what they’re talking about, is that there <em>are</em> important ways
in how our brain works and how language works that make certain designs more effective
than others, and that the difference between imperative and declarative programming
styles is an example of this.</p>
</section><section id="owoof">
<h2>Owoof<a class="self-link" title="link to this section" href="#owoof"></a></h2>
<p>Recently, I started making a program that could store <em>entity</em>,
<em>attribute</em>, <em>value</em> tuples into a <a class="reference external" href="https://www.sqlite.org/">SQLite</a> database and query them with a syntax
resembling Datalog.</p>
<p>Like the Datomic examples from earlier, queries should resemble a sequence of <em>entity</em>,
<em>attribute</em>, <em>value</em> tuples where each element can optionally be a variable binding.</p>
<pre class="code bash literal-block"><code>?b<span class="w"> </span>:book/title<span class="w"> </span><span class="s2">"The Complete Calvin and Hobbes"</span><span class="w">
</span>?r<span class="w"> </span>:review/book<span class="w"> </span>?b<span class="w">
</span>?r<span class="w"> </span>:review/user<span class="w"> </span>?u<span class="w">
</span>?r<span class="w"> </span>:review/score<span class="w"> </span><span class="m">1</span></code></pre>
<p>There were a couple ways to think about these queries in terms of SQL but I will just
talk about the one I implemented.</p>
<p>It’s easiest to explain this as a graph where the nodes are sets of datoms and edges are
constraints that relate the sets of datoms together. In terms of SQL, we are
selecting a datoms table one or more times and joining it with itself.</p>
<picture>
<source media="(prefers-color-scheme: dark)" srcset="owoof-projection-dm.svg">
<img alt="Four boxes with three cells each, labeled 'e', 'a', and 'v'. Lines connect cells between boxes or to specific values." src="owoof-projection.svg" />
</picture>
<p>There are four groups of boxes in the figure above. Each group
corresponds to an aliased use of the datoms table in the <em>FROM</em> clause of a <em>SELECT</em>
statement, and each edge is an expression in a conjunction in the <em>WHERE</em> clause.
The lines between boxes
correspond to equality constraints between aliased selections of the datoms table. This
is how the table is joined with itself.</p>
<p>Below is an example of a SQL query generated with this approach. For simplicity,
what is shown here differs from the real query in that we omit some details to do with
parameter binding and denormalizing things like attribute identifiers.</p>
<pre class="code sql full-width literal-block"><code><span class="k">SELECT</span><span class="w"> </span><span class="n">datoms2</span><span class="p">.</span><span class="n">v</span><span class="w">
</span><span class="k">FROM</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms0</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms1</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms2</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms3</span><span class="w">
</span><span class="k">WHERE</span><span class="w"> </span><span class="n">datoms0</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">":book/title"</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms0</span><span class="p">.</span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">"The Complete Calvin and Hobbes"</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms1</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">":review/book"</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms1</span><span class="p">.</span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms0</span><span class="p">.</span><span class="n">e</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms2</span><span class="p">.</span><span class="n">e</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms1</span><span class="p">.</span><span class="n">e</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms2</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">":review/user"</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms3</span><span class="p">.</span><span class="n">e</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms1</span><span class="p">.</span><span class="n">e</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms3</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="ss">":review/score"</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms3</span><span class="p">.</span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mi">1</span></code></pre>
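<p>To give a feel for how patterns could be turned into a query like the one above, here is a
minimal Python sketch using the standard <code>sqlite3</code> module. It is not owoof’s actual code:
<code>compile_query</code>, the three-column <code>datoms</code> table, and the sample rows are all
assumptions made for illustration. Unlike the simplified SQL above, constants are bound as parameters.</p>

```python
import sqlite3


def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def compile_query(patterns, show):
    # One alias of the datoms table per pattern; constraints go in WHERE.
    tables, where, params, seen = [], [], [], {}
    for i, pattern in enumerate(patterns):
        alias = f"datoms{i}"
        tables.append(f"datoms {alias}")
        for column, term in zip("eav", pattern):
            here = f"{alias}.{column}"
            if not is_var(term):
                where.append(f"{here} = ?")  # constant: bind as a parameter
                params.append(term)
            elif term in seen:
                where.append(f"{here} = {seen[term]}")  # seen before: join
            else:
                seen[term] = here  # first use: remember where it lives
    sql = f"SELECT {seen[show]} FROM {', '.join(tables)}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE datoms (e, a, v)")
db.executemany("INSERT INTO datoms VALUES (?, ?, ?)", [
    (1, ":book/title", "The Complete Calvin and Hobbes"),
    (2, ":review/book", 1),
    (2, ":review/user", 10),
    (2, ":review/score", 1),
    (3, ":review/book", 1),
    (3, ":review/user", 11),
    (3, ":review/score", 5),
])

sql, params = compile_query([
    ("?b", ":book/title", "The Complete Calvin and Hobbes"),
    ("?r", ":review/book", "?b"),
    ("?r", ":review/user", "?u"),
    ("?r", ":review/score", 1),
], show="?u")
print(sql)
print(db.execute(sql, params).fetchall())  # [(10,)]
```

<p>Each variable remembers the first column it was seen in, and every later use becomes an equality
constraint against that column, which is where the self-joins come from.</p>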
<p>In practice, the program takes a sequence of patterns and, optionally,
parameters to control sorting, limiting, and what to output.</p>
<p>To find datoms for the <em>:book/title</em> attribute:</p>
<pre class="code shell full-width literal-block"><code>$<span class="w"> </span>owoof<span class="w"> </span><span class="s1">'?b :book/title ?t'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>--desc<span class="w"> </span><span class="s1">'?b :book/avg-rating'</span><span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="s2">"The Complete Calvin and Hobbes"</span>,<span class="w">
</span><span class="s2">"Harry Potter Boxed Set, Books 1-5 (Harry Potter, #1-5)"</span>,<span class="w">
</span><span class="s2">"Words of Radiance (The Stormlight Archive, #2)"</span>,<span class="w">
</span><span class="s2">"ESV Study Bible"</span>,<span class="w">
</span><span class="s2">"Mark of the Lion Trilogy"</span>,<span class="w">
</span><span class="s2">"It's a Magical World: A Calvin and Hobbes Collection"</span>,<span class="w">
</span><span class="s2">"Harry Potter Boxset (Harry Potter, #1-7)"</span>,<span class="w">
</span><span class="s2">"There's Treasure Everywhere: A Calvin and Hobbes Collection"</span>,<span class="w">
</span><span class="s2">"Harry Potter Collection (Harry Potter, #1-6)"</span>,<span class="w">
</span><span class="s2">"The Authoritative Calvin and Hobbes: A Calvin and Hobbes Treasury"</span><span class="w">
</span><span class="o">]</span></code></pre>
<p>Notice that we only saw the titles. If no parameters are passed to specify what data to show
from the query, the program will look for a variable with no constraints and use that.<a class="footnote-reference superscript" href="#footnote-4" id="footnote-reference-4" role="doc-noteref"><span class="fn-bracket">[</span>4<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-4" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-4">4</a><span class="fn-bracket">]</span></span>
<p>In this case, <code>?t</code> was unconstrained, so the values for <code>?t</code> that matched entries in
our database were shown.</p>
<p>In the diagram further above, showing the book review query as a graph, the <code>?u</code> variable
in the query corresponds to the <em>v</em> cell of the top-left triad of boxes. In the graph, we see
there are no edges that end at that cell, showing that the <code>?u</code> variable has
no constraints.</p>
</aside>
</aside>
</aside>
<p>The <code>--show</code> option in the program allows controlling the shape of the output a bit.</p>
<p>Either by specifying a variable with a discrete value …</p>
<pre class="code shell literal-block"><code>$<span class="w"> </span>owoof<span class="w"> </span><span class="s1">'?b :book/title "The Complete Calvin and Hobbes"'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="s1">'?b :book/authors ?a'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>--show<span class="w"> </span><span class="s1">'?a'</span><span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="s2">"Bill Watterson"</span><span class="w">
</span><span class="o">]</span></code></pre>
<p>… or a variable bound to an entity and one or more attributes to be shown for that entity.</p>
<pre class="code shell literal-block"><code>$<span class="w"> </span>owoof<span class="w"> </span><span class="s1">'?b :book/title "The Complete Calvin and Hobbes"'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>--show<span class="w"> </span><span class="s1">'?b :book/authors'</span><span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/authors"</span>:<span class="w"> </span><span class="s2">"Bill Watterson"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span></code></pre>
<p>The next query looks for book ratings with a score of 1 for books where the author is “Dan Brown”.
The <code>--show</code> parameter is used twice to ask for two objects back for each result: one showing
the user who made the rating and another showing the book title and ISBN.</p>
<pre class="code shell literal-block"><code>$<span class="w"> </span>owoof<span class="w"> </span><span class="s1">'?r :rating/score 1'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="s1">'?r :rating/book ?b'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="s1">'?b :book/authors "Dan Brown"'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>--show<span class="w"> </span><span class="s1">'?r :rating/user'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>--show<span class="w"> </span><span class="s1">'?b :book/title :book/isbn'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>--desc<span class="w"> </span><span class="s1">'?b :book/avg-rating'</span><span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":rating/user"</span>:<span class="w"> </span><span class="m">9</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"Angels & Demons (Robert Langdon, #1)"</span>,<span class="w">
</span><span class="s2">":book/isbn"</span>:<span class="w"> </span><span class="s2">"1416524797"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span>,<span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":rating/user"</span>:<span class="w"> </span><span class="m">126</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/isbn"</span>:<span class="w"> </span><span class="s2">"1416524797"</span>,<span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"Angels & Demons (Robert Langdon, #1)"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span>,<span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":rating/user"</span>:<span class="w"> </span><span class="m">406</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/isbn"</span>:<span class="w"> </span><span class="s2">"1416524797"</span>,<span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"Angels & Demons (Robert Langdon, #1)"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span>,<span class="w">
</span><span class="c1"># (more output omitted for brevity...)</span></code></pre>
<p>This is an example of a more complicated query that shows unpopular books that were
rated highly among users that did not enjoy <cite>The Complete Calvin and Hobbes</cite>.</p>
<pre class="code shell full-width literal-block"><code><span class="w"> </span>$<span class="w"> </span>owoof<span class="w"> </span><span class="s1">'?calvin :book/title "The Complete Calvin and Hobbes"'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="s1">'?rating :rating/book ?calvin'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="s1">'?rating :rating/score 1'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="s1">'?rating :rating/user ?u'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="s1">'?more-great-takes :rating/user ?u'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="s1">'?more-great-takes :rating/book ?b'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="s1">'?more-great-takes :rating/score 5'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>--show<span class="w"> </span><span class="s1">'?b :book/title :book/avg-rating'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>--asc<span class="w"> </span><span class="s1">'?b :book/avg-rating'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>--limit<span class="w"> </span><span class="m">10</span><span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"The Short Second Life of Bree Tanner: An Eclipse Novella (Twilight, #3.5)"</span>,<span class="w">
</span><span class="s2">":book/avg-rating"</span>:<span class="w"> </span><span class="m">3</span>.51<span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/avg-rating"</span>:<span class="w"> </span><span class="m">3</span>.57,<span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"Twilight (Twilight, #1)"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/avg-rating"</span>:<span class="w"> </span><span class="m">3</span>.64,<span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"The Memory Keeper's Daughter"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/avg-rating"</span>:<span class="w"> </span><span class="m">3</span>.67,<span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"The Edible Woman"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"Breaking Dawn (Twilight, #4)"</span>,<span class="w">
</span><span class="s2">":book/avg-rating"</span>:<span class="w"> </span><span class="m">3</span>.7<span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"Romeo and Juliet"</span>,<span class="w">
</span><span class="s2">":book/avg-rating"</span>:<span class="w"> </span><span class="m">3</span>.73<span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"A Great and Terrible Beauty (Gemma Doyle, #1)"</span>,<span class="w">
</span><span class="s2">":book/avg-rating"</span>:<span class="w"> </span><span class="m">3</span>.79<span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/avg-rating"</span>:<span class="w"> </span><span class="m">3</span>.8,<span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"Northanger Abbey"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/avg-rating"</span>:<span class="w"> </span><span class="m">3</span>.81,<span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"The Taming of the Shrew"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":book/avg-rating"</span>:<span class="w"> </span><span class="m">3</span>.84,<span class="w">
</span><span class="s2">":book/title"</span>:<span class="w"> </span><span class="s2">"Of Mice and Men"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span></code></pre>
<p>The SQL query generated for the previous command looks like the following. Unlike the
example query shown earlier, this includes some gibberish that helps with value types
(the <em>t</em> column) as well as denormalization on attribute entities using their
identifier and entities using a handle that happens to be a UUID.
The only difference between what is shown below and what the program actually runs is
that I have placed the bound parameter values in comments above or near each binding.
(Also, note that <em>attributes</em>
is not a table, but a view over datoms that match <code>?e :attr/ident ?v</code>.)</p>
<pre class="code sql full-width literal-block"><code><span class="k">SELECT</span><span class="w"> </span><span class="n">datoms7</span><span class="p">.</span><span class="n">t</span><span class="p">,</span><span class="w"> </span><span class="k">CASE</span><span class="w"> </span><span class="n">datoms7</span><span class="p">.</span><span class="n">t</span><span class="w"> </span><span class="k">WHEN</span><span class="w"> </span><span class="o">-</span><span class="mi">1</span><span class="w"> </span><span class="k">THEN</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="n">uuid</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">entities</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">rowid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms7</span><span class="p">.</span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="k">ELSE</span><span class="w"> </span><span class="n">datoms7</span><span class="p">.</span><span class="n">v</span><span class="w"> </span><span class="k">END</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">datoms8</span><span class="p">.</span><span class="n">t</span><span class="p">,</span><span class="w"> </span><span class="k">CASE</span><span class="w"> </span><span class="n">datoms8</span><span class="p">.</span><span class="n">t</span><span class="w"> </span><span class="k">WHEN</span><span class="w"> </span><span class="o">-</span><span class="mi">1</span><span class="w"> </span><span class="k">THEN</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="n">uuid</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">entities</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">rowid</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms8</span><span class="p">.</span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="k">ELSE</span><span class="w"> </span><span class="n">datoms8</span><span class="p">.</span><span class="n">v</span><span class="w"> </span><span class="k">END</span><span class="w">
</span><span class="k">FROM</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms0</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms1</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms2</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms3</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms4</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms5</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms6</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms7</span><span class="w">
</span><span class="p">,</span><span class="w"> </span><span class="n">datoms</span><span class="w"> </span><span class="n">datoms8</span><span class="w">
</span><span class="c1">-- :book/title
</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">datoms0</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="n">rowid</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">attributes</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">ident</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="p">)</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms0</span><span class="p">.</span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="c1">-- "The Complete Calvin and Hobbes"
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms0</span><span class="p">.</span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="c1">-- 0
</span><span class="w"> </span><span class="c1">-- :rating/book
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms1</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="n">rowid</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">attributes</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">ident</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="p">)</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms1</span><span class="p">.</span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms0</span><span class="p">.</span><span class="n">e</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms2</span><span class="p">.</span><span class="n">e</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms1</span><span class="p">.</span><span class="n">e</span><span class="w">
</span><span class="c1">-- :rating/score
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms2</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="n">rowid</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">attributes</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">ident</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="p">)</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms2</span><span class="p">.</span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="c1">-- 1
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms2</span><span class="p">.</span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="c1">-- 0
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms3</span><span class="p">.</span><span class="n">e</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms1</span><span class="p">.</span><span class="n">e</span><span class="w">
</span><span class="c1">-- :rating/user
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms3</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="n">rowid</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">attributes</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">ident</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="p">)</span><span class="w">
</span><span class="c1">-- :rating/user
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms4</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="n">rowid</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">attributes</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">ident</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="p">)</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms4</span><span class="p">.</span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms3</span><span class="p">.</span><span class="n">v</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms5</span><span class="p">.</span><span class="n">e</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms4</span><span class="p">.</span><span class="n">e</span><span class="w">
</span><span class="c1">-- :rating/book
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms5</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="n">rowid</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">attributes</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">ident</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="p">)</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms6</span><span class="p">.</span><span class="n">e</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms4</span><span class="p">.</span><span class="n">e</span><span class="w">
</span><span class="c1">-- :rating/score
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms6</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="n">rowid</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">attributes</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">ident</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="p">)</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms6</span><span class="p">.</span><span class="n">v</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="c1">-- 5
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms6</span><span class="p">.</span><span class="n">t</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="c1">-- 0
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms7</span><span class="p">.</span><span class="n">e</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms5</span><span class="p">.</span><span class="n">v</span><span class="w">
</span><span class="c1">-- :book/title
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms7</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="n">rowid</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">attributes</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">ident</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="p">)</span><span class="w">
</span><span class="k">AND</span><span class="w"> </span><span class="n">datoms8</span><span class="p">.</span><span class="n">e</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">datoms5</span><span class="p">.</span><span class="n">v</span><span class="w">
</span><span class="c1">-- :book/avg-rating
</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="n">datoms8</span><span class="p">.</span><span class="n">a</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="k">SELECT</span><span class="w"> </span><span class="n">rowid</span><span class="w"> </span><span class="k">FROM</span><span class="w"> </span><span class="n">attributes</span><span class="w"> </span><span class="k">WHERE</span><span class="w"> </span><span class="n">ident</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="p">)</span><span class="w">
</span><span class="k">ORDER</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">datoms8</span><span class="p">.</span><span class="n">v</span><span class="w"> </span><span class="k">ASC</span><span class="w">
</span><span class="k">LIMIT</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="c1">-- 10</span></code></pre>
<section id="scale-performance">
<h3>Scale & Performance<a class="self-link" title="link to this section" href="#scale-performance"></a></h3>
<p>The point of the program was to be able to generate queries like this from a
datalog-like syntax. Being a toy program/side-project without a real use case,
feasibility wasn’t a huge concern, but the performance is still interesting.
The query above takes about 400ms to run on my laptop when the database file is in my OS cache
(or a bit under 4.5s from disk). The database file is 826MB with just over six million
datoms.<a class="footnote-reference superscript" href="#footnote-5" id="footnote-reference-5" role="doc-noteref"><span class="fn-bracket">[</span>5<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-5" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-5">5</a><span class="fn-bracket">]</span></span>
<p>For comparison, the dataset used has 3.3MB of books and 72MB of ratings in a csv
format, and I think I only imported about a quarter of the ratings.</p>
</aside>
</aside>
</aside>
<p>The dataset used is <a class="reference external" href="https://github.com/zygmuntz/goodbooks-10k">goodbooks-10k</a>.</p>
<p>Query performance has been flaky at times where minor changes in the given
patterns/constraints cause SQL query plans to change significantly enough that some
queries (that are otherwise sub-second) never finish.
Specifically, SQLite may either select <em>less than ideal</em> indexes to use for some sets
of datoms<a class="footnote-reference superscript" href="#footnote-6" id="footnote-reference-6" role="doc-noteref"><span class="fn-bracket">[</span>6<span class="fn-bracket">]</span></a> or change the order in which sets of datoms are fetched in a way that is
detrimental.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-6" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-6">6</a><span class="fn-bracket">]</span></span>
<p>Recall that each “set of datoms” corresponds to a group of <em>e</em>, <em>a</em>, <em>v</em> boxes in the
earlier diagram <em>and</em> to an aliased “datoms” table in the FROM clause of the SQL
query above.</p>
<p>For each entry in the FROM clause, SQLite will typically use one index to populate
it.</p>
</aside>
</aside>
</aside>
<p>The dataset used happened to be good at pointing this out. I import ten thousand
books and about one and a half million book ratings, so the cardinalities of these “objects” are
very unequal.</p>
<p>Normally, this would be stored in <em>different</em> tables and have <em>different</em> indexes. This
would allow SQLite to have a good idea of how big indexes will be and which tables to
start with in a query involving joins.
But, in my case, everything is in one big table of datoms.</p>
<p>Take, for example, a query that looks for books by a specific author (using <em>:book/authors</em>)
and joins that with ratings with a specific score (using <em>:rating/score</em>; a one-to-five scale).
Starting by looking for ratings will be a slower query than starting with the books,
since there are far more ratings than books. These filters make it worse, since the
cardinality of <em>:book/authors</em> is greater than that of <em>:rating/score</em>. In other words,
there are many more values for <em>:book/authors</em> than for <em>:rating/score</em>, so filtering by
a specific score doesn’t reduce the size of our selection by as much as filtering by
specific authors.</p>
<p>In fact, because of the <em>difference</em> in the number of ratings in the database compared to the
number of books,
filtering for a <em>specific</em> value of <em>:rating/score</em> will yield
more datoms than filtering for <em>any</em>/<em>all</em> values of <em>:book/title</em> even though the
former is a <em>stronger</em> constraint.
Ultimately, since all the data is in one table and they share the same indexes,
SQLite might have trouble knowing the cardinality of these indexes and might have a
hard time producing efficient query plans.</p>
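<p>The imbalance is easy to reproduce in miniature. The sketch below assumes a simplified
three-column datoms table and made-up data that keeps roughly the same one-to-150 ratio of books
to ratings, with scores spread evenly; the real dataset is not distributed that neatly.</p>

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE datoms (e, a, v)")  # simplified schema (assumption)

# 10 books, each with its own title (entities 1..10). Made-up data.
books = [(e, ":book/title", f"Book {e}") for e in range(1, 11)]
# 1500 ratings spread evenly over scores 1..5 (entities 1001..2500).
ratings = [(1000 + i, ":rating/score", i % 5 + 1) for i in range(1, 1501)]
db.executemany("INSERT INTO datoms VALUES (?, ?, ?)", books + ratings)


def count(where, params):
    """Count the datoms that match a WHERE fragment."""
    query = f"SELECT count(*) FROM datoms WHERE {where}"
    return db.execute(query, params).fetchone()[0]


# Weak constraint: any value at all for :book/title.
any_title = count("a = ?", [":book/title"])
# Strong constraint: one specific value for :rating/score.
score_one = count("a = ? AND v = ?", [":rating/score", 1])
print(any_title, score_one)
```

<p>Even though the second filter pins down both the attribute and the value, it matches thirty
times as many datoms as the first in this toy data.</p>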
<p>Running the <a class="reference external" href="https://sqlite.org/lang_analyze.html">ANALYZE</a> command has solved every instance <em>so far</em>
where a query could not complete due to a bad query plan.
But I haven’t tested this with very many complicated queries or other data sets.</p>
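<p>For a rough idea of what ANALYZE gives the planner, the sketch below runs it on a small
datoms table and reads back the statistics SQLite stores in <code>sqlite_stat1</code>. The schema
and the <code>datoms_ave</code> index are assumptions made for the example, not the indexes the
program actually creates.</p>

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE datoms (e, a, v)")  # simplified schema (assumption)
# Hypothetical index; the real program's indexes may differ.
db.execute("CREATE INDEX datoms_ave ON datoms (a, v, e)")

# Made-up data: a few books and many more ratings.
books = [(e, ":book/title", f"Book {e}") for e in range(1, 11)]
ratings = [(1000 + i, ":rating/score", i % 5 + 1) for i in range(1, 1501)]
db.executemany("INSERT INTO datoms VALUES (?, ?, ?)", books + ratings)

# ANALYZE gathers per-index statistics and stores them in sqlite_stat1.
db.execute("ANALYZE")
stats = db.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
for tbl, idx, stat in stats:
    # stat starts with the number of rows in the index, followed by the
    # average number of rows matched per distinct prefix of its columns.
    print(tbl, idx, stat)
```

<p>Those averages are what let the planner guess how selective a constraint on <em>a</em> or on
<em>a</em> and <em>v</em> is, instead of falling back on defaults.</p>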
</section><section id="schema-reflection">
<h3>Schema & Reflection<a class="self-link" title="link to this section" href="#schema-reflection"></a></h3>
<p>Aside from querying data, we can assert or retract datoms to add or remove information
from the database. Datomic keeps a transaction history of every modification so that
retractions don’t actually <em>remove</em> information but just mark it as
no-longer-being-a-fact-at-some-monotonically-increasing-transaction-level. This is
cool for reasons but my toy project does not do this so I won’t talk about it.</p>
<p>Instead, assertions and retractions literally mean inserting and deleting datoms.</p>
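<p>As a minimal sketch of what that means, assuming a simplified three-column datoms table, the
two operations reduce to an INSERT and a DELETE. The function names here are hypothetical and are
not the program’s API.</p>

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Simplified schema (assumption); the uniqueness constraint makes
# asserting the same datom twice a no-op.
db.execute("CREATE TABLE datoms (e, a, v, UNIQUE (e, a, v))")


def assert_datom(e, a, v):
    """Assert a fact by inserting its datom."""
    db.execute("INSERT OR IGNORE INTO datoms (e, a, v) VALUES (?, ?, ?)", (e, a, v))


def retract_datom(e, a, v):
    """Retract a fact by deleting its datom; no history is kept."""
    db.execute("DELETE FROM datoms WHERE e = ? AND a = ? AND v = ?", (e, a, v))


assert_datom(1, ":book/title", "The Complete Calvin and Hobbes")
assert_datom(1, ":book/authors", "Bill Watterson")
retract_datom(1, ":book/authors", "Bill Watterson")

remaining = db.execute("SELECT e, a, v FROM datoms").fetchall()
print(remaining)
```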
<p>Another feature that Datomic advertises is a self-referential or reflective interface
for defining a schema. Datoms let you model books, book ratings, animals, pets,
whatever your application domain is about.
But database properties themselves are also datoms. Datomic has a rich
interface in this respect, allowing <em>attributes</em> to receive <em>attributes</em> to <a class="reference external" href="https://docs.datomic.com/cloud/schema/schema-reference.html">specify things like
documentation, uniqueness, value type</a>. To clarify, attributes have attributes because
they are entities in the database.</p>
<p>My program implements this to a lesser and slightly broken extent.</p>
<p>Shown below is every entity, attribute, and value in a <em>new</em> database.<a class="footnote-reference superscript" href="#footnote-7" id="footnote-reference-7" role="doc-noteref"><span class="fn-bracket">[</span>7<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-7" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-7">7</a><span class="fn-bracket">]</span></span>
<p>The command line interface only knows how to group datoms into objects if you ask for
specific attributes. So instead we asked for the <em>entity</em>, <em>attribute</em>, <em>value</em> tuple
and grouped them by their entity with <code>jq</code>.</p>
</aside>
</aside>
</aside>
<pre class="code shell literal-block"><code>$<span class="w"> </span>owoof<span class="w"> </span><span class="s1">'?e ?a ?v'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>--show<span class="w"> </span><span class="s1">'?e'</span><span class="w"> </span>--show<span class="w"> </span><span class="s1">'?a'</span><span class="w"> </span>--show<span class="w"> </span><span class="s1">'?v'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">'group_by(.[0]) | .[] | map({(.[1]): .[2]}) | add'</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":attr/unique"</span>:<span class="w"> </span><span class="m">1</span>,<span class="w">
</span><span class="s2">":attr/ident"</span>:<span class="w"> </span><span class="s2">":entity/uuid"</span>,<span class="w">
</span><span class="s2">":entity/uuid"</span>:<span class="w"> </span><span class="s2">"#26eb5b7a-f6cb-4079-8b48-738c32a4c954"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":attr/ident"</span>:<span class="w"> </span><span class="s2">":attr/unique"</span>,<span class="w">
</span><span class="s2">":entity/uuid"</span>:<span class="w"> </span><span class="s2">"#378f5726-9cf0-40ba-8987-7e2beef3920e"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":attr/unique"</span>:<span class="w"> </span><span class="m">1</span>,<span class="w">
</span><span class="s2">":attr/ident"</span>:<span class="w"> </span><span class="s2">":attr/ident"</span>,<span class="w">
</span><span class="s2">":entity/uuid"</span>:<span class="w"> </span><span class="s2">"#96e193ad-5f39-4820-9c3e-651e95c62ca4"</span><span class="w">
</span><span class="o">}</span></code></pre>
<p>This shows that a new database has three entities.</p>
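<p>The <code>jq</code> filter in the command above folds entity, attribute, value triples into one object per entity. Here is roughly the same grouping sketched in Python, in case the filter is hard to read. The entity handles "e1" and "e2" are placeholders, not real output.</p>

```python
from itertools import groupby
from operator import itemgetter


def group_by_entity(triples):
    """Fold (entity, attribute, value) triples into one dict per entity."""
    ordered = sorted(triples, key=itemgetter(0))  # groupby needs sorted input
    return [
        {attribute: value for _, attribute, value in group}
        for _, group in groupby(ordered, key=itemgetter(0))
    ]


# Placeholder triples in the shape the query returns.
triples = [
    ("e2", ":attr/ident", ":attr/unique"),
    ("e1", ":attr/unique", 1),
    ("e1", ":attr/ident", ":entity/uuid"),
]
for entity in group_by_entity(triples):
    print(entity)
```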
<p>Remember that the point of the database is that the <em>attributes</em> that any <em>entity</em> has
is what makes it meaningful in a context.
For example, giving some entity a value for the <em>:attr/ident</em> attribute (which stands
for an attribute identifier) is what is required for that entity to be an attribute in
the database.</p>
<p>Each of the three entities in the result above has an <em>:attr/ident</em>, showing that they are all
attributes.</p>
<p>The first is the attribute for entity handles, which allow us to reference the entity and
identify it uniquely in the system. We can match on <em>:entity/uuid</em> in our queries to
search for specific entities or include an <em>:entity/uuid</em> in an assertion (an insert) to
add datoms about an existing entity instead of creating a new one.</p>
<p>The second is an attribute for specifying uniqueness. The presence of that attribute
means that the attribute and value pair must be unique together for the entire database. For some attributes,
like <em>:book/authors</em> or <em>:pet/name</em>, it should be valid to have the same value for
different entities, like if you have books by the same authors or pets with the same name.
But this is not valid in some contexts, like attribute identifiers. Having multiple
attributes with the same identifier is not meaningful to the database.
Setting this attribute on the attribute identifier entity enforces unique values for attribute identifiers.</p>
<p>The third entity is the entity for attribute identifiers. The way this works is a little
weird but <em>(spoiler alert)</em> a clue is in the value for the “:attr/ident” key of this
entity in the JSON object shown above.</p>
<p>Recall that attributes are just entities with values for the <em>:attr/ident</em> attribute. To
create an attribute, we can assert a datom about a new or existing entity
with the <em>:attr/ident</em> attribute and a value like “:pet/name” or something.</p>
<pre class="code shell full-width literal-block"><code>$<span class="w"> </span>jq<span class="w"> </span>-n<span class="w"> </span><span class="s1">'{":attr/ident": ":pet/name"}'</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>owoof<span class="w"> </span>assert<span class="w">
</span><span class="s2">"#1d1bb352-b157-4cd9-ba14-a900b4eb88d7"</span><span class="w">
</span>$<span class="w"> </span>owoof<span class="w"> </span><span class="s1">'1d1bb352-b157-4cd9-ba14-a900b4eb88d7 :attr/ident ?v'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span>--show<span class="w"> </span><span class="s1">'?v'</span><span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="s2">":pet/name"</span><span class="w">
</span><span class="o">]</span><span class="w">
</span>$<span class="w"> </span>jq<span class="w"> </span>-n<span class="w"> </span><span class="s1">'{":pet/name": "Garfield"}'</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>owoof<span class="w"> </span>assert<span class="w">
</span><span class="s2">"#1e0ca035-e824-4f08-b51b-09faa51c7ec9"</span><span class="w">
</span>$<span class="w"> </span>owoof<span class="w"> </span>--show<span class="w"> </span><span class="s1">'?_ :pet/name :entity/uuid'</span><span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">":entity/uuid"</span>:<span class="w"> </span><span class="s2">"#1e0ca035-e824-4f08-b51b-09faa51c7ec9"</span>,<span class="w">
</span><span class="s2">":pet/name"</span>:<span class="w"> </span><span class="s2">"Garfield"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span></code></pre>
<p>The interesting bit is in how we create the <em>:attr/ident</em> attribute in the first place.</p>
<p>Generally speaking,
given an entity <em>e</em> and a value <em>v</em>, for <em>e</em> to be an attribute, there must exist a
datom <em>e a v</em> where <em>a</em> is an entity that is the <em>:attr/ident</em> attribute.</p>
<p>Unfortunately, that definition is a bit recursive. But we can apply it to
our <em>:attr/ident</em> attribute, where <em>v</em> is <em>:attr/ident</em>.</p>
<p>Then, there must exist a datom <em>e a :attr/ident</em> where <em>a</em> is an attribute with the value
<em>:attr/ident</em>. And it turns out that if <em>a</em> is equal to <em>e</em> then it works. And since the
<em>:attr/ident</em> attribute has values that are unique for that attribute (there is at most
one <em>:attr/ident</em> attribute), then <em>e</em> must equal <em>a</em>.</p>
<p>So the datom for the <em>:attr/ident</em> attribute references the same entity in its <em>entity</em>
and <em>attribute</em> fields.</p>
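<p>To make the self-reference concrete, here is a small sketch in Python that keeps datoms in a plain set of tuples. The helper names are hypothetical and this is not how Owoof stores anything; it only shows the shape of the bootstrap datom.</p>

```python
import uuid

datoms = set()  # each datom is an (entity, attribute, value) tuple

# Bootstrap: the :attr/ident attribute is described by a datom whose
# entity and attribute fields are the same entity.
ident = uuid.uuid4()
datoms.add((ident, ident, ":attr/ident"))


def attribute_named(name):
    """Find the entity e with a datom (e, a, name) where a is :attr/ident."""
    for e, a, v in datoms:
        if a == ident and v == name:
            return e
    return None


def define_attribute(name):
    """Make a new entity an attribute by giving it an :attr/ident value."""
    e = uuid.uuid4()
    datoms.add((e, ident, name))
    return e


pet_name = define_attribute(":pet/name")
garfield = uuid.uuid4()
datoms.add((garfield, pet_name, "Garfield"))

# Looking up :attr/ident by its own identifier finds the bootstrap entity.
print(attribute_named(":attr/ident") == ident)
```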
<p>The larger point is that the interface around
information about your domain models is the same as for database models. You can query
and assert things about database attributes in the same way as you can about pet names
or recipes or whatever matters. And I think that’s super cool.</p>
<p>I have put the <a class="reference external" href="https://github.com/sqwishy/owoof">source code for Owoof</a> on GitHub.</p>
<aside class="aside">
<p>Source: <a class="reference external" href="https://youtube.com/watch?v=83m261lAlrs">https://youtube.com/watch?v=83m261lAlrs</a></p>
</aside>
<video src="Don't woof-83m261lAlrs.webm" controls></video></section></section><p>Shilling for <a class="reference external" href="https://www.datomic.com/">Datomic</a>; aimless conjecture on linguistics & pattern
matching; and a report on my poor use of <a class="reference external" href="https://www.sqlite.org/">SQLite</a>.</p>
2020-05-25T12:00:00-07:00https://froghat.ca/2020/01/some-things-i-did-in-2019Some things I did in 20192020-01-21T12:00:00-08:00sqwishy<p>At the end of 2018, I was very interested in learning <a class="reference external" href="https://www.rust-lang.org/">Rust</a> and using <a class="reference external" href="https://www.timescale.com/">TimescaleDB</a> to
build an analytics platform for <a class="reference external" href="https://www.twitch.tv/">Twitch.tv</a> (<em>click with caution;</em> loud and obnoxious auto-playing
videos).</p>
<aside class="aside">
<p><em>Content Warning:</em> Photographs of pizza have been placed in this article to break up the
text.</p>
</aside>
<section id="comfy-sheep">
<h2>Comfy Sheep<a class="self-link" title="link to this section" href="#comfy-sheep"></a></h2>
<p>For the first five-ish months of 2019, I worked on <a class="reference external" href="https://gitlab.com/sqwishy/comfy-sheep/">Comfy Sheep</a>. There’s a big writeup in
that project’s readme about everything that’s in there. Briefly, there is a program that scrapes
the Twitch.tv API for live-streams and another program that logs chat messages/stream events for
the top 60k-100k live-streams. The data collected is stored in a PostgreSQL database
that uses TimescaleDB.</p>
<p>The idea was to use the data to inform predictions and decision making about
live-streaming. But I don’t know how to do that and I don’t know anybody who does.
I guess I didn’t think that through.</p>
<p>TimescaleDB was fun to use and it worked well. Except one time it had a bug that would
segfault the PostgreSQL process running the query, causing PostgreSQL to restart
whenever the chat logger tried to write data. When the database would come back up, the
logger would connect to it again, try to write the same message again, and trigger a
crash again. Fortunately, the bug had already been fixed and all I needed to do was
package the new version<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>*<span class="fn-bracket">]</span></a> and update it on the host.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">*</a><span class="fn-bracket">]</span></span>
<p>I copied a package from <a class="reference external" href="https://build.opensuse.org/">build.opensuse.org</a> so I didn’t have to start from
scratch. And I had <a class="reference external" href="https://copr.fedorainfracloud.org/">copr.fedorainfracloud.org</a> build the binary distribution for
the version of openSUSE that the host was running. I’m grateful that those were
available for me to use.</p>
</aside>
</aside>
</aside>
<p>Partial indexes were very useful. A job would look at viewership over the past week
to determine which streams should be logged and how to distribute streams to loggers so
that they might receive roughly equal amounts of chat activity.
The query used an index that only contained samples with at least some number of viewers,
reducing the size of the index by about 80%. This, and using vacuum to update the
<a class="reference external" href="https://www.postgresql.org/docs/11/storage-vm.html">visibility map</a>, were the difference between the job taking eight seconds and
taking eight hours or more.</p>
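<p>For anyone who hasn’t used a partial index, here is a small sketch. It uses SQLite through Python’s sqlite3 module as a stand-in for PostgreSQL, since SQLite also supports partial indexes. The table, column names, and the threshold of 100 viewers are all made up for illustration.</p>

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE samples (stream TEXT, taken_at INTEGER, viewers INTEGER)")

# Partial index: only rows at or above the (made-up) viewer threshold are indexed.
db.execute(
    "CREATE INDEX samples_popular ON samples (taken_at, stream) WHERE viewers >= 100"
)

db.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [("a", 1, 5), ("b", 1, 250), ("c", 2, 0), ("b", 2, 300), ("d", 2, 120)],
)

# Repeating the index predicate in WHERE lets the planner consider the partial index.
popular = db.execute(
    "SELECT stream, MAX(viewers) FROM samples"
    " WHERE viewers >= 100 AND taken_at >= 1"
    " GROUP BY stream ORDER BY stream"
).fetchall()
print(popular)
```

<p>Most samples are of streams with hardly any viewers, so leaving those rows out of the index is where the size reduction comes from.</p>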
<p>Rust’s borrow checker and its very popular asynchronous runtime named “tokio” gave me a hard time.
There is a library called <a class="reference external" href="https://github.com/tokio-rs/mio">mio</a> (which falls under the tokio project umbrella) that is a
very nice wrapper around polling. I wish there was more effort going into figuring out how to
write libraries that could be used with event loops generally rather than large, general
purpose, all-consuming runtimes. If I want to write something single threaded that just
listens and sends on sockets, tokio seems like shooting a fly with a cannon. But
minimalism and Rust don’t really seem to go together.</p>
<img alt="pizza with basil, mozzarella, and sliced farmer sausage" src="20200108_0002.jpg" />
</section><section id="super-serious-timer-business">
<h2>Super Serious Timer Business<a class="self-link" title="link to this section" href="#super-serious-timer-business"></a></h2>
<p>I wanted to do something with WebAssembly in Rust. In March I wrote a very simple timer
kind of thing
named <a class="reference external" href="https://gitlab.com/sqwishy/super-serious-timer-business">Super Serious Timer Business</a> and put it on GitLab. It uses a Rust library
called <a class="reference external" href="https://github.com/yewstack/yew">Yew</a>, which is purportedly inspired by <a class="reference external" href="https://elm-lang.org/">Elm</a>.</p>
<p>I saw Elm later that year while looking for reactive-streaming-functional-pipeline things.
I was interested in RxJS, Rambda, Most, <a class="reference external" href="https://github.com/fantasyland/fantasy-land">Fantasy Land</a> – among others –
and was hoping to find a way to use <a class="reference external" href="https://infernojs.org/">Inferno</a> with these mechanisms <em>painlessly-ish</em>.</p>
<aside class="aside">
<p>It turns out the most painless JavaScript web development frameworks are languages
other than JavaScript. Who knew?</p>
</aside>
<p>At some point I saw something talking about how you can use these sort of things to
create functions that hook up to events and evaluate to DOM elements. And then I think
in that same article they mentioned Elm.</p>
<p>So I tried Elm and it was very fun.
It has some nice quality-of-life stuff like quick compile times (I believe the Elm
compiler is written in Haskell) and a nice code formatter.
I, personally, quite like whitespace as syntax and prefer that to marking up my
code with squiggly braces and semicolons – but that’s just me (no hate pls).</p>
<p>I’m looking forward to using more of Elm in the future as a gateway drug to functional
programming.</p>
</section><section id="scm-rights">
<h2>SCM_RIGHTS<a class="self-link" title="link to this section" href="#scm-rights"></a></h2>
<p>In May I put up this blag<a class="footnote-reference superscript" href="#footnote-2" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>†<span class="fn-bracket">]</span></a> and wrote a program that used a feature of UNIX sockets called
SCM_RIGHTS that allows duplicating file descriptors to other processes. <a class="reference external" href="/blag/scm_rights/">I blagged about it</a>.</p>
<aside class="footnote-list superscript">
<aside class="footnote superscript" id="footnote-2" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-2">†</a><span class="fn-bracket">]</span></span>
<p>I recently shared the template here: <a class="reference external" href="https://git.sr.ht/~sqwishy/froghat.ca">https://git.sr.ht/~sqwishy/froghat.ca</a></p>
</aside>
</aside>
<img alt="slightly more burnt pizza with basil, mozzarella, and sliced farmer sausage" src="20200108_0008.jpg" />
</section><section id="containers">
<h2>Containers<a class="self-link" title="link to this section" href="#containers"></a></h2>
<p>After that, I got sad for reasons and I played a lot of video games. But then I stopped
being sad and started working on something that would let me manage and deploy game
servers. I never finished it. I think because I kept increasing the scope and eventually
it got hard and I got distracted.</p>
<p>I put some of the stuff I wrote for it <a class="reference external" href="https://gitlab.com/sqwishy/shift-own">up on GitLab</a>.</p>
<p>The container stuff worked like this.</p>
<section id="runtime">
<h3>Runtime<a class="self-link" title="link to this section" href="#runtime"></a></h3>
<p>The plan was to use <code>systemd-nspawn</code> (or possibly <code>runc</code>) to run containers.
Both of these seemed very low-drama tools for creating namespaces that supported
important things like uid mapping and seccomp. They also can set up a bit of
networking if so desired. But I almost prefer to prepare the networking with iproute2
(like <code>ip link ...</code>) through systemd services. Using systemd for the network
configuration lets you model some things like dependencies, automatic restarts, and
logging or triggers on service failure.</p>
<p>A lot of the services are instanced (the unit name ends in a “@”), so there might only
be one unit file for creating network veth pairs (named “container-veth@.service” or
something), but we can start multiple instances of the services by putting a string
after the “@” (like “container-veth@bob.service”) and the service file can parameterize
its behaviour on the instance name.</p>
<p>For more control or variation among instances of a unit, we can place extra
configuration in a drop-in directory. For example, if each guest container on some host
is an instance of the “container@.service” unit, then we can add extra
configuration for the “bob” container by writing it to
“container@bob.service.d/50-extra-stuff.conf”.</p>
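<p>The naming rules are simple enough to sketch in a few lines of Python. The helper names are hypothetical, and the base directory is an assumption about where the unit files live; the unit and drop-in names are the ones from the text above.</p>

```python
from pathlib import PurePosixPath


def split_unit(unit):
    """Split 'container-veth@bob.service' into (template, instance)."""
    stem, dot, suffix = unit.rpartition(".")
    prefix, at, instance = stem.partition("@")
    if not (dot and at):
        raise ValueError(f"not an instanced unit name: {unit!r}")
    return f"{prefix}@.{suffix}", instance


def drop_in_path(unit, filename, base="/etc/systemd/system"):
    """Path of a drop-in file that applies only to this one instance."""
    return PurePosixPath(base) / f"{unit}.d" / filename


print(split_unit("container-veth@bob.service"))
print(drop_in_path("container@bob.service", "50-extra-stuff.conf"))
```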
</section><section id="volumes">
<h3>Volumes<a class="self-link" title="link to this section" href="#volumes"></a></h3>
<p>Other than configuration, containers need a root filesystem directory tree (rootfs) to run in.
This is where I got bogged down on a bunch of weird edge cases that I tried to model.</p>
<p>To simplify here, we’ll say that containers use a rootfs and the host can get a
rootfs by extracting an image to a directory somewhere.</p>
<p>It seemed like a good way to manage images was through a tool called <a class="reference external" href="https://github.com/systemd/casync">casync</a>. We can
give it a target directory tree, like our game & operating system, and it breaks all that up
into chunks, stores them, and gives us an index (.caidx) that we can use later to
reassemble the directory from the chunks.</p>
<p>The chunk storage is re-used for different indexes too. So, in the future, when I want
to make a new image of an updated version of this game, chunks that haven’t changed are
not written again.</p>
<p>The deltas operate at the level of chunks rather than files. This is particularly nice
for games which might ship a large binary blob with only a few differences in it from
the previous version.</p>
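<p>Here is a toy version of the chunk store and index idea in Python. It is a sketch under loud assumptions: it cuts fixed-size chunks and names them with SHA-256 because that is easy to show, whereas casync picks chunk boundaries from the content itself. The function names are made up.</p>

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose; real chunk sizes are far larger


def store(data, chunks):
    """Split data into chunks, save unseen ones, and return the index."""
    index = []
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        chunks.setdefault(digest, chunk)  # known chunks are not written again
        index.append(digest)
    return index


def extract(index, chunks):
    """Reassemble the original bytes from an index and the chunk store."""
    return b"".join(chunks[digest] for digest in index)


chunks = {}
v1 = store(b"AAAABBBBCCCCDDDD", chunks)
v2 = store(b"AAAABBBBXXXXDDDD", chunks)  # only the third chunk changed
print(len(chunks), "chunks stored for two four-chunk images")
print(extract(v2, chunks))
```

<p>Both indexes point at the same stored chunks wherever the two versions agree, which is why a new image of a mostly unchanged game costs very little extra storage.</p>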
<p>Chunks themselves are compressed by casync. I found that using zstd for compression was
much faster.</p>
<p>The indexes and the chunks that casync creates serve as our images. We can
give casync an index of a container filesystem and it will extract it for us. It can
also fetch chunks over the network through SSH if they are not stored locally.</p>
<img alt="also a pizza with basil, mozzarella, and sliced farmer sausage" src="20200108_0014.jpg" />
</section><section id="isolation">
<h3>Isolation<a class="self-link" title="link to this section" href="#isolation"></a></h3>
<p>One part of container isolation is mapping each user in a container to an otherwise
unused user on the host.</p>
<p>If you segment out the group and user ranges on the host into 16 bit sized ranges, you
can create reservations for your containers that look like this:</p>
<pre class="code literal-block"><code>0 root on host
1000 user on host
0x10000 root on container0
0x10000+1000 user on container0
0x20000 root on container1
0x20000+1000 user on container1
...</code></pre>
<p>The goal is that no ranges for the host or its guests overlap.</p>
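<p>The arithmetic behind those reservations, sketched in Python. The slot numbering (host is slot 0, container0 is slot 1, and so on) and the function names are assumptions for illustration.</p>

```python
RANGE = 0x10000  # 16-bit-sized reservation, as in the listing above


def reservation_base(slot):
    """Slot 0 is the host; container0 is slot 1, container1 is slot 2, ..."""
    return slot * RANGE


def host_id(slot, inner_id):
    """Map a uid or gid inside a reservation to the id used on the host."""
    if not 0 <= inner_id < RANGE:
        raise ValueError(f"id {inner_id} does not fit in a {RANGE:#x} range")
    return reservation_base(slot) + inner_id


print(hex(host_id(1, 0)))     # root on container0
print(host_id(2, 1000))       # 0x20000 + 1000, the user on container1
```

<p>Rejecting ids outside the range is what keeps one reservation from spilling into the next.</p>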
<p>An issue emerges where if container0 and container1 both use the same image, then the
host needs two copies of it with different owners.</p>
<p>A solution to this is to provide something like a bind mount that allows accessing
some path with shifted ownerships.
I think something called shiftfs has been trying to make its way into Linux
for a while. And it looks like Ubuntu might already ship with it since whenever I search for shiftfs I
find a bunch of Ubuntu security notices related to it.</p>
<p>There is also a FUSE implementation of overlayfs called <a class="reference external" href="https://github.com/containers/fuse-overlayfs">fuse-overlayfs</a> that has an
owner shifting feature. But, since that’s FUSE, that’s automatically removed from
consideration.</p>
<p>The approach I chose was to use a feature of
overlayfs (accessible with the <code>metacopy=on</code> option) which allows modifying file
attributes in an overlay without copying the file contents up from the lower layer.</p>
<p>The host then keeps only one copy of each image that its guests are using. When a guest
uses an image, we mount an overlay for that guest with the image as the lower layer
and shift the owner of every file in the overlay to be suitable for the guest.</p>
<p>During this escapade with containers and image management, I wrote several tools to help
make things work. I want to salvage a couple so I yoinked them out of the project they
were in and fixed them up a bit (rewrote them) so I could publish them along with this
blag post.</p>
<ul>
<li><p><a class="reference external" href="https://gitlab.com/sqwishy/shift-own/tree/master/shift-own">shift-own</a> is a binary that lets you chown a file or directory tree according to some
shift and range. So if you have 0x10000 sized reservations and want to make a file or
directory (and everything under it) accessible to the reservation starting at 0x30000,
then …</p>
<p><code>shift-own -s 0x30000 -r 0x10000 path/to/whatever</code></p>
<p>… will do that.</p>
</li>
<li><p><a class="reference external" href="https://gitlab.com/sqwishy/shift-own/tree/master/shift-mount">shift-mount</a> will create an overlay with <code>metacopy=on</code> and run the ownership shifting
in the overlay.</p></li>
</ul>
<p>Here’s an example of both.</p>
<pre class="code full-width literal-block"><code># ./shift-mount --oneshot /opt/alpine/ /opt/alpine-int/ /tmp/alpine-shifted/
Mounted /tmp/alpine-shifted/
4484 files under /tmp/alpine-shifted/ shifted with 0x0 using range 0x10000
# ./shift-own -s 0x30000 /tmp/alpine-shifted/ -v
0:0 -> 196608:196608 .. /tmp/alpine-shifted/
0:0 -> 196608:196608 .. /tmp/alpine-shifted/proc
0:0 -> 196608:196608 .. /tmp/alpine-shifted/proc/self
0:0 -> 196608:196608 .. /tmp/alpine-shifted/proc/self/uid_map
0:0 -> 196608:196608 .. /tmp/alpine-shifted/usr
0:0 -> 196608:196608 .. /tmp/alpine-shifted/usr/lib
0:0 -> 196608:196608 .. /tmp/alpine-shifted/usr/lib/libip4tc.so.2
0:0 -> 196608:196608 .. /tmp/alpine-shifted/usr/lib/libreadline.so.8
0:0 -> 196608:196608 .. /tmp/alpine-shifted/usr/lib/engines-1.1
0:0 -> 196608:196608 .. /tmp/alpine-shifted/usr/lib/engines-1.1/afalg.so
...</code></pre>
<p>That’s a bad example because I could have passed <code>-s 0x30000</code> to <code>shift-mount</code>
and it would have done what <code>shift-own</code> did. But you get the idea…</p>
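<p>For the curious, the ownership shifting can be sketched in Python like this. It is not the source of <code>shift-own</code>; in particular, keeping an id’s offset within its range with a modulo is my assumption here, and the function names are made up.</p>

```python
import os


def shifted(old_id, shift, id_range=0x10000):
    """Keep the id's offset within its range and move it to a new base."""
    return shift + old_id % id_range


def tree(root):
    """Yield root and then every path under it, without following symlinks."""
    yield root
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            yield os.path.join(dirpath, name)


def plan_shift(root, shift, id_range=0x10000):
    """Yield (path, old_uid, old_gid, new_uid, new_gid) without changing anything."""
    for path in tree(root):
        st = os.lstat(path)  # lstat: describe symlinks rather than their targets
        yield (path, st.st_uid, st.st_gid,
               shifted(st.st_uid, shift, id_range),
               shifted(st.st_gid, shift, id_range))


def apply_shift(root, shift, id_range=0x10000):
    """Change ownership for real; this needs privileges (typically root)."""
    for path, _, _, uid, gid in plan_shift(root, shift, id_range):
        os.lchown(path, uid, gid)


print(shifted(0, 0x30000))  # root moved into the reservation starting at 0x30000
```

<p>Splitting the dry run from the part that calls <code>os.lchown</code> makes it possible to look at what would change before running anything as root.</p>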
<img alt="another pizza with basil & mozzarella" src="20200108_0034.jpg" />
</section></section><p>Pizza & Rust.</p>
2020-01-21T12:00:00-08:00https://froghat.ca/2020/10/markup-rantMarkup Rant2020-10-10T12:00:00-07:00sqwishy<aside class="aside">
<p><em>tldr</em> I’m frustrated by the technology I’m using. I complain about some of it without offering a path for improvement.</p>
<p>Like me, the subjects I cover are small and insignificant; it’s not poverty, education, climate, or geopolitics. But I’ve been feeling crummy lately and the only thing I can be fucked to do is complain and be petty so … here we are.</p>
</aside>
<section id="what-s-a-markup">
<h2>What’s a Markup?<a class="self-link" title="link to this section" href="#what-s-a-markup"></a></h2>
<p>Markup languages let you annotate text files with other text that kind of looks like something a normal person can read or would write. But those annotations have special meaning to programs when they read the document and turn it into another format like HTML.</p>
<p>Each of the following two examples of markup might produce the heading (and the first few words) for the next section of this post.</p>
<ul>
<li><p>In reStructuredText:</p>
<pre class="code rest reflow literal-block"><code><span class="gh">Markup</span>
<span class="gh">------</span>
<span class="w">
</span>Years ago, blah blah blah ...</code></pre>
</li>
<li><p>In Markdown:</p>
<pre class="code md reflow literal-block"><code><span class="gu">## Markup</span>
<span class="w">
</span>Years ago, blah blah blah ...</code></pre>
</li>
</ul>
</section><section id="who-cares-about-markup">
<h2>Who cares about markup?<a class="self-link" title="link to this section" href="#who-cares-about-markup"></a></h2>
<p>Years ago, I started familiarizing myself with <abbr title="reStructuredText">reST</abbr> and have been using it ever since. I picked that over Markdown for a few reasons including:</p>
<ul>
<li><p>Markdown’s lack of support for basic markup elements, like tables or footnotes.</p>
<p>Here’s some reStructuredText:</p>
<pre class="code rest reflow literal-block"><code>Wow. Look at this auto-numbered footnote. <span class="s">[#]_</span><span class="w">
</span><span class="p">..</span> <span class="nt">[#]</span> This is the footnote text!</code></pre>
</li>
<li><p>reST’s comparatively richer set of features for basic content, like using a hyphen (<code>-</code>) to denote unordered list items, or anonymous hyperlink targets.</p>
<pre class="code rest reflow literal-block"><code>Read what Wikipedia has to say about <span class="s">`markup languages`__</span>.<span class="w">
</span>__ https://wikipedia.org/wiki/Lightweight_markup_language</code></pre>
</li>
</ul>
<p>Since then, Markdown implementations have extended what they support, narrowing the gap between their markup features and those of reStructuredText implementations.</p>
<p>However, the recourse for <em>extending</em> the markup is different in somewhat interesting ways.</p>
</section><section id="more-markup">
<h2>More Markup<a class="self-link" title="link to this section" href="#more-markup"></a></h2>
<p>As far as I can tell, if you want something in your document that isn’t supported by Markdown, your option is to write inline HTML (or use a preprocessor that generates the inline HTML).</p>
<p>This is how I’ve seen people write markup for <sub>subscript</sub>, <sup>superscript</sup>, <code>&lt;abbr&gt;</code>, <code>&lt;kbd&gt;</code>, <code>&lt;figure&gt;</code>, or <code>&lt;picture&gt;</code> elements in HTML that don’t seem to have their own markup in any Markdown implementations. This is a problem for me for a few reasons.</p>
<ul>
<li><p>Inline HTML is, in my opinion, ugly to write and painful to read.</p>
<p>I feel obligated to state that this is <em>just</em> my preference. But I’d bet most human people would find explicit <code>&lt;p align="center"&gt;</code> and rows of <code>&lt;/br&gt;</code> tags, designed to affect some sort of layout or whitespace that only makes sense when rendered as HTML on GitHub, to be a detriment to the readability of the file as text.<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>Receiving emails with only HTML attachments (or very poorly generated plain text attachments) is a similar case where HTML is an obnoxious markup for text.</p>
</aside>
</aside>
</aside>
<p>Writability matters <em>somewhat</em> less if you use a preprocessor. But that’s like saying that Markdown doesn’t have this problem if you don’t write Markdown and just generate it from some other format.</p>
</li>
<li><p>Inline HTML can limit our ability to render our document to other file formats.</p>
<p>I <em>can</em> understand most people only caring about rendering to HTML. For me, I think it’d be really cool if I could serve my website over <a class="reference external" href="https://gemini.circumlunar.space/">Gemini</a> in the <code>text/gemini</code> format. This is possible by mapping a markup language’s document structure to <code>text/gemini</code> directly and generating that alongside HTML. But, to support inline HTML in Markdown, you’d need to convert HTML as well.</p>
</li>
<li><p>Inline HTML can reduce portability to other websites.</p>
<p>Including specific HTML markup in a Markdown file often assumes stuff about how the page is arranged, particularly for sectioning or heading content. (This is something I’ll bring up again later – because I am petty and just can’t let it go.)</p>
</li>
</ul>
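<p>As a sketch of what mapping document structure to <code>text/gemini</code> could look like, here is some Python that renders a tiny block model. The block model is a made-up simplification and not the docutils document tree. Note what happens with a block kind that has no mapping, which is the position inline HTML puts you in.</p>

```python
FENCE = "`" * 3  # text/gemini toggles preformatted mode with three backticks


def to_gemtext(blocks):
    """Render a list of (kind, payload) blocks as text/gemini lines."""
    lines = []
    for kind, payload in blocks:
        if kind == "heading":
            level, text = payload
            lines.append("#" * min(level, 3) + " " + text)  # only three levels exist
        elif kind == "paragraph":
            lines.append(" ".join(payload.split()))  # one long line; clients wrap it
        elif kind == "link":
            url, label = payload
            lines.append(f"=> {url} {label}")
        elif kind == "list":
            lines.extend(f"* {item}" for item in payload)
        elif kind == "code":
            lines.extend([FENCE, *payload.splitlines(), FENCE])
        else:
            raise ValueError(f"no text/gemini mapping for {kind!r}")
        lines.append("")  # blank line between blocks
    return "\n".join(lines).rstrip("\n") + "\n"


blocks = [
    ("heading", (2, "Markup")),
    ("paragraph", "Years ago,\nblah blah blah ..."),
    ("link", ("https://commonmark.org/", "CommonMark")),
]
print(to_gemtext(blocks), end="")
```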
<p>Docutils provides a Python API to extend the markup while still using reStructuredText syntax.</p>
<p>For this website, I’ve extended it so that <code>:hl-purple:`purple highlight`</code> & <code>:hl-yellow:`yellow highlight`</code> can be used to produce <span class="hl-purple">purple highlight</span> & <span class="hl-yellow">yellow highlight</span> respectively. In HTML, those elements are just <code>&lt;span&gt;</code> elements with a special class that receives a highlight from my stylesheet. In this case, it does exactly what I want it to do and I don’t have to shit up my file format with some out-of-band syntax.</p>
</section><section id="however">
<h2>However…<a class="self-link" title="link to this section" href="#however"></a></h2>
<p>… reStructuredText has pain points.</p>
<section id="poor-support-for-nested-inline-markup">
<h3>Poor support for nested inline markup.<a class="self-link" title="link to this section" href="#poor-support-for-nested-inline-markup"></a></h3>
<p>You can’t nest inline content in order to combine any of: hyperlink, emphasis, bold, strikethrough, pre-formatted/code, <code>&lt;small&gt;</code>, superscript, subscript, etc.</p>
<p>You can hack it by defining new inline roles that specify a combination of these things but that’s annoying.</p>
</section><section id="extending-block-stuff-can-be-difficult">
<h3>Extending block stuff can be difficult.<a class="self-link" title="link to this section" href="#extending-block-stuff-can-be-difficult"></a></h3>
<p>To support swapping diagrams between light and dark mode, I have two custom directives for <code>&lt;picture&gt;</code> and <code>&lt;source&gt;</code> HTML elements. It’s about three dozen lines of Python to define the directives and render them to HTML. The source for my modifications <a class="reference external" href="https://git.sr.ht/~sqwishy/froghat.ca/tree/public/pelext/rst.py">is on sr.ht</a>.</p>
<p>This lets me write:</p>
<pre class="code rst full-width literal-block"><code><span class="p">..</span> <span class="ow">picture</span><span class="p">::</span> foo.svg<span class="w">
</span> <span class="nc">:alt:</span> This is some alt text!<span class="w">
</span><span class="p"> ..</span> <span class="ow">source</span><span class="p">::</span><span class="w">
</span> <span class="nc">:srcset:</span> foo-dark-mode.svg<span class="w">
</span> <span class="nc">:media:</span> (prefers-color-scheme: dark)</code></pre>
<p>It’s not amazing; but it’s not terrible. If I’m being pedantic, this probably should use light-mode/dark-mode semantics and not be coupled to CSS/media queries.</p>
<p>One thing I appreciate in Markdown is that block content need not be indented, like when using the markup for pre-formatted text/code fence blocks.</p>
<pre class="code md full-width reflow literal-block"><code>Top-level content.<span class="w">
</span><span class="sb">
```
Block content without indentation.
```</span></code></pre>
<p>Much of the time, indentation is nice to have. But, recently, I wanted a <code>&lt;details&gt;</code> element but, for reasons, I didn’t want to indent all the content contained therein. In reStructuredText I’m not sure there’s a way to establish block content without indentation.</p>
<p>Using some kind of delimiter to start and end blocks like backticks or tildes is a neat way to do it. I think <a class="reference external" href="https://en.wikipedia.org/wiki/Here_document">heredocs</a> are pretty cool and something like that would maybe be pretty
text friendly.</p>
</section><section id="there-are-not-many-great-implementations">
<h3>There are not many great implementations.<a class="self-link" title="link to this section" href="#there-are-not-many-great-implementations"></a></h3>
<p>Notable implementations are:</p>
<ol class="arabic simple">
<li><p><a class="reference external" href="https://docutils.sourceforge.io/">docutils</a>; Python</p></li>
<li><p><a class="reference external" href="https://pandoc.org/">pandoc</a>; Haskell</p></li>
</ol>
<p>Many of the other implementations I’ve found often only support a subset of reStructuredText.<a class="footnote-reference superscript" href="#footnote-2" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-2" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-2">2</a><span class="fn-bracket">]</span></span>
<p>To be fair, Markdown implementations have varying degrees of markup support as well. What markup is even being referred to when someone says “Markdown” is not always entirely clear. Last I heard, the <a class="reference external" href="https://commonmark.org/">CommonMark</a> specification was supposed to solve this but it doesn’t even specify strike-through, tables, or footnotes. So now things will just say that they implement CommonMark as well as five or six other things so as to resemble GitHub Flavored Markdown.</p>
</aside>
</aside>
</aside>
<p>I assume that this is evidence that reStructuredText is more difficult to implement. It probably also has to do with it being a little less popular. On the other hand, the implementation language shouldn’t matter much of the time.</p>
<p>Typically, there is no need to extend the document model<a class="footnote-reference superscript" href="#footnote-3" id="footnote-reference-3" role="doc-noteref"><span class="fn-bracket">[</span>3<span class="fn-bracket">]</span></a> in the way that I have. All you care about is giving your document to some program and getting some HTML out.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-3" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-3">3</a><span class="fn-bracket">]</span></span>
<p>Even if you do, I believe there are ways of doing this in your document using reStructuredText’s syntax, instead of through some API in the particular implementation you are using. And, of course, there is a <code>raw</code> directive that allows you to write plain HTML, providing a similar escape hatch to Markdown.</p>
</aside>
</aside>
</aside>
<p><a class="reference external" href="https://www.sphinx-doc.org/en/master/">Sphinx</a> is a notable exception where extensions exist for the purpose of improving the tool’s function as a documentation generator. With my website, extensions exist to do with my specific styling and layout. These specifics <em>should</em> apply to reStructuredText documents it handles generally.</p>
<p>This is the most/only important point of this post.</p>
<p>For example, the HTML writer uses <code><section></code> elements to enclose sections of the document near headers. If my documents were rendered on some other website that didn’t use/care for <code><section></code> elements, then my document shouldn’t produce them. Likewise, if I were to include a document written by someone else or for some other thing, I’d want it to be rendered with <code><section></code>. <em>Having the document itself specify those structural HTML elements reduces portability and would be a mistake.</em></p>
<p>Not to just shit on the website that I clearly appreciate enough to copy to make froghat.ca, but <a class="reference external" href="https://raw.githubusercontent.com/Eiriksmal/lawler-dot-io/f50616bbf2c6a2126f0a8f41ca6450d3729f92c5/content/2020/on-gitprime.md">lawler.io does this in their Markdown documents</a>. They specify footnotes and sections (and a couple of other things) as HTML in their file. Not only does this reduce portability to document formats other than HTML, but it reduces portability to other websites that might structure or style HTML content differently.</p>
<p>Having a proper extensible document model is a huge win for reStructuredText for these reasons. It’s a big part of why I continue to use it in spite of the quirks I’ve mentioned so far.</p>
</section></section><section id="commonmark-more-like-commonfart-ha-ha-hhha-ahha">
<h2>CommonMark? more like CommonFart ha ha hhha ahha<a class="self-link" title="link to this section" href="#commonmark-more-like-commonfart-ha-ha-hhha-ahha"></a></h2>
<p>At the time of writing, <a class="reference external" href="https://spec.commonmark.org/0.29/">CommonMark’s most recent specification is 0.29</a>.</p>
<p>I tried to look it over, but some of the links on the page didn’t work. It made me really upset. Now, I’m here, writing this, hoping the catharsis will allow me to move on with my life.</p>
<p>Briefly, here’s how links to elements on the same document are <em>supposed</em> to work.</p>
<pre class="code html full-width reflow literal-block"><code><span class="p"><</span><span class="nt">h2</span> <span class="na">id</span><span class="o">=</span><span class="s">"interesting-topic"</span><span class="p">></span>A Heading for an Interesting topic<span class="p"></</span><span class="nt">h2</span><span class="p">></span></code></pre>
<p>The element above is a second-level heading (that’s what <code>h2</code> means) with an <code>id</code> attribute that is <em>unique</em> in that document.</p>
<p>Suppose that heading is on the page in your browser at <code>froghat.ca/blag/markup-rant/</code>. If you then visit <code>froghat.ca/blag/markup-rant/#interesting-topic</code>, your browser will scroll to that heading for you. It’s cool because you can link to specific parts of the page that way.</p>
<p>You can put a link on your page that does this so that when people click on the link it will go to the heading, or whatever element has the corresponding <code>id</code>. This is an example of the HTML for such a link:</p>
<pre class="code html full-width reflow literal-block"><code><span class="p"><</span><span class="nt">a</span> <span class="na">href</span><span class="o">=</span><span class="s">"#interesting-topic"</span><span class="p">></span>Click here to read about an interesting topic!<span class="p"></</span><span class="nt">a</span><span class="p">></span></code></pre>
<p>Here’s <a class="reference internal" href="#commonmark-more-like-commonfart-ha-ha-hhha-ahha">an actual link that goes to the beginning of this section</a>.</p>
<section id="wooooooow-you-can-link-to-things-ur-so-smart-gr8-job-dood">
<h3>wooOOoOow you can link to things, ur so smart, gr8 job dood<a class="self-link" title="link to this section" href="#wooooooow-you-can-link-to-things-ur-so-smart-gr8-job-dood"></a></h3>
<p>Yah okay, internal hyperlinks are not super complicated.</p>
<p>So why does the page for CommonMark’s latest specification (<a class="reference external" href="https://archive.is/lARyB">archive.is link</a>) contain the following paragraph:</p>
<pre class="code html full-width reflow literal-block"><code><span class="p"><</span><span class="nt">p</span><span class="p">></span>We can divide blocks into two types:
<span class="p"><</span><span class="nt">a</span> <span class="na">id</span><span class="o">=</span><span class="s">"container-blocks"</span> <span class="na">href</span><span class="o">=</span><span class="s">"#container-blocks"</span> <span class="na">class</span><span class="o">=</span><span class="s">"definition"</span><span class="p">></span>container blocks<span class="p"></</span><span class="nt">a</span><span class="p">></span>,
which can contain other blocks, and <span class="p"><</span><span class="nt">a</span> <span class="na">id</span><span class="o">=</span><span class="s">"leaf-blocks"</span> <span class="na">href</span><span class="o">=</span><span class="s">"#leaf-blocks"</span> <span class="na">class</span><span class="o">=</span><span class="s">"definition"</span><span class="p">></span>leaf blocks<span class="p"></</span><span class="nt">a</span><span class="p">></span>,
which cannot.<span class="p"></</span><span class="nt">p</span><span class="p">></span></code></pre>
<p>There are two links, and each one links to … itself, because each <code><a></code> element has the <code>id</code> that <em>it</em> targets.</p>
<p>So when you click on those links in your web browser, you <em>expect</em> to go to the chapter talking about the topic. Instead, you just go to the paragraph that you’re <em>already</em> on.</p>
<p>There’s <em>also</em> a table of contents that links to the two chapters mentioned above, but now they target these links instead of the actual chapters.</p>
<p>Well what does the chapter look like?</p>
<pre class="code html full-width reflow literal-block"><code><span class="p"><</span><span class="nt">h1</span> <span class="na">id</span><span class="o">=</span><span class="s">"container-blocks"</span> <span class="na">href</span><span class="o">=</span><span class="s">"#container-blocks"</span> <span class="na">class</span><span class="o">=</span><span class="s">"definition"</span><span class="p">></span>
<span class="p"><</span><span class="nt">span</span> <span class="na">class</span><span class="o">=</span><span class="s">"number"</span><span class="p">></span>5<span class="p"></</span><span class="nt">span</span><span class="p">></span>Container blocks
<span class="p"></</span><span class="nt">h1</span><span class="p">></span></code></pre>
<p>Okay, well at least the <code>id</code> attribute is set. But it’s supposed to be unique. If the link didn’t have this <em>same</em> <code>id</code>, it would probably navigate to the chapter like it’s supposed to. Also, this heading has an <code>href</code> attribute for some reason.</p>
<p>There are a lot of great things about this.</p>
<ul>
<li><p>Links are not complicated to get right. You do the <code>href</code> and the <code>id</code>.</p>
<p>I don’t have a big brain for pointing this out. I’m just stroking my ego by drawing it out. (Probably from some deeply held self-resentment that externalizes as being overly critical of other people’s stuff.)</p>
</li>
<li><p>This is clearly <em>wrong</em> through static analysis. The <code>id</code> is supposed to be unique. And it’s not here.</p>
<p>Run this on <a class="reference external" href="https://validator.w3.org/nu/?doc=https%3A%2F%2Fspec.commonmark.org%2F0.29%2F">validator.w3.org</a> (or use this <a class="reference external" href="https://archive.is/cpwJr">archive.is link</a>) and it will tell you as much.</p>
</li>
<li><p>This is clearly <em>broken</em> if you ever try to click on the links.</p></li>
</ul>
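<p>To make the static analysis point concrete, here is a rough sketch of a duplicate <code>id</code> check. It is deliberately naive: it only scans for the literal text <code> id="..."</code> and does not parse HTML the way a real validator does. The function name <code>duplicate_ids</code> and the sample markup are made up for illustration.</p>

```rust
use std::collections::BTreeMap;

/// Returns every `id` value that appears more than once, in sorted order.
/// Naive on purpose: it looks for the literal text ` id="..."` and does
/// not really parse HTML.
fn duplicate_ids(html: &str) -> Vec<String> {
    let needle = " id=\"";
    let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
    let mut rest = html;
    while let Some(start) = rest.find(needle) {
        let after = &rest[start + needle.len()..];
        match after.find('"') {
            Some(end) => {
                // Count this id value, then keep scanning after its closing quote.
                *counts.entry(&after[..end]).or_insert(0) += 1;
                rest = &after[end + 1..];
            }
            // An unterminated attribute; give up rather than guess.
            None => break,
        }
    }
    counts
        .into_iter()
        .filter(|(_, n)| *n > 1)
        .map(|(id, _)| id.to_string())
        .collect()
}

/// Sample input: one link and one heading sharing an id, plus a unique heading.
fn example() -> Vec<String> {
    let html = "<a id=\"container-blocks\" href=\"#container-blocks\">container blocks</a>\n\
                <h1 id=\"container-blocks\">Container blocks</h1>\n\
                <h1 id=\"leaf-blocks\">Leaf blocks</h1>";
    duplicate_ids(html)
}
```

<p>Given a link and a heading that share <code>container-blocks</code>, <code>example()</code> reports that one value and leaves the unique <code>leaf-blocks</code> alone.</p>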
</section><section id="bonus-meme-alert">
<h3>Bonus Meme Alert<a class="self-link" title="link to this section" href="#bonus-meme-alert"></a></h3>
<p>The <a class="reference external" href="https://archive.is/UM9wr">GitHub Flavored Markdown Spec</a>, which appears derived from the CommonMark spec, has this problem as well.</p>
</section><section id="anyway">
<h3>Anyway …<a class="self-link" title="link to this section" href="#anyway"></a></h3>
<p>I tried to understand Markdown because people suggest that it’s easy and simple and good. It’s not.</p>
<p>There are a bunch of different things that people mean when they talk about Markdown because of differences in what markup is supported across implementations. You want tables? You can’t use CommonMark, try GitHub Flavored Markdown. Want footnotes? See Markdown Extra/MultiMarkdown (or something) for that.</p>
<p>Or, you can have <em>whatever you want</em> by writing inline HTML (i.e. not Markdown) because <em>fuck you</em>.</p>
<p>Docutils and reStructuredText are complicated. I’m <em>not</em> overjoyed with them. The specification is large and, unfortunately, there are a lot of ways to use it that end up looking stupid. I <em>wish</em> this wasn’t the case. I wish it was straightforward and had more implementations and it was fun and it made you feel good when you used it.</p>
<p>In spite of that, it delivers and it has a specification that isn’t a stupid fucking busted document with broken links and a broken table of contents and that gets 50 errors and 7 warnings from the W3C Markup Validator. You know what would be an improvement over that?</p>
<img alt="A screenshot of the W3C Markup Validator for the docutils' reStructuredText documentation saying "This document was successfully checked as XHTML 1.0 Transitional! Result: Passed."" src="docutils-w3c-valid.png" />
<p>There are things about Markdown that I appreciate or prefer. I can see it making a lot of sense for something like email or comments on reddit. Where LaTeX, for example, wouldn’t be most people’s first choice. And maybe that’s when Markdown peaked – when it <em>just</em> did those things. Now that it’s been extended, the tooling & ecosystem is worse.</p>
<p><a class="reference external" href="https://en.wikipedia.org/wiki/JSON">JSON</a> is a pretty bad file format. It’s hugely popular even though it lacks some basic types like dates or datetimes, has some dumb rules about commas, and looks stupid. For configuration files, where comments can be quite important, JSON gives you no help. <a class="reference external" href="https://en.wikipedia.org/wiki/YAML">YAML</a> is generally considered to be unnecessarily complicated and gets a lot of hate. So now <a class="reference external" href="https://en.wikipedia.org/wiki/TOML">TOML</a> has gotten traction for filling a niche between the two (right on top of where INI used to be).<a class="footnote-reference superscript" href="#footnote-4" id="footnote-reference-4" role="doc-noteref"><span class="fn-bracket">[</span>4<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-4" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-4">4</a><span class="fn-bracket">]</span></span>
<p>And, ultimately, what everyone really ever wanted was <a class="reference external" href="https://en.wikipedia.org/wiki/Extensible_Data_Notation">EDN</a>.</p>
</aside>
</aside>
</aside>
<p>Markdown and reStructuredText feel like JSON and YAML respectively. There are useful things in YAML, like <a class="reference external" href="https://pyyaml.org/wiki/PyYAMLDocumentation#constructors-representers-resolvers">tags</a>, that aren’t replicated in any of the other popular options. Likewise, reStructuredText has some useful concepts that make it my preference a lot of the time. But it’s complicated and everyone hates it.</p>
<p>I just want Rich Hickey to come along and invent something nice and cozy in the middle that works for everything everywhere always so nobody will ever be sad ever again.</p>
</section></section><p>Complaining about Markdown & reStructuredText mostly I guess.</p>
2020-10-10T12:00:00-07:00https://froghat.ca/2020/03/impetuousImpetuous2020-03-12T12:00:00-07:00sqwishy<p>In keeping with the latest trends of rewriting perfectly docile & unsuspecting software in
<a class="reference external" href="https://www.rust-lang.org/">Rust</a>, I have been working on <a class="reference external" href="https://gitlab.com/sqwishy/impetuous-rs/">a Rust implementation</a> of some <a class="reference external" href="https://gitlab.com/sqwishy/impetuous/">old time tracking software</a>
I wrote in Python some years ago named Impetuous.</p>
<p>The short summary is that there are a lot of things wrong with how the
rewrite ended up, but it can do some pretty cool things and I learned a lot.</p>
<p>A lot of the badness seems to come from where
I abused macros because I couldn’t figure out how to use Rust traits to do the
thing I wanted;
and there are a few places where stuff makes no sense
because an effort to follow <em>paradigms</em> like <abbr title="keep it simple, stupid">KISS</abbr>/<abbr title="you aren't gonna need it">YAGNI</abbr><a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a> resulted in too much
coupling/things in the system having to care about everything else.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>Not to say that simplicity is bad or whatever. But when I have conversations or hear
people make normative statements that involve simplicity as a predicate, the
rhetoric is about the virtue of simplicity, while the actual
question is whether something is or isn’t simple.<a class="footnote-reference superscript" href="#footnote-2" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a></p>
<p>My upset is more to do with the design-by-principle principle wherein decisions are
made by <strong>vaguely</strong> applying abstracted & generalized value statements with
catchy acronyms like <abbr title="don't repeat yourself">DRY</abbr> and <abbr title="basically just separation of concerns (SoC) expressed in five different ways">SOLID</abbr>.</p>
<p>These concepts are necessarily low-resolution and they leave a vacuum of detail
that is then filled in by whoever gets to decide what simplicity means (whoever has
the status).</p>
</aside>
</aside>
</aside>
<aside class="aside">
<aside class="footnote superscript" id="footnote-2" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-2">2</a><span class="fn-bracket">]</span></span>
<p>And <em>maybe</em> simplicity isn’t always <em>that</em> good either. Like when you learn physics
and you model objects as having no volume, no mass, in a frictionless gas.
Maybe that’s better for learning rather than application? I don’t really know.</p>
</aside>
</aside>
</aside>
<p>In all of this, one of the most responsible things I can say about Rust is that it seems
to be relatively good at managing the complexity in how your program works at the
expense of the complexity of how Rust works.</p>
<p>If you are writing something that must be thread-safe, is it easier to reason about
threading <em>or</em> about Rust and its borrow checker and how the borrow checker works in
different places like traits or closures or any other feature in Rust?</p>
<p>I could spend a lot of time writing stuff that nobody will read about Rust; but, what I
want to cover here are the goals I had in mind to achieve and how that influenced
decisions that led to why Impetuous is the way that it is.</p>
<section id="im-doing">
<h2><code>im doing</code><a class="self-link" title="link to this section" href="#im-doing"></a></h2>
<p>Briefly, Impetuous was written to help me track my time spent at work and post that time
to Jira and Freshdesk.</p>
<pre class="code full-width literal-block"><code>$ im doing looking for my red stapler
7:30:00 / 0s / --:--:-- looking for my red stapler
$ im doing plotting against whoever took my red stapler
8:15:00 / 0s / --:--:-- plotting against whoever took my red stapler
7:30:00 / 45m 0s / 8:15:00 looking for my red stapler
$ im doing
8:15:00 / 15m 0s / 8:30:00 plotting against whoever took my red stapler</code></pre>
<p>The Python implementation had two tables: one for these time
entries, another for the results of posting time to Jira or Freshdesk.
It’s pretty basic stuff.</p>
</section><section id="over-engineering-pretty-basic-stuff">
<h2>Over-engineering pretty basic stuff<a class="self-link" title="link to this section" href="#over-engineering-pretty-basic-stuff"></a></h2>
<p>When writing the first version, there was an attempt to model interactions
between a user agent and the data store through <strong>data structures</strong> that described a way
to <em>gather</em> and <em>change</em> or <em>present</em> information in the database.</p>
<p>The hope was that composing those data structures would be an efficient & effective way to
describe how to interact with Impetuous in terms of <abbr title="Create Read Update Delete">CRUD</abbr>.</p>
<p>Using some context, like access control, these structures would be mapped into one or
more SQL queries and run against a <a class="reference external" href="https://www.sqlite.org/index.html">SQLite</a> database, ultimately returning a document
containing objects gathered in the shape described by the presentation structure.</p>
<p>In Python, I was spoiled by <a class="reference external" href="https://www.sqlalchemy.org/">SQLAlchemy</a>, which I found to be a very nice library for
querying a SQLite database. In Rust, the only mature
option I found was <a class="reference external" href="http://diesel.rs/">Diesel</a>, which boasts compile-time correctness but gives up the ability
to generate queries dynamically at runtime. When I evaluated it, the Diesel API was
so strict that every query was required to know at compile-time exactly the number of
columns it would return. Conditionally selecting
columns or joining other tables based on some <em>gather</em> or <em>shape</em> in a request would
require that Impetuous generate every permutation of a request at compile-time.</p>
</section><section id="over-engineering-query-building">
<h2>Over-engineering query building<a class="self-link" title="link to this section" href="#over-engineering-query-building"></a></h2>
<p>I ended up writing my own thing for building SQL query strings and used a very nice
driver called <a class="reference external" href="https://github.com/jgallagher/rusqlite">rusqlite</a>. This went through about two and a half
rewrites. It looks something like:</p>
<pre class="code rust full-width literal-block"><code><span class="c1">// Honestly writing "VALUES (?, ?, ?, ?, ?)" is more readable than this...
</span><span class="kd">let</span><span class="w"> </span><span class="n">expr</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">sql</span>::<span class="n">Values</span>::<span class="n">one</span><span class="p">(</span><span class="n">sql</span>::<span class="n">ExprTuple</span>::<span class="n">from</span><span class="p">((</span><span class="w">
</span><span class="n">sql</span>::<span class="n">bind</span>::<span class="o"><</span><span class="p">()</span><span class="o">></span><span class="p">(),</span><span class="w">
</span><span class="n">sql</span>::<span class="n">bind</span>::<span class="o"><</span><span class="p">()</span><span class="o">></span><span class="p">(),</span><span class="w">
</span><span class="n">sql</span>::<span class="n">bind</span>::<span class="o"><</span><span class="p">()</span><span class="o">></span><span class="p">(),</span><span class="w">
</span><span class="n">sql</span>::<span class="n">bind</span>::<span class="o"><</span><span class="p">()</span><span class="o">></span><span class="p">(),</span><span class="w">
</span><span class="n">sql</span>::<span class="n">bind</span>::<span class="o"><</span><span class="p">()</span><span class="o">></span><span class="p">(),</span><span class="w">
</span><span class="p">)));</span><span class="w">
</span><span class="n">sql</span>::<span class="n">Insert</span>::<span class="n">default_tuples</span><span class="p">()</span><span class="w">
</span><span class="p">.</span><span class="n">table</span><span class="p">(</span><span class="o">&</span><span class="n">postings</span>::<span class="n">table</span><span class="p">)</span><span class="w">
</span><span class="p">.</span><span class="n">columns</span><span class="p">(</span><span class="n">sql</span>::<span class="n">ident</span><span class="p">(</span><span class="s">"doing_tag"</span><span class="p">))</span><span class="w">
</span><span class="p">.</span><span class="n">columns</span><span class="p">(</span><span class="o">&</span><span class="n">postings</span>::<span class="n">source</span><span class="p">)</span><span class="w">
</span><span class="p">.</span><span class="n">columns</span><span class="p">(</span><span class="o">&</span><span class="n">postings</span>::<span class="n">entity</span><span class="p">)</span><span class="w">
</span><span class="p">.</span><span class="n">columns</span><span class="p">(</span><span class="o">&</span><span class="n">postings</span>::<span class="n">status</span><span class="p">)</span><span class="w">
</span><span class="p">.</span><span class="n">columns</span><span class="p">(</span><span class="o">&</span><span class="n">postings</span>::<span class="n">doing</span><span class="p">)</span><span class="w">
</span><span class="p">.</span><span class="n">expr</span><span class="p">(</span><span class="n">expr</span><span class="p">)</span><span class="w">
</span><span class="p">.</span><span class="n">upsert</span><span class="p">(</span><span class="n">sql</span>::<span class="n">lit</span>::<span class="o"><</span><span class="p">()</span><span class="o">></span><span class="p">(</span><span class="s">"ON CONFLICT DO NOTHING"</span><span class="p">))</span><span class="w">
</span><span class="p">.</span><span class="n">as_sql</span><span class="p">();</span><span class="w">
</span><span class="c1">// INSERT INTO "postings" ("doing_tag", "source", "entity", "status", "doing")
// VALUES (?, ?, ?, ?, ?) ON CONFLICT DO NOTHING</span></code></pre>
<p>Many of the structures, including <code>Select</code>, <code>Insert</code>, <code>Update</code>, and <code>Delete</code>, use
generics for almost every part of the statement, allowing users quite a bit of
flexibility around what types they’re building queries out of.
This also gives some control to users to determine the compile/run-time strictness of
values permitted.<a class="footnote-reference superscript" href="#footnote-3" id="footnote-reference-3" role="doc-noteref"><span class="fn-bracket">[</span>3<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-3" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-3">3</a><span class="fn-bracket">]</span></span>
<p>In one case, I used the generics to require result columns to be a column
on an aliased table where column and table identifiers were defined at
compile time. Using this type forbade referencing a table that was not aliased.</p>
</aside>
</aside>
</aside>
<p>In a statement like <code>SELECT</code>, you may have multiples of things like result columns,
<code>FROM</code> tables, or <code>JOIN</code> clauses. We build up the statement by accumulating values for
these parts into the statement structure.</p>
<p>If, for each collection, whether it’s result columns or joins or whatever,
we accumulate elements into a vector, then each element must be of the same type.
Or we can accumulate these into a tuple of differently typed elements; <code>(T1, T2,
T3, ...)</code>. But every time we grow the tuple, it becomes a new type. In Rust, this is
done through a <em>move</em>, allowing us to create a new type by consuming our ownership of
the old one.</p>
<p>Since each has their own trade-offs, I supported both and use one or the other
depending on the circumstances.</p>
<p>In the example earlier, the first call to <code>columns</code> <em>moves</em> the <code>Insert</code> in order to change
its columns type from the unit type <code>()</code> to a single element tuple of an owned
identifier <code>(sql::Ident,)</code>. The second call <em>moves</em> it again to change the type to a
<code>(sql::Ident, &sql::Ident)</code>. The owned and borrowed identifiers are different types, so
we couldn’t do this with a vector – unless we borrowed the first identifier, or copied the
later ones, or used a clone-on-write or some kind of union thing with a variant for each
type.</p>
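<p>A cut-down sketch of the tuple approach might look like the following. Every name here (<code>Column</code>, <code>Columns</code>, <code>Push</code>, <code>Insert::column</code>) is invented for illustration and is not the actual Impetuous API; it only covers tuples of up to two elements, and I assume a real version would generate more sizes with a macro.</p>

```rust
// Hypothetical names; a sketch of the idea, not Impetuous' actual API.

/// Anything that can be written out as a column name.
trait Column {
    fn name(&self) -> String;
}

/// An owned identifier.
struct Ident(String);

impl Column for Ident {
    fn name(&self) -> String {
        self.0.clone()
    }
}

/// A borrowed identifier is a different type, but still a column.
impl<'a> Column for &'a Ident {
    fn name(&self) -> String {
        self.0.clone()
    }
}

/// A collection of columns, implemented for tuples of a few sizes.
trait Columns {
    fn names(&self) -> Vec<String>;
}

impl<A: Column> Columns for (A,) {
    fn names(&self) -> Vec<String> {
        vec![self.0.name()]
    }
}

impl<A: Column, B: Column> Columns for (A, B) {
    fn names(&self) -> Vec<String> {
        vec![self.0.name(), self.1.name()]
    }
}

/// Appending to a tuple yields a new, larger tuple type.
trait Push<T> {
    type Output;
    fn push(self, item: T) -> Self::Output;
}

impl<T> Push<T> for () {
    type Output = (T,);
    fn push(self, item: T) -> (T,) {
        (item,)
    }
}

impl<A, T> Push<T> for (A,) {
    type Output = (A, T);
    fn push(self, item: T) -> (A, T) {
        (self.0, item)
    }
}

struct Insert<C> {
    table: &'static str,
    columns: C,
}

impl Insert<()> {
    fn new(table: &'static str) -> Self {
        Insert { table, columns: () }
    }
}

impl<C> Insert<C> {
    /// Takes `self` by value (a move) and returns an `Insert` of a new type.
    fn column<T>(self, item: T) -> Insert<<C as Push<T>>::Output>
    where
        C: Push<T>,
    {
        Insert { table: self.table, columns: self.columns.push(item) }
    }
}

impl<C: Columns> Insert<C> {
    fn as_sql(&self) -> String {
        let names: Vec<String> =
            self.columns.names().iter().map(|n| format!("\"{}\"", n)).collect();
        let binds = vec!["?"; names.len()].join(", ");
        format!("INSERT INTO \"{}\" ({}) VALUES ({})", self.table, names.join(", "), binds)
    }
}

fn example() -> String {
    let source = Ident("source".to_string());
    let sql = Insert::new("postings") // Insert<()>
        .column(Ident("doing_tag".to_string())) // Insert<(Ident,)>
        .column(&source) // Insert<(Ident, &Ident)>
        .as_sql();
    sql
}
```

<p>Each call to <code>column</code> consumes the old statement and hands back one with a different type parameter, which is why an owned and a borrowed identifier can sit side by side without a shared element type.</p>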
<p>On the other hand, if you use tuples, then I guess you can’t really do anything conditional,
since an <code>INSERT</code> of a tuple of two columns is a <em>different type</em> than an <code>INSERT</code>
of a tuple of three columns and they cannot be assigned to the same thing.
Allegedly, we can use a <code>Box<dyn T></code> or <code>&dyn T</code> (where <code>T</code> is a particular trait),
instead of a vector or a tuple, to point to some owned or borrowed place in memory that
implements whatever trait that the tuple or vector is implementing.
But there are a lot of caveats and lore around this that makes it difficult to employ
reliably.</p>
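<p>For comparison, here is a small sketch of the trait object route, where a conditional column is easy because every element has the same type, <code>Box<dyn Column></code>. The trait and struct names are hypothetical and the caveats mentioned above (object safety, lifetimes, dynamic dispatch) still apply; this only shows the happy path.</p>

```rust
// Hypothetical names; not Impetuous' actual API.

trait Column {
    fn name(&self) -> String;
}

/// An identifier built at runtime.
struct Ident(String);

impl Column for Ident {
    fn name(&self) -> String {
        self.0.clone()
    }
}

/// An identifier known at compile time.
struct Static(&'static str);

impl Column for Static {
    fn name(&self) -> String {
        self.0.to_string()
    }
}

/// Differently typed columns share one vector behind `Box<dyn Column>`.
fn insert_sql(table: &str, include_status: bool) -> String {
    let mut columns: Vec<Box<dyn Column>> = Vec::new();
    columns.push(Box::new(Static("doing_tag")));
    columns.push(Box::new(Ident("source".to_string())));
    // The conditional part that a tuple-typed statement cannot express,
    // since two and three columns would be two different types.
    if include_status {
        columns.push(Box::new(Static("status")));
    }
    let names: Vec<String> = columns
        .iter()
        .map(|c| format!("\"{}\"", c.name()))
        .collect();
    let binds = vec!["?"; names.len()].join(", ");
    format!("INSERT INTO \"{}\" ({}) VALUES ({})", table, names.join(", "), binds)
}
```

<p>Calling <code>insert_sql("postings", true)</code> and <code>insert_sql("postings", false)</code> returns the same type either way, at the cost of a heap allocation and a dynamic call per column.</p>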
</section><section id="over-engineering-object-mapping">
<h2>Over-engineering object mapping<a class="self-link" title="link to this section" href="#over-engineering-object-mapping"></a></h2>
<p>The place where everything really fell apart was with
the structure and API for querying values from the database and loading them into
structures in the program.</p>
<p>I started with defining a bit of schema, like well known table identifiers
and column identifiers that can be used to build queries.
I wrote a macro to help with this.
Then I needed a few more things like filter expressions, column selection, joining
parent tables, and loading rows from child tables. Ultimately, the macro ended up
generating a lot of code responsible for this stuff because, every time I wanted to
implement it generally, something went wrong – often with my understanding of how traits
worked.</p>
<p>Also, the generated code can only load rows into a single structure/type per
table. And the structure is the same thing used to serialize and deserialize objects
to/from the body of an HTTP request or wherever they’re going. So that structure is well
suited for that purpose but really hard to use everywhere else.</p>
<p>It does support loading related records. So, if you want to find articles
<em>and</em> their authors, it will include the authors in the query with a SQL <code>JOIN</code> and load
them. Or, if you want to load all the comments for those articles, it uses the primary
keys for the articles you asked for to run another query to fetch comments for those
articles. So that’s pretty cool.</p>
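<p>The second-query approach for child rows can be sketched roughly like this. The table and column names (<code>comments</code>, <code>article_id</code>, <code>text</code>) are assumptions based on the articles example, and the database call itself is left out; the sketch only shows building the follow-up query from the parent keys and grouping the rows that come back.</p>

```rust
use std::collections::BTreeMap;

/// A child row as it might come back from the follow-up query.
/// (Hypothetical shape; not the structure Impetuous generates.)
struct Comment {
    article_id: i64,
    text: String,
}

/// Builds the follow-up query with one bind parameter per parent key.
fn comments_query(article_ids: &[i64]) -> String {
    let binds = vec!["?"; article_ids.len()].join(", ");
    format!(
        "SELECT \"article_id\", \"text\" FROM \"comments\" WHERE \"article_id\" IN ({})",
        binds
    )
}

/// Groups child rows by the primary key of their parent so they can be
/// attached to the articles that were loaded by the first query.
fn group_by_article(rows: Vec<Comment>) -> BTreeMap<i64, Vec<String>> {
    let mut grouped: BTreeMap<i64, Vec<String>> = BTreeMap::new();
    for row in rows {
        grouped.entry(row.article_id).or_insert_with(Vec::new).push(row.text);
    }
    grouped
}
```

<p>The point of doing it this way is that loading comments costs one extra query for the whole set of articles, rather than one query per article.</p>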
<p>But, by the end, other features were missing, it was awkward to use, and unsound
to improve on due to the things mentioned above.
This really limited the library’s usefulness with future applications.</p>
</section><section id="over-engineering-state-transfer">
<h2>Over-engineering state transfer<a class="self-link" title="link to this section" href="#over-engineering-state-transfer"></a></h2>
<p>So the <em>gather</em> and <em>shape</em> stuff from earlier ended up being something with its own
stringy syntax. I think I was emboldened by interesting formats for filtering and
fetching via an HTTP request URL, like those featured in <a class="reference external" href="http://htsql.org">HTSQL</a> and <a class="reference external" href="https://github.com/PostgREST/postgrest">PostgREST</a>:</p>
<ul class="full-width simple">
<li><p>HTSQL filter: <code>/program?school.code='bus'&degree!='bs'</code></p></li>
<li><p>HTSQL join: <code>/course{department{school.name, name}, title}</code></p></li>
<li><p>PostgREST filter: <code>/people?age=gte.18&student=is.true</code></p></li>
<li><p>PostgREST join: <code>/films?select=title,director:directors(id,last_name)</code></p></li>
</ul>
<p>I ended up with something in the middle; except I didn’t think it should support literals.
Instead, if you want to write a filter expression that compares a field to a
user-provided value, supply a parameter name and pass the value in an accompanying
document, like the request body over HTTP.</p>
<pre class="code shell full-width literal-block"><code>http<span class="w"> </span><span class="s1">'localhost:7193/doings{start,end,note?text.eq.$text}'</span><span class="w"> </span><span class="se">\
</span><span class="w"> </span><span class="nv">text</span><span class="o">=</span><span class="s2">"looking for my red stapler"</span><span class="w">
</span><span class="o">[</span><span class="w">
</span><span class="o">{</span><span class="w">
</span><span class="s2">"end"</span>:<span class="w"> </span><span class="s2">"2020-03-09T15:15:00Z"</span>,<span class="w">
</span><span class="s2">"note"</span>:<span class="w"> </span><span class="o">{</span><span class="w">
</span><span class="s2">"text"</span>:<span class="w"> </span><span class="s2">"looking for my red stapler"</span><span class="w">
</span><span class="o">}</span>,<span class="w">
</span><span class="s2">"start"</span>:<span class="w"> </span><span class="s2">"2020-03-09T14:30:00Z"</span><span class="w">
</span><span class="o">}</span><span class="w">
</span><span class="o">]</span></code></pre>
<p>And filters may include <em>and</em> as well as <em>or</em>, like:</p>
<p><code>and(start.lt.$until, or(end.is.null, end.ge.$since))</code></p>
<p>Although, I just contradicted myself here because <code>null</code> is a literal value.
To be honest, so are <code>true</code> and <code>false</code>.
But those are the only other exceptions, <em>I promise</em>.
The idea here is just that, if a lack of literals
leads to queries that look like <code>field.is.$null</code> or <code>field.is.$true</code> and
the parameter name doesn’t really mean or add anything, then that’s not good.
So that’s the way it is I guess.</p>
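<p>To make the filter syntax a bit more concrete, here is a minimal sketch of how such an expression could be turned into a SQL condition with named placeholders. It is not the project's actual code (that is written in Rust). The function names, the exact operator set, and the choice of SQL as a target are all assumptions made for illustration.</p>

```python
import re

# Comparison operators seen in this post, mapped to SQL. The full set is a guess.
OPS = {"eq": "=", "lt": "<", "le": "<=", "gt": ">", "ge": ">=", "is": "IS"}
# The only literals the syntax allows inside a filter.
LITERALS = {"null": "NULL", "true": "TRUE", "false": "FALSE"}


def tokenize(text):
    # Words (optionally prefixed with `$`) plus the punctuation ( ) , .
    return re.findall(r"\$?\w+|[(),.]", text)


def expect(tokens, wanted):
    got = tokens.pop(0) if tokens else None
    if got != wanted:
        raise ValueError(f"expected {wanted!r}, got {got!r}")


def parse(tokens):
    """Consume one expression; return (sql_fragment, parameter_names)."""
    head = tokens.pop(0)
    if head in ("and", "or") and tokens[:1] == ["("]:
        tokens.pop(0)
        parts, names = [], []
        while True:
            sql, found = parse(tokens)
            parts.append(sql)
            names.extend(found)
            if tokens[:1] == [","]:
                tokens.pop(0)
                continue
            expect(tokens, ")")
            break
        joiner = f" {head.upper()} "
        return "(" + joiner.join(parts) + ")", names
    # Otherwise this is a comparison: field.op.operand
    expect(tokens, ".")
    op = tokens.pop(0)
    expect(tokens, ".")
    operand = tokens.pop(0)
    if op not in OPS:
        raise ValueError(f"unknown operator {op!r}")
    if operand.startswith("$"):
        name = operand[1:]
        return f'"{head}" {OPS[op]} :{name}', [name]
    if operand in LITERALS:
        return f'"{head}" {OPS[op]} {LITERALS[operand]}', []
    raise ValueError(f"literal {operand!r} not allowed; use a $parameter")


def compile_filter(text):
    tokens = tokenize(text)
    sql, names = parse(tokens)
    if tokens:
        raise ValueError(f"trailing input: {tokens!r}")
    return sql, names


sql, names = compile_filter("and(start.lt.$until, or(end.is.null, end.ge.$since))")
print(sql)    # ("start" < :until AND ("end" IS NULL OR "end" >= :since))
print(names)  # ['until', 'since']
```

<p>The list of parameter names tells the caller which values it has to find in the accompanying document. Something like <code>text.eq.hello</code> is rejected, because the only literals that get through are <code>null</code>, <code>true</code>, and <code>false</code>.</p>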
<p>So the <em>gather</em> structure here is, roughly,
some resource identifier followed by zero or more of:</p>
<ul class="simple">
<li><p>filter: <code>? expires_on.le.$today</code></p></li>
<li><p>shape: <code>{text, start, end}</code></p></li>
<li><p>order: <code>sort(-start)</code></p></li>
<li><p>limit: <code>..$n</code> or <code>..100</code><a class="footnote-reference superscript" href="#footnote-4" id="footnote-reference-4" role="doc-noteref"><span class="fn-bracket">[</span>4<span class="fn-bracket">]</span></a></p></li>
</ul>
<aside class="aside">
<aside class="footnote superscript" id="footnote-4" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-4">4</a><span class="fn-bracket">]</span></span>
<p>okay, I broke my promise, but that is the last place where you can have a literal, I really mean it this time</p>
</aside>
</aside>
</aside>
<p>The shape expression is a sequence of <em>gather</em> structures.
So you can be a bit recursive.
But only to fields that link to other tables.</p>
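<p>As a data structure, that recursion might look roughly like the sketch below. The class and field names are made up for illustration, and the fixed order in which the parts are rendered is an assumption. The real thing is written in Rust and may look nothing like this.</p>

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Gather:
    resource: str                      # a table, or a field inside a shape
    filter: Optional[str] = None       # e.g. "text.eq.$text"
    shape: List["Gather"] = field(default_factory=list)  # nested gathers
    order: List[str] = field(default_factory=list)       # e.g. ["-start"]
    limit: Optional[str] = None        # "$n" or a number like "100"

    def render(self) -> str:
        """Turn the structure back into the stringy syntax."""
        out = self.resource
        if self.filter is not None:
            out += "?" + self.filter
        if self.shape:
            out += "{" + ",".join(child.render() for child in self.shape) + "}"
        if self.order:
            out += " sort(" + ",".join(self.order) + ")"
        if self.limit is not None:
            out += ".." + self.limit
        return out


doings = Gather(
    "doings",
    shape=[
        Gather("start"),
        Gather("end"),
        Gather("note", filter="text.eq.$text"),
    ],
)
print(doings.render())  # doings{start,end,note?text.eq.$text}
```

<p>Every entry in <code>shape</code> is itself a <code>Gather</code>, which is where the recursion comes from. This sketch doesn't enforce the rule that only fields linking to other tables may carry their own filter, shape, order, or limit.</p>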
<p>And there are a few other confusing things like:</p>
<ul class="simple">
<li><p>If you ask for <code>articles {author..0}</code> you can limit the author but it doesn’t do
anything and doesn’t make sense given every article has exactly one author.</p></li>
<li><p>Is <code>articles {comments..10}</code> supposed to show at most ten comments in
total or per article?</p></li>
</ul>
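<p>The two readings of <code>articles {comments..10}</code> give different results, which a small sketch can show. The data and function names here are invented for illustration, and neither function is claimed to be what the software actually does.</p>

```python
def limit_per_article(comments_by_article, n):
    # Reading one: every article keeps at most n of its own comments.
    return {article: comments[:n] for article, comments in comments_by_article.items()}


def limit_in_total(comments_by_article, n):
    # Reading two: at most n comments across all articles, in article order.
    remaining = n
    out = {}
    for article, comments in comments_by_article.items():
        out[article] = comments[:remaining]
        remaining -= len(out[article])
    return out


comments = {
    "first": ["c1", "c2", "c3"],
    "second": ["c4", "c5"],
}
print(limit_per_article(comments, 2))  # {'first': ['c1', 'c2'], 'second': ['c4', 'c5']}
print(limit_in_total(comments, 2))     # {'first': ['c1', 'c2'], 'second': []}
```

<p>With the total reading, later articles can end up with no comments at all, which is probably not what someone writing that query expects.</p>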
<section id="why-not-graphql">
<h3>Why not GraphQL?<a class="self-link" title="link to this section" href="#why-not-graphql"></a></h3>
<p>I don’t think it would have helped. I spent a short amount of time on implementing the syntax.
Most of the work was making the query building & object loading stuff.
Which I did because I couldn’t use Diesel.<a class="footnote-reference superscript" href="#footnote-5" id="footnote-reference-5" role="doc-noteref"><span class="fn-bracket">[</span>5<span class="fn-bracket">]</span></a>
So I would have had to do that with GraphQL.<a class="footnote-reference superscript" href="#footnote-6" id="footnote-reference-6" role="doc-noteref"><span class="fn-bracket">[</span>6<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-5" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-5">5</a><span class="fn-bracket">]</span></span>
<p>Maybe a better question…
<em>Q:</em> Why not adjust the program’s requirements so I could use Diesel?
<em>A:</em> im smol brains</p>
</aside>
</aside>
</aside>
<aside class="aside">
<aside class="footnote superscript" id="footnote-6" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-6">6</a><span class="fn-bracket">]</span></span>
<p>What I mean is, GraphQL doesn’t tell you what API you should offer. It’s basically an
RPC format, as far as I can tell.
I mentioned this a bit in an <a class="reference external" href="/blag/content">earlier post</a> titled <cite>Content</cite>.</p>
<p>But you <em>should not</em> read my earlier post, instead you should read about the motivations of the
<a class="reference external" href="https://www.graphiti.dev/guides/why">Graphiti</a> library. It’s better written, more detailed, and has pictures.</p>
</aside>
</aside>
</aside>
<p>Also, GraphQL is a fair bit more complicated than the sort of thing I needed. It has
more features – if you want to look at it that way. There were two things I
intentionally designed for.</p>
<ol class="arabic">
<li><p>I want type inference for variables.</p>
<p><a class="reference external" href="https://graphql.org/learn/queries/#variables">Using variables in GraphQL</a> requires a bunch of extra stuff like specifying the
<em>operation name</em> and <em>operation type</em>, as well as the <em>variable type</em> for each
variable you’ve defined.</p>
<p>Can’t the type be inferred? It’s a bother to be explicit about it and I complain easily.</p>
</li>
<li><p>I don’t want support for literals.</p>
<p>Because I don’t want to compete with actual serialization formats.</p>
<p>Except, as mentioned earlier, maybe in cases where that encourages the use of
parameters named after their literal expression, like <code>$true</code> or <code>$null</code>.</p>
</li>
</ol>
</section><section id="mutations">
<h3>Mutations<a class="self-link" title="link to this section" href="#mutations"></a></h3>
<p>This complicated DSL thing only supports fetching data, not mutating it.</p>
<p>I had intended to include syntax for this using <code>+</code> or <code>!</code> operators.
There were a few iterations on this but it ended up making not a lot of sense as a
feature.
I had two requirements:</p>
<aside class="aside">
<p>The second point above, about serialization, makes a lot more sense in the context
of mutations. Without them, the parameter values are just simple types like
strings, dates, or numbers.</p>
</aside>
<ol class="arabic simple">
<li><p>Mutations should be batchable. If I wanted to write a request that adds, updates,
or removes one record, the same request should work to modify multiple records if I
submit a sequence of items instead of a single item for the parameter values.</p></li>
<li><p>It was important to me to have some sort of optimistic concurrency
control. In this case, mutations must include, for each value they change, what they
believe the current value is.</p></li>
</ol>
<p>A consequence of the last point is that,
if I try to delete a user with <code>users ! $bob</code> then <code>$bob</code> is the whole user and it must
match the current bob.
Similarly, I can insert with <code>users + $bob</code>.
But what does an update look like? Is it the same as an insert but where <code>$bob</code> is an
old-new pair of bobs instead of one bob? That seems a little weird to me for some
reason.</p>
<p>Even if I disregard that, these operators struggle to be composed with each other or with any
of the operators for fetching. Like, let’s try to figure out what the following mean:</p>
<ul>
<li><p><code>articles ? title.eq.$title !</code></p>
<p>This probably deletes articles with that title.</p>
<p>But this kinda skips concurrency control, so you don’t really know what you’re
deleting.</p>
</li>
<li><p><code>articles ? title.eq.$title {title + $new_title}</code></p>
<p>This probably sets the title to <code>$new_title</code> for each article with the title <code>$title</code>.
How do I do this in batch? If <code>$new_title</code> is a sequence, which element do I use?</p>
<p>Maybe <code>$title</code> is also a sequence of the same length and I’m mapping between the
sequences?</p>
<p>Or we use <code>$title.old</code> and <code>$title.new</code> as parameters instead and <code>$title</code> is a
sequence of objects with <code>old</code> and <code>new</code> keys?</p>
<p>Or let <code>$title</code> be a map instead of a sequence that maps old to new titles. And come
up with some syntax for using that. But some popular serialization formats only
support maps with strings for keys.</p>
</li>
<li><p><code>articles ? title.eq.$title {comments + $comment}</code></p>
<p>This reads that we should insert <code>$comment</code> for every article with the given title.
But what if <code>$comment</code> includes a specific value for its article?
Is it used, ignored, or an error?</p>
<p>And this has the same problem as above in terms of being batchable.</p>
</li>
</ul>
<p>At any rate, I realized that none of these examples are things I had <em>any desire</em> to be
able to do.</p>
<p>And anyway, the point in the beginning was to use data structures to describe a request.
They could be serialized in the same way that the parameter values would be. And you
don’t even need parameter values because you don’t need parameters because you
don’t need to avoid interpolating data into the query because it’s all one serializable
data structure already. So any problems to do with parameters only exist in this weird
DSL format.</p>
<p>As it is now, mutation doesn’t use this fancy query stuff at all. It takes a sequence
of objects that each specify what collection to operate on, the old item, and the new item.
For deletions, new is null. For insertions, old is null.</p>
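<p>Here is a minimal sketch of that shape of mutation with the optimistic concurrency check included. It is an illustration under assumptions, not the real implementation: the in-memory store, the <code>id</code> key used to find an item, the <code>Conflict</code> exception, and the all-or-nothing behaviour are all made up for the example.</p>

```python
import copy


class Conflict(Exception):
    """The caller's idea of the current value is stale."""


def apply_mutations(store, mutations):
    """Return a new store with every mutation applied, or raise Conflict.

    `store` maps a collection name to {id: item}.
    """
    result = copy.deepcopy(store)  # all-or-nothing: the input is never touched
    for m in mutations:
        old, new = m["old"], m["new"]
        if old is None and new is None:
            raise ValueError("a mutation needs an old item, a new item, or both")
        table = result.setdefault(m["collection"], {})
        key = (old if old is not None else new)["id"]
        # Insertions expect nothing to be there; updates and deletions
        # expect the stored item to match what the caller last saw.
        if table.get(key) != old:
            raise Conflict(f"{m['collection']}/{key} is not what the caller expected")
        if new is None:
            del table[key]
        else:
            table[key] = new
    return result


store = {"users": {1: {"id": 1, "name": "bob"}}}
batch = [
    {"collection": "users", "old": {"id": 1, "name": "bob"}, "new": {"id": 1, "name": "robert"}},
    {"collection": "users", "old": None, "new": {"id": 2, "name": "alice"}},
]
updated = apply_mutations(store, batch)
```

<p>Batching falls out of the request already being a sequence. A second request that still believes user 1 is named bob gets a <code>Conflict</code> instead of silently overwriting robert.</p>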
<p>It’s not very fancy.
But some of the long term goals are to do with federation and the way it works now makes
a lot more sense for synchronization.</p>
<p>Also, in the middle of this project, I changed my views on some stuff after learning
about interesting things like Datalog, Datomic, and <a class="reference external" href="https://github.com/mit-pdos/noria">Noria</a> and some of the technologies
surrounding those things.
And now I wonder if user-defined materialized views interpreted from some kind of
declarative language is a formidable design for communication between software modules.</p>
<p>I want to write & play more about that, but this post is quite long already so that’ll
have to be something for future me.</p>
</section></section><p>Reflection on rewriting my <a class="reference external" href="https://gitlab.com/sqwishy/impetuous/">old time tracking software</a> in <a class="reference external" href="https://www.rust-lang.org/">Rust</a>.</p>
2020-03-12T12:00:00-07:00https://froghat.ca/2020/09/phone-rantPhone Rant2020-09-26T12:00:00-07:00sqwishy<p>My first phone was a <a class="reference external" href="https://en.wikipedia.org/wiki/Nokia_3310">Nokia 3310</a>. It had a monochrome LCD display and a keypad with
physical buttons. It was a very popular phone.</p>
<p>My second phone was a <a class="reference external" href="https://en.wikipedia.org/wiki/Nokia_3200">Nokia 3200</a>. It had a terrible camera, a colour display, and a keypad
with physical buttons.</p>
<aside class="aside">
<figure class="figure">
<img alt="A photograph of someone else's Nokia 3200 -- a candy bar shaped cell phone." src="nokia3200.jpg" />
<figcaption>Someone else’s Nokia 3200</figcaption>
</figure>
</aside>
<aside class="aside">
<figure class="figure">
<img alt="A Nokia 3200, with its case off, showing the front and back face plates." src="nokia3200-bits.jpg" />
<figcaption>A Nokia 3200, with its case off, showing the front and back face plates.</figcaption>
</figure>
</aside>
<p>A super cool thing about that phone was that the case was a see-through plastic, so you
could change how the shell looked by replacing whatever was underneath.
There were a number of pre-made face-plates you could use, but you could also make your
own using a physical tool that came with the phone.</p>
<p>Teenage me enjoyed that.</p>
<section id="the-keypad-t9">
<h2>The Keypad & T9<a class="self-link" title="link to this section" href="#the-keypad-t9"></a></h2>
<p>With both phones, the main keypad had a normal twelve button dialpad; ten for
numbers and two for special buttons. Keys <code>2</code> through <code>9</code> each mapped to three
or four letters and I think <code>0</code> mapped to a space.</p>
<p>One way to write with the keypad was to type letters in individually, hitting the
same number repeatedly to cycle through the letters that each key would map to. So <code>7</code> on
the keypad maps to <em>p</em>, <em>q</em>, <em>r</em> and <em>s</em>. Hitting <code>7</code> four times would get an <em>s</em>.</p>
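<p>That cycling can be sketched in a few lines. The letter groups are the standard phone keypad layout. Using a space in the input to stand in for the pause between two letters on the same key is an assumption of this sketch, as is <code>0</code> producing a space.</p>

```python
from itertools import groupby

KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
    "0": " ",
}


def multitap(presses):
    """Decode key presses; a space in `presses` marks a pause between letters."""
    out = []
    for chunk in presses.split():
        # Each run of the same key selects one letter from that key's group.
        for key, run in groupby(chunk):
            letters = KEYPAD[key]
            count = len(list(run))
            out.append(letters[(count - 1) % len(letters)])
    return "".join(out)


print(multitap("7777"))    # s
print(multitap("44 444"))  # hi
```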
<p>Alternatively, the phones had a predictive text feature called <a class="reference external" href="https://en.wikipedia.org/wiki/T9_%28predictive_text%29">T9</a>. For each letter in a word, you
hit the button <em>once</em> for the group containing the letter you want. As you type, the
phone guesses what word you want based on what matched the groups you used.
Occasionally, after typing in all the letters, it still didn’t show the word you
wanted. But, you could hit some special key to cycle through matching words until you found it.</p>
<p>Below is an illustration of the predictive text feature from some Nokia documentation somewhere.</p>
<picture class="no-upscale">
<source media="(prefers-color-scheme: dark)" srcset="nokia3310-predictive-dm.png">
<img alt="A sequence of button inputs and the resulting predicted word." src="nokia3310-predictive.png" />
</picture>
<p>I think, over time, it might “remember” what words you tended to use and would prefer those.
But, users also would learn how many times they’d need to cycle to get to the desired word.
Typically, you didn’t have to cycle at all or maybe once or twice for smaller words.</p>
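<p>The core of the lookup is small enough to sketch. This is not how T9 was actually implemented. It only matches whole words, the word list is invented, and the order of that list stands in for whatever ranking the real thing used.</p>

```python
from collections import defaultdict

KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {letter: key for key, letters in KEYPAD.items() for letter in letters}


def keys_for(word):
    # One key press per letter: "home" -> "4663"
    return "".join(LETTER_TO_KEY[letter] for letter in word.lower())


def build_index(words):
    index = defaultdict(list)
    for word in words:
        index[keys_for(word)].append(word)
    return index


def predict(index, presses, cycles=0):
    """Return the guess for `presses` after hitting the cycle key `cycles` times."""
    candidates = index.get(presses, [])
    if not candidates:
        return None
    return candidates[cycles % len(candidates)]


index = build_index(["good", "home", "gone", "hood", "hello"])
print(predict(index, "4663"))            # good
print(predict(index, "4663", cycles=1))  # home
print(predict(index, "43556"))           # hello
```

<p>Four of those words share the same four key presses, which is why the cycle key existed in the first place.</p>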
<aside class="aside">
<p>I’m pretty sure there was some sort of auto-complete feature too that worked
similarly to how it does in modern smartphones.</p>
</aside>
<p>Once you became familiar with it, this feature and the physical keyboard made it
possible to type text fairly quickly into your phone often without needing to
even really look at it.</p>
<p>It was kinda cool.</p>
</section><section id="my-smartphone">
<h2>My Smartphone<a class="self-link" title="link to this section" href="#my-smartphone"></a></h2>
<p>I’ve had two smartphones in total. Both are second-hand. My current one was first released
in 2014.<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a></p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>One reason this works out for me is that, while the phone manufacturer has stopped
releasing updates for the phone’s Android-based operating system, random people on the
internet take it upon themselves to create and distribute operating system images
for this phone that are based on newer versions of Android.</p>
<p>There’s a conversation that could be had about how that interacts with planned
obsolescence and the environmental cost of manufacturing and discarding
smartphones.</p>
</aside>
</aside>
</aside>
<p>My phone has a quad-core CPU, some kind of GPU, and 2 GB of RAM. But it doesn’t feel
like it sometimes.</p>
<p>I generally try to keep things some degree of “simple”<a class="footnote-reference superscript" href="#footnote-2" id="footnote-reference-2" role="doc-noteref"><span class="fn-bracket">[</span>2<span class="fn-bracket">]</span></a> where I can. Everyone has a different idea of what this means. For me, one thing is that I try not to run/use JavaScript on the web when I don’t need to.</p>
<aside class="aside">
<aside class="footnote superscript" id="footnote-2" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-2">2</a><span class="fn-bracket">]</span></span>
<p>But since people think of simplicity in different ways, they sometimes come to dumb conclusions and do dumb things in that pursuit. Maybe I’m dumb too.</p>
</aside>
</aside>
</aside>
<p>But sometimes, it seems like I have no idea how computers work and how fast they can go. <a class="reference external" href="https://lunduke.com/posts/2020-05-11/">Slack’s input latency</a> on my laptop is visibly uncomfortable, even compared to other JavaScript “apps”. My laptop is two years old and has an i7; but that shouldn’t matter. Computers from the previous century were capable of handling text input from a user and displaying it back to them.</p>
<p>If I visit any article on medium.com on my phone, the phone is completely unresponsive for about eight seconds before it gets my input to close the tab and returns to normal. What is the point of that?</p>
<p>I don’t think I’m missing out on much by not visiting that website – and it’s a fairly privileged problem to have in the first place – but I certainly think that this sort of thing, including performance issues with Slack, affects my/our intuitions of what to expect out of computers & what neat things we can do with them.</p>
<p>That’s a bad distortion to have.</p>
<p>T9 is a different kind of tool than those that make up the modern web. Software that turns a twelve-key dial-pad into an efficient text input device that my mom can use is a <em>fundamentally</em> different (and more valuable) kind of invention than a website that messes with how your browser scrolls on the page.</p>
</section><p>Complaining about phones I guess.</p>
2020-09-26T12:00:00-07:00https://froghat.ca/2020/06/pizzaPizza2020-06-06T12:00:00-07:00sqwishy<p>Years ago,
I received a levain (sourdough starter) named <em>Edgar Allan Dough</em> from <a class="reference external" href="https://nickhuber.ca">a friend</a>’s
spouse.</p>
<p>Since then, I have been using it to bake pizza and <a class="reference external" href="pizza/20200425_0010.jpg">bread</a>.</p>
<p>I’ve incinerated hundreds of pizzas.
After much suffering, I’ve found a baking process that seems to work mostly okay.
It is documented here for reference.</p>
<p>I use a <em>pizza stone</em>, an oven <em>broiler</em>, and <em>fresh mozzarella</em> with 60% moisture and 20% fat content.
The kind of cheese that is delicious and soft enough to break apart with your fingers.
This is important because regular mozzarella will burn under the broiler and fresh
mozzarella will liquefy without enough heat from something like a broiler.</p>
<p>I use an alteration of a recipe for levain pizza dough found in Ken Forkish’s book
<cite>The Elements of Pizza</cite>.</p>
<p>The instructions for the recipe can be found in that book.
To produce three 300g-ish doughs, I use the following amounts:</p>
<table>
<tbody>
<tr><td><p>Water</p></td>
<td><p>200g</p></td>
</tr>
<tr><td><p>Salt</p></td>
<td><p>14g</p></td>
</tr>
<tr><td><p>Levain starter</p></td>
<td><p>250g</p></td>
</tr>
<tr><td><p>White flour<a class="footnote-reference superscript" href="#footnote-1" id="footnote-reference-1" role="doc-noteref"><span class="fn-bracket">[</span>1<span class="fn-bracket">]</span></a></p></td>
<td><p>400g</p></td>
</tr>
</tbody>
</table>
<aside class="aside">
<aside class="footnote superscript" id="footnote-1" role="note">
<span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#footnote-reference-1">1</a><span class="fn-bracket">]</span></span>
<p>Half (200g) of the flour used is <a class="reference external" href="https://en.wikipedia.org/wiki/00_Flour">00</a></p>
</aside>
</aside>
</aside>
<p>The difference between this and the original recipe is a decrease in water and an increase in flour.
I prefer a drier dough because I find it easier to work with, I enjoy a crispier crust, and
my baking times are relatively short so there is less time for water to leave the dough
as it turns into pizza.</p>
<p>When making the dough, I will tend to add a bit (~20g) of additional flour early on in
the process.
This is partly out of a need to apply flour to deal with the stickiness when working with the dough,
but it’s also an excuse to add more flour.
Adding more flour at the very beginning is tricky since moisture is needed to properly incorporate the flour
and moisture is more evenly distributed twenty or thirty minutes after mixing.</p>
<p>My <em>process</em> for baking is roughly something like this:</p>
<ol class="arabic">
<li><p>If the dough was refrigerated, I take it out a couple hours prior to the next step so that
it’s at room temperature by the time I use it.</p></li>
<li><p>A pizza stone is in my oven (centered under the broiler) and I set the oven to broil
at 260°C, the maximum.</p>
<p>My oven is electric. Gas ovens will behave differently. I’ve only used a gas oven once for
this, but it worked out really well following similar steps (skipping step six).</p>
<p>My oven has a convection and regular mode for broiling, I use the regular broil.</p>
</li>
<li><p>Ten minutes after turning the oven on, I <em>begin</em> shaping the dough ball into a pizza
shape on my floured counter-top.
This is done <em>incrementally</em>. I start shaping, but let the dough rest and revisit it
every five minutes or so, depending on how stretchy the dough is.</p>
<p>Even if the dough is shaped,
it is important to play with it every so often to keep it from sticking on
its work surface. Even though the surface will be floured, the dough may absorb the
flour and begin to stick depending on how wet the dough is.</p>
</li>
<li><p>Twenty-five minutes later, the temperature of the pizza stone is above 340°C measured
with an infrared thermometer.
This is near the maximum temperature the stone can reach before
my broiler starts turning off, so I usually assemble my pizza and bake it at this
point.</p>
<p>I place the pizza shaped dough on a wooden paddle. The paddle is lightly dusted with
flour and maybe cornmeal to allow the pizza to easily slide onto the stone later.</p>
<p>Toppings find their way onto the pizza shaped dough entirely of their own volition.</p>
</li>
<li><p>With paddle in hand, I open the oven and slide the assembled uncooked pizza off the
paddle and onto the pizza stone. Care is taken to ensure that the dough ends up
aligned under the broiler and that the toppings remain on the dough.</p></li>
<li><p>I switch the oven mode from regular broil to convection broil, at the same
temperature (260°C).</p></li>
<li><p>After forty-five to sixty seconds,
I switch the oven back to broil and watch it carefully until it looks done; usually
about three minutes.</p>
<p>If you look away, or blink, the pizza will instantly incinerate.</p>
</li>
</ol>
<p>As an <em>alternative</em> to step six, I may set the oven to bake or turn it off entirely.
Sometimes I switch it away from broil a minute or a half <em>before</em> throwing the
pizza in, and then switch <em>back</em> to the broiler only after a half minute or less.</p>
<p>Or sometimes I don’t change it and just leave it in broiler mode.</p>
<p>The pizza must be under the direct heat of the broiler for long enough so that
the crust and the mozzarella get little burnt spots.</p>
<p>And, the pizza must be on the stone for enough time for the bottom to become crispy.</p>
<p>But, with this cheese, if the sauce is quite thin and the pizza is in the oven too
long at too low a heat, the cheese turns into cream and mixes with the pizza sauce
and you end up with a weird rosé soup all over your pizza.</p>
<p>The idea is to start cooking with a bit less heat on the top of the pizza near the
beginning in order to buy time for the bottom to cook.</p>
<p>For me, turning off the broiler is a dangerous game since it can take an indeterminate
time for my oven to decide to switch it back on.
So, leaving it on and switching to <em>convection</em> seems to be
the most <em>stable</em> option.</p>
<p>But, I’ve had better and worse outcomes trying variations on this.
It seems to depend on how hot the stone and oven are, how my oven works, and how wet or
dry the dough is.</p>
<aside class="aside">
<p>And there’s always a bunch of other stuff that can throw the whole thing off even
before baking, like overproofing dough and having too runny of a sauce.</p>
</aside>
<p>These are all variable every time you bake and subject to the preferences of the baker.
I enjoy a thin crust and a near toast-like crispiness. But some people are wrong and prefer
other things.</p>
<figure class="figure">
<a class="reference external image-reference" href="20200605_0024.jpg"><img alt="pizza with kale, mozzarella, and thinly sliced farmer sausage" src="20200605_0024.jpg" /></a>
<figcaption>pizza with kale, mozzarella, and <em>thinly</em> sliced farmer sausage</figcaption>
</figure>
<figure class="figure">
<a class="reference external image-reference" href="20200605_0050.jpg"><img alt="tomato sauce, basil, mozzarella, thinly sliced farmer sausage" src="20200605_0050.jpg" /></a>
<figcaption>tomato sauce, basil, mozzarella, <em>thinly</em> sliced farmer sausage</figcaption>
</figure>
<p>How I burn my pizza.</p>
2020-06-06T12:00:00-07:00