HTML as a configuration file format

Fri Jun 7 '24

This post is an excuse to talk about Civilization IV again. The title is clickbait, so here is a summary.

Firefox lets you save web pages through a menu button labeled Save Page As… or by pressing Ctrl+S. This saves the current value of <input> fields on the page to an HTML file[1]. When you open the saved page in your web browser, the saved values are right there. It’s like editing any regular file on your computer. Open it, modify it, and save it – when you reopen the file later, the modified values are still there. So retro.

A document like this is interesting because it can be data for another program and, compared to a typical configuration file, it offers a more graphically rich way to edit & interact with that data.

Software configuration files often use a format with a simple syntax that resembles plain text – and there are plenty of good reasons to do that. But modern web browsers can show images, have built-in calendars and time pickers or widgets for other data types, and can run JavaScript that shows visualizations based on provided values or extends interactivity in other ways (like showing popups and cookie consent banners). Used with discretion, some users might find this a more approachable and useful editing experience than editing text files.

So the pitch here is that, instead of your program reading configuration from keys and values in an INI, TOML, or JSON file or something, it reads an HTML file and uses the values of <input> elements. Or, get real wacky with it and use other parts of the document, like images from <img> elements. And design the HTML file to make use of the browser to provide a rich editing experience for modifying the document itself.
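As a rough sketch of the reading side, here is one way a program could pull values out of a saved page. It assumes each field’s current value ends up in its value attribute in the saved file, and that the markup is simple enough for a regex to cope with. The function name readInputConfig is made up for illustration; a real tool would probably want a proper HTML parser.

```javascript
// Sketch only: collect name/value pairs from <input> tags in HTML text.
// Assumes no attribute value contains ">" and that fields have a name.
function readInputConfig(html) {
  const config = {};
  const tagPattern = /<input\b([^>]*)>/gi;
  // attribute name, then optionally = "double", 'single', or unquoted value
  const attrPattern =
    /([^\s=\/"']+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+)))?/g;

  for (const tag of html.matchAll(tagPattern)) {
    const attrs = {};
    for (const m of tag[1].matchAll(attrPattern)) {
      attrs[m[1].toLowerCase()] = m[2] ?? m[3] ?? m[4] ?? "";
    }
    // Inputs without a name have no key to store them under, so skip them.
    if ("name" in attrs) config[attrs.name] = attrs.value ?? "";
  }
  return config;
}
```

For example, readInputConfig('<input name=size value=14>') gives back an object with size set to the string "14". Like most text configuration formats, everything comes back as a string and the program decides how to interpret it.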

In the rest of this blog post, I talk about an example of this kind of thing in a tool I made as part of a small edit to a turn based strategy game from 2005, Civilization IV.

GameFont.tga

Most of the text in Civilization IV is rendered from a TTF file. It’s easy to modify the game’s files to use a different font or, since the font is made of vectors, draw text at a larger font size. It’s like how fonts work everywhere else – like in a word processor or web browser.

But the text rendered on the city bar comes from an atlas called GameFont.tga.

This is what a city looks like in game. The city bar is the graphic at the bottom with the numbers on it; showing the city status line (like religions present), the city size, the city name, and the current production.
This is the vanilla GameFont.tga file. (Beige checkerboard added in to show transparency.) The text used in rendering the city bar above is taken from this file.

If you change the GameFont.tga image, you can change the text shown in the city bar.

The city bar shown again but using BBC Reith. (I think it’s also bold?) The change is a bit subtle.
gamefont-reith.png

To help do this, there are two programs.

  • One, to generate just the text portion of the atlas in a new font.

  • Another, to unpack the atlas as individual images and repack it.

Both programs are in a project on GitHub at sqwishy/civ4-atlast. They both use HTML in different ways.

atlast.html

The text portion of the atlas is output from a JavaScript program in an HTML file. It uses <input> fields to provide user options, a <canvas> to show a preview of the atlas image, and a button to save the atlas as a TGA image file onto the user’s computer.

Since it’s a single HTML file, GitHub serves it at sqwishy.github.io/civ4-atlast/atlast.html.

It isn’t configuration for a separate executable. Instead, the <input> values that Firefox preserves when you use Save Page As… are state for the program in the same HTML file. One reason this works is that the inputs are the source of truth for the program. When the JavaScript program runs, it uses the values of the inputs as parameters. And those values are saved and loaded in the HTML by Firefox.
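Here is a minimal sketch of what “the inputs are the source of truth” can look like in code. It is not lifted from atlast.html; the field names (family, size, weight, color) and the functions readParams and attach are assumptions for illustration. The point is that the program keeps no copy of the options and re-reads the inputs every time it renders.

```javascript
// Sketch only: field names here are hypothetical, not taken from atlast.html.
function readParams(root) {
  const get = (name) => root.querySelector(`input[name="${name}"]`).value;
  return {
    family: get("family"),
    size: Number(get("size")),
    weight: get("weight"),
    color: get("color"),
  };
}

// Re-render from the inputs whenever any of them changes.
function attach(root, render) {
  const update = () => render(readParams(root));
  root.addEventListener("input", update);
  update(); // the first render uses whatever values the loaded file carried
}
```

In a page, this would be called as something like attach(document, drawAtlas), where drawAtlas is whatever function paints the preview. Because nothing is cached outside the inputs, a file saved with different values just renders differently when it is opened again.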

The main reason to use a web browser to generate the text atlas was that it does a pretty good job of rendering text and offers APIs for font metrics. But, since web browsers are ubiquitous, there are other advantages – like even just not having to deal with Windows telling users that your program is dangerous because you didn’t pay your $150 indulgence to the tech-papacy to get the executable signed.

Below is a zoomed-in portion of the vanilla GameFont.tga file with transparency removed, so every pixel is fully opaque.

zoomed-noalpha.png

Some cells appear entirely white because the transparency has been removed. The white cells in the bottom half are just empty. But, the first four rows are white text on a white transparent background – they appear fully white here because the transparency is removed for this illustration.

It is a bit weird to think of transparency as having a colour, like white transparent. Or like the pink and cyan transparent pixels in the frame between cells. Each pixel has four components: red, green, blue, alpha. The alpha channel specifies how translucent the pixel is. A pixel is transparent when the alpha channel is zero, but it may still have other values for the three other components. Nevertheless, in most programs, those transparent pixels are displayed the same regardless of the values of the three other components. So we don’t develop an intuition that transparent pixels have colour values.
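A small worked example may help here. This sketch assumes straight (non-premultiplied) alpha and 8-bit channels; the function names are made up. Two fully transparent pixels with different colour values come out identical when drawn over a background, and only differ once the alpha is thrown away.

```javascript
// Draw an [r, g, b, a] pixel over an opaque [r, g, b] background.
function compositeOver([r, g, b, a], [backR, backG, backB]) {
  const alpha = a / 255;
  const mix = (src, dst) => Math.round(src * alpha + dst * (1 - alpha));
  return [mix(r, backR), mix(g, backG), mix(b, backB)];
}

// Force a pixel to be fully opaque, keeping its colour values.
const dropAlpha = ([r, g, b]) => [r, g, b, 255];

const transparentWhite = [255, 255, 255, 0];
const transparentPink = [255, 0, 255, 0];
const background = [10, 20, 30];

// Both of these are [10, 20, 30]: the colour values never reach the screen.
const shownWhite = compositeOver(transparentWhite, background);
const shownPink = compositeOver(transparentPink, background);

// With alpha dropped, the hidden colours show up again.
const opaqueWhite = dropAlpha(transparentWhite); // [255, 255, 255, 255]
const opaquePink = dropAlpha(transparentPink);   // [255, 0, 255, 255]
```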

Each character or icon is in a cell. Cells are separated by a pink border. In the border to the right of some cells is a teal pixel that specifies the baseline, used for characters like g, j, and p that have sticky-downy bits (descenders) below the baseline. As far as I can tell, Civilization IV reads these cells in sequence to map them to whatever glyph they represent. For example, the fourth cell is a dollar sign $. If a city’s name contained $, the game would use whatever image is in that fourth cell when rendering the dollar sign in the city name in the city bar shown earlier.
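To make the “cells in sequence” idea concrete, here is a guess at the lookup, not something confirmed from the game. It assumes the text cells follow ASCII order starting at “!”, which is at least consistent with $ landing in the fourth cell. The names cellIndexFor and cellsForText are made up.

```javascript
// Assumption: cells follow ASCII order and the first cell holds "!" (code 33).
const FIRST_CHAR_CODE = "!".charCodeAt(0);

// Zero-based position of a character's cell under that assumption.
function cellIndexFor(ch) {
  return ch.charCodeAt(0) - FIRST_CHAR_CODE;
}

// Cell positions for every character in a piece of text.
// Characters before "!" (such as a space) come out negative; how the game
// treats those is not covered by this sketch.
function cellsForText(text) {
  return Array.from(text, (ch) => cellIndexFor(ch));
}
```

Under this assumption cellIndexFor("$") is 3, the fourth cell counting from zero.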

To summarize, the atlast.html file uses a browser canvas to generate a TGA image of the text (letters, numbers, symbols) portion of the GameFont.tga atlas based on the user’s parameters, including font weight, size, family, and color. Since the TGA image this generates follows the structure of the original GameFont.tga, shown above with pink frames, it can be used as input for a program designed to work with the original GameFont.tga – the next program.
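For a sense of what turning canvas pixels into a TGA involves, here is a minimal sketch of an uncompressed 32-bit encoder. It is not the code from atlast.html; the function name encodeTga is made up, and storing rows top to bottom is a choice made here for simplicity, which the game may or may not accept.

```javascript
// Sketch only: encode RGBA bytes (as returned by canvas getImageData) into
// an uncompressed 32-bit TGA. Returns a Uint8Array holding the whole file.
function encodeTga(width, height, rgba) {
  const headerSize = 18;
  const out = new Uint8Array(headerSize + width * height * 4);
  out[2] = 2;                     // image type: uncompressed true-colour
  out[12] = width & 0xff;         // width, little-endian
  out[13] = (width >> 8) & 0xff;
  out[14] = height & 0xff;        // height, little-endian
  out[15] = (height >> 8) & 0xff;
  out[16] = 32;                   // bits per pixel
  out[17] = 0x28;                 // 8 alpha bits, rows stored top to bottom

  for (let i = 0; i < width * height; i++) {
    const src = i * 4;
    const dst = headerSize + i * 4;
    out[dst] = rgba[src + 2];     // blue
    out[dst + 1] = rgba[src + 1]; // green
    out[dst + 2] = rgba[src];     // red
    out[dst + 3] = rgba[src + 3]; // alpha
  }
  return out;
}
```

In a browser, the pixel data would come from something like ctx.getImageData(0, 0, canvas.width, canvas.height).data, and the result could be wrapped in a Blob to offer it as a download.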

atlast.exe

The second program is a command line tool for unpacking an atlas like GameFont.tga into individual images & repacking the images back into a single atlas.

It creates a manifest to store information about the atlas other than the image data of the cells themselves. This includes the sequence of cells in each row and the sequence of rows in the image.

It looks something like this:

<div data-atlas>
  <div data-atlas=row>
    <img src='000.png'>
    <img src='001.png'>
    <img src='002.png'>
    <img src='003.png' data-descent=2>
    ...
  </div>
  <div data-atlas=row>
    <img src='055.png'>
    <img src='056.png'>
    <img src='057.png'>
    <img src='058.png' data-descent=5>
    ...
  </div>
  <div data-atlas=row>
      ...
  </div>
  ...
</div>

Cells are in sequence under a data-atlas=row and those rows are in sequence under a data-atlas, the top-level object for the atlas in that document.

Each cell in the manifest, an <img> element, has a file path to the unpacked image of each cell in the src attribute. It may include the descent/baseline marker position (as data-descent=... in the file) because that is needed to accurately reconstruct the atlas and the unpacked images do not include the transparent frame where that marker occurs.
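The real tool reads this in Rust, but the structure is easy to show with the browser’s own DOM. The sketch below assumes a manifest shaped like the snippet above; readManifest is a made-up name, and this is only an illustration of the layout, not how atlast.exe does it.

```javascript
// Sketch only: turn a manifest document into rows of { src, descent }.
function readManifest(doc) {
  // Rows are matched by value, since the top-level element also has data-atlas.
  const rows = doc.querySelectorAll('[data-atlas="row"]');
  return Array.from(rows, (row) =>
    Array.from(row.querySelectorAll("img"), (img) => ({
      src: img.getAttribute("src"),
      // data-descent is optional; null means no marker was recorded.
      descent: img.hasAttribute("data-descent")
        ? Number(img.getAttribute("data-descent"))
        : null,
    }))
  );
}
```

With a manifest open in the browser, running readManifest(document) in the console would give an array of rows, each an array of cells in the same order as the file.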

That manifest is an HTML file. Also in that HTML file is a stylesheet that makes the manifest resemble the atlas when you open it in a browser. Here is a screenshot of a manifest viewed in Firefox – this is the manifest generated from the vanilla GameFont.tga.

manifest.png

Viewing the manifest in the browser doesn’t really help with editing it – except insofar as you can preview changes you make to the manifest from another program like a text editor. If you wanted to be really fancy, I’m pretty sure it’s possible to include a JavaScript program on the page to add, modify, or remove cells through DOM manipulation. In Firefox, and even Chrome, it seemed that the Save Page As… feature would save changes to the DOM made from JavaScript.

Originally, I was using a TOML file for this manifest; it’s simpler and more conventional. But, I needed the program that unpacks the atlas to be able to modify an existing manifest and do a sort of in-place update instead of overwriting it. It was also important that it did not unnecessarily tamper with whitespace or comments that the user may have added to the manifest. The libraries for doing that in TOML were not especially remarkable and using them would be at least as much work as using the Rust library tl to write the manifest in HTML. And, with HTML, there is the added bonus of using the browser to get a nice preview of the manifest.

Using these two tools, it’s fairly straightforward to unpack the game’s original GameFont.tga into separate images, make your own TGA text atlas from atlast.html using your web browser, unpack it over top of the unpacked vanilla atlas, and repack the separate images into a new GameFont.tga that the game can use.

cat.jpg
cat by kernpanik

So next time you’re writing a new micro-service – or whatever it is you do to bring meaning to your singular mortal journey through time – and you have to decide between INI, XML, JSON, JSONC, YAML, TOML – or whatever wacky format they think of next – consider instead using HTML for your configuration file and throwing a bloated webapp inside of it.

If you want to read more wacky Civilization-related content, see my follow-up post: Exploding the Civilization IV User Interface.