Positional Audio in Grounded via Mumble

Sat May 25 '24

Mumble is free and open source voice chat software. It’s appealing for video gaming in particular because it has low overhead, low latency, and good quality audio for speech. It also has a feature called positional audio, which makes it possible for people in the same call to sound like they’re in different places around you. So if you’re wearing headphones, or have some kind of multi-channel audio system, one friend might be louder to your left side and another to your right. If you connect that to whatever computer game you’re playing together, it’s a very cool effect where your friends’ voices are coming from their characters in the game.

Last year I played a game called Grounded with a friend. It’s a Honey, I Shrunk the Kids-style game where you play as a bug-sized kid, lost in a backyard, trying to figure out how you got there and how to unshrink yourself.

In a test of my friend’s patience, I spent a good chunk of time in-game – time meant for playing together – trying to write a small program that would link the game to Mumble. This way, while we were in our voice chat, our voices would sound as if coming from our characters in game. If my friend was in front of me, they would sound like it. If I turned around, they’d sound behind me. If they ran off to go fight bugs, leaving me to my own diversions, they’d sound further away.

To do this, Mumble needs to know where you are in the game world, where you’re facing, and where everyone else is. It has plugins that load up when a game starts and read the game’s memory to get that information. But there isn’t a plugin for Grounded. Alternatively, it has a Link Plugin, which lets a game with built-in Mumble support (or a separate helper program) connect to Mumble on its own and send this information.

Mumble has a good bit of documentation for its Link Plugin. So I started from that and wrote a super simple program that would read memory from Grounded and send the player’s positional information to Mumble over this Link Plugin. The program only has to read information about the current player. Mumble handles everything else by transmitting each player’s own position along with their voice.
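To give a feel for the sending side, here is a minimal sketch. The struct is modelled on the LinkedMem struct from Mumble's Link documentation, so treat the exact layout as an assumption and check it against the docs. `link_update` is a hypothetical helper name, and the "Grounded" name and context strings are placeholders. The documentation's example gets the pointer by mapping a named shared memory object; this sketch takes the pointer as a parameter so it stays portable.

```c
#include <stdint.h>
#include <string.h>
#include <wchar.h>

/* Modelled on LinkedMem from Mumble's Link documentation.
   Assumption: verify field order and sizes against the docs. */
struct LinkedMem {
    uint32_t      uiVersion;
    uint32_t      uiTick;
    float         fAvatarPosition[3];
    float         fAvatarFront[3];
    float         fAvatarTop[3];
    wchar_t       name[256];
    float         fCameraPosition[3];
    float         fCameraFront[3];
    float         fCameraTop[3];
    wchar_t       identity[256];
    uint32_t      context_len;
    unsigned char context[256];
    wchar_t       description[2048];
};

/* Hypothetical helper: publish one update of position data.
   pos, front and top are already in whatever units and axes Mumble expects. */
static void link_update(struct LinkedMem *lm,
                        const float pos[3],
                        const float front[3],
                        const float top[3])
{
    if (lm->uiVersion != 2) {
        /* First update: fill in the fields that don't change. */
        wcsncpy(lm->name, L"Grounded", 256);      /* placeholder name */
        memcpy(lm->context, "Grounded", 8);       /* placeholder context */
        lm->context_len = 8;
        lm->uiVersion = 2;
    }

    /* Bumped on every update, as in the documentation's example. */
    lm->uiTick++;

    memcpy(lm->fAvatarPosition, pos,   sizeof lm->fAvatarPosition);
    memcpy(lm->fAvatarFront,    front, sizeof lm->fAvatarFront);
    memcpy(lm->fAvatarTop,      top,   sizeof lm->fAvatarTop);

    /* This sketch treats the camera and the avatar as the same thing. */
    memcpy(lm->fCameraPosition, pos,   sizeof lm->fCameraPosition);
    memcpy(lm->fCameraFront,    front, sizeof lm->fCameraFront);
    memcpy(lm->fCameraTop,      top,   sizeof lm->fCameraTop);
}
```

Calling `link_update` a few times a second with fresh values would be the whole sending loop in this sketch.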

So when my friend talks, the program running on his computer has told Mumble where he is in the game world. On my computer, Mumble receives his voice and position in the game world and makes that audio positional by using my position & direction in-game from the program running on my computer. Both positions are using the same coordinate system and we’re both in the same world, so even though his was collected on his machine and sent to mine, the numbers match up and everything works. So both of us need to be running the program for it to function; but the program is fairly simple because Mumble is doing most of the work.
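The following is not Mumble's actual implementation, just a small sketch of the geometry involved: given my position and facing and my friend's position, work out how far to my right, above me, and ahead of me he is. It assumes the front and top vectors are unit length and perpendicular, and that the axes follow the convention I understand Mumble to use (x right, y up, z forward). All names here are made up for illustration.

```c
struct vec3 { float x, y, z; };

static float dot3(struct vec3 a, struct vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

static struct vec3 cross3(struct vec3 a, struct vec3 b)
{
    struct vec3 r = {
        a.y * b.z - a.z * b.y,
        a.z * b.x - a.x * b.z,
        a.x * b.y - a.y * b.x
    };
    return r;
}

/* Where the speaker is, expressed in the listener's own frame. */
struct heard_from { float right, up, ahead; };

static struct heard_from locate_speaker(struct vec3 listener,
                                        struct vec3 front,
                                        struct vec3 top,
                                        struct vec3 speaker)
{
    struct vec3 to = {
        speaker.x - listener.x,
        speaker.y - listener.y,
        speaker.z - listener.z
    };
    /* With x right, y up, z forward: top x front points to the right. */
    struct vec3 right = cross3(top, front);
    struct heard_from h = { dot3(to, right), dot3(to, top), dot3(to, front) };
    return h;
}
```

A positive `right` means the voice belongs in the right ear, a negative `ahead` means the speaker is behind the listener, and the length of the three components together gives the distance used for volume falloff.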

The Link Plugin documentation I linked to earlier even has an example of how to open a pipe to Mumble and share memory with it to send positional data. The tricky part is just figuring out how to read your player’s positional data from the game. Going into this, that was the part I was least confident I could get working reliably. It turned out that an old program called Cheat Engine had a feature called Pointer Scans that was perfect for this.

Cheat Engine

Remember Cheat Engine? That program that you tried when you were a teenager that let you scan a game while it was running to find where in memory it stored how much money you had, so you could modify it and give yourself infinite money. After that, you forgot about it because it was a bit of work and cheating isn’t that much fun actually[1]. But using it to find your player’s position in the game world is similar to an infinite money cheat.

To write a program to reliably read your player’s positional data from the game, the program needs to know the location in memory of that positional data, called a memory address. We can find the address manually the first time; but, after the game restarts, the data will have a new address.

Instead of having a consistent address for your player’s location, the game works by having a consistent address for another structure that has the address for your player’s location. Or, with additional indirection, it has a consistent address for a structure that points to another structure that points to the player’s location. It’s a treasure hunt where you aren’t given the directions up-front: you are given directions to find your next directions.

So it’s not good enough just to find the address of the positional data once. We need to back-track and figure out what structures are pointing to the positional data and what structures are pointing to them and what have addresses consistent through restarting the game. That way, we can write a program that reads those structures in sequence and follows them like a treasure hunt.
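Here is a minimal sketch of that treasure hunt in portable C. A byte buffer stands in for the game's memory, and `fake_process`, `read_mem`, and `resolve_path` are hypothetical names invented for this example. The real thing would read another process's memory instead of a local buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for another process's memory: a flat buffer addressed from 0. */
struct fake_process {
    const unsigned char *mem;
    size_t size;
};

/* Copy len bytes starting at addr. Returns 0 if the read is out of range. */
static int read_mem(const struct fake_process *p, uint64_t addr,
                    void *out, size_t len)
{
    if (addr > p->size || len > p->size - addr)
        return 0;
    memcpy(out, p->mem + addr, len);
    return 1;
}

/* Follow a pointer path: read the pointer stored at base, then for every
   offset except the last, add the offset and read the next pointer.
   The last offset is added without reading, giving the final address. */
static int resolve_path(const struct fake_process *p, uint64_t base,
                        const uint64_t *offsets, size_t n, uint64_t *result)
{
    uint64_t addr;

    if (!read_mem(p, base, &addr, sizeof addr))
        return 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (!read_mem(p, addr + offsets[i], &addr, sizeof addr))
            return 0;
    }
    *result = addr + (n ? offsets[n - 1] : 0);
    return 1;
}
```

If any clue along the way points somewhere unreadable, `resolve_path` gives up and returns 0 rather than handing back a bogus address.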

Pointer Scan

A Pointer Scan in Cheat Engine is a feature that lets you search a game’s memory for things that look like they might be pointers to an address that you care about. Because of how programs work and how they use computer memory, you might not find an address that points exactly to the information you want. Instead, you find an address to a sequence of related values close together forming a structure that includes the information you want.

For example, our player’s height in the world (altitude) is probably in a structure along with their position along the north-south and east-west axes; a structure called a three-dimensional vector. Or that vector might be in a structure along with other information about the player or their camera. It could be gameplay information like your character’s health and energy. Or camera information like the field-of-view or a front and top vector that can be used to determine where the camera is facing and its pitch & roll.

But instead of the game storing pointers to each of these values, the values are packed together in a structure and that structure is pointed to instead. When the game wants to use one of the members in this structure, it uses the structure’s address along with an offset for where that packed value is relative to the start of the structure. Like in a treasure hunt, the address might be “go to the big oak tree in the middle of bumblyberg park” and the offset is “and then keep walking twenty-four paces to find your next clue”.
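A small sketch of the address-plus-offset idea in C. The `player_info` layout is entirely made up for illustration and is not Grounded's real structure; `read_float_at` is a hypothetical helper.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout, for illustration only. */
struct player_info {
    float health;
    float energy;
    float position[3];   /* x, y, z */
};

/* Read a float that lives offset bytes past the start of a structure. */
static float read_float_at(const void *base, size_t offset)
{
    float value;
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}
```

With this layout, `offsetof(struct player_info, position)` is the "twenty-four paces" part: the compiler knows it, but someone poking at memory from the outside has to discover it.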

What’s fancy about the Pointer Scan in Cheat Engine is that it doesn’t just find values pointing directly to the memory you care about, but also memory pointing nearby to what you care about in case the memory you care about is offset inside a structure. And then it does this recursively a few times. So it finds stuff that might point to the thing you care about, and then things that point to that, and more things that point to that, and so on. A lookup or paper trail of some sort.
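One level of that search could look roughly like the sketch below. It is not how Cheat Engine is actually implemented, only the idea: walk a snapshot of memory, treat every 8-byte value as a possible pointer, and keep the ones that land at or a little before the target. The names and the assumption that pointers sit on 8-byte boundaries are mine.

```c
#include <stddef.h>
#include <stdint.h>

/* A candidate: which word looked like a pointer, and how far before the
   target it points. */
struct hit {
    size_t   index;
    uint64_t offset;
};

/* Scan a snapshot of memory (as 8-byte words) for values that point at the
   target or up to max_offset bytes before it. Returns the number of hits. */
static size_t scan_level(const uint64_t *words, size_t n,
                         uint64_t target, uint64_t max_offset,
                         struct hit *hits, size_t max_hits)
{
    size_t found = 0;

    for (size_t i = 0; i < n && found < max_hits; i++) {
        if (words[i] <= target && target - words[i] <= max_offset) {
            hits[found].index = i;
            hits[found].offset = target - words[i];
            found++;
        }
    }
    return found;
}
```

Running the same scan again with each hit's own address as the new target gives the next level of the paper trail, which is also why the number of candidates grows so quickly.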

You can end up with a lot of false positives. I think the first time I did this it took several minutes to run the scan and ended up with hundreds of thousands of pointer paths. The goal is to find one that is stable across game restarts.

When running a pointer scan, Cheat Engine lets you cross-reference a previous scan. So the trick here is to restart the game, manually hunt down the new address for the value you care about (your player’s position or whatever), and rerun the scan to look for pointer paths to that new address while cross-referencing your last scan.
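The cross-referencing step boils down to keeping only the pointer paths that show up in both scans. A minimal sketch, assuming a path is just a base offset plus a short list of offsets; `ptr_path`, `same_path`, and `keep_common` are hypothetical names, and Cheat Engine's own bookkeeping is surely more involved.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_DEPTH 4

/* One pointer path: an offset from the module base plus the offsets to
   apply at each step. */
struct ptr_path {
    uint64_t base_offset;
    size_t   depth;
    uint64_t offsets[MAX_DEPTH];
};

static int same_path(const struct ptr_path *a, const struct ptr_path *b)
{
    return a->base_offset == b->base_offset
        && a->depth == b->depth
        && memcmp(a->offsets, b->offsets,
                  a->depth * sizeof a->offsets[0]) == 0;
}

/* Filter current in place, keeping only paths that were also found by the
   previous scan. Returns the number of paths kept. */
static size_t keep_common(struct ptr_path *current, size_t n_current,
                          const struct ptr_path *previous, size_t n_previous)
{
    size_t kept = 0;

    for (size_t i = 0; i < n_current; i++) {
        for (size_t j = 0; j < n_previous; j++) {
            if (same_path(&current[i], &previous[j])) {
                current[kept++] = current[i];
                break;
            }
        }
    }
    return kept;
}
```

Each restart of the game throws out the paths that only worked by coincidence, so the list shrinks with every round.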

This is a screenshot of my final pointer scan after doing this several times.

pointer-scan.png

The selected row near the middle is what I ultimately used. It reads:

Base Address: “Maine-WinGDK-Shipping.exe”+059290F8
Offset 0: 0
Offset 1: 8
Offset 2: 798
Points To: 284B8A747C8

The offsets and addresses are given in hexadecimal, base 16.

  • The first step is to start from the game’s base module address and add an offset of 0x059290F8 bytes to it to get the address of the next pointer.

  • Reading the address there gives us the next address because the offset is zero.

  • That address points to a structure containing our next address at an offset of eight bytes past the start of the structure.

  • The address there then points to a larger structure where our value is 1,944 (0x798) bytes offset from the start of the structure.

In this example, the address that the pointer scan was looking for was the camera’s height, which sits eight bytes into the camera’s position vector. That puts the start of the position at an offset of 0x790. It happens that the camera’s top and front vectors start 144 (0x90) bytes before the position. So my program reads it all in one big read starting from the top vector, at an offset of 0x700. It looks about like this:

#define ptrOffset(base, offset) ((LPVOID)((INT64)(base) + (offset)))

struct GroundedCam cam;

LPVOID p = baseAddress;

if (   !ReadProcessMemory(proc, ptrOffset(p, 0x059290F8), (&p), sizeof(p), NULL)
    || !ReadProcessMemory(proc, ptrOffset(p, 0x0), (&p), sizeof(p), NULL)
    || !ReadProcessMemory(proc, ptrOffset(p, 0x8), (&p), sizeof(p), NULL)
    || !ReadProcessMemory(proc, ptrOffset(p, 0x700), (&cam), sizeof(cam), NULL))
{
    /* couldn't read memory, do something to handle this somehow */
}

ReadProcessMemory is a Windows API function that reads another process’s memory using a handle to the running process. If it can’t read the memory, it returns FALSE. This can happen after the game has launched but before the player loads into the game world, while they’re still at the main menu. When any ReadProcessMemory in that sequence returns FALSE, the branch is taken and the program can handle that case by waiting for some time and trying again later.
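The wait-and-retry handling could be structured something like this sketch. It is only an illustration, not the code from my program: the read and the wait are passed in as callbacks so the example stays portable, and `fake_game` with its two functions is a stand-in for a game that is still sitting at the main menu. All names are hypothetical.

```c
/* Returns nonzero when the whole chain of reads succeeded. */
typedef int  (*try_read_fn)(void *ctx);
/* Pauses for a while before the next attempt. */
typedef void (*wait_fn)(void *ctx);

/* Try to read up to max_attempts times, waiting between failures.
   Returns the attempt number that succeeded, or 0 if none did. */
static int read_with_retry(try_read_fn try_read, wait_fn wait,
                           void *ctx, int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_read(ctx))
            return attempt;
        if (attempt < max_attempts)
            wait(ctx);
    }
    return 0;
}

/* Stand-in for a game that only becomes readable after a few polls. */
struct fake_game {
    int reads_until_loaded;
    int waits;
};

static int fake_try_read(void *ctx)
{
    struct fake_game *g = ctx;
    if (g->reads_until_loaded > 0) {
        g->reads_until_loaded--;
        return 0;   /* still at the main menu */
    }
    return 1;
}

static void fake_wait(void *ctx)
{
    struct fake_game *g = ctx;
    g->waits++;     /* a real program would sleep here */
}
```

In a real program the read callback would wrap the sequence of ReadProcessMemory calls and the wait callback would sleep for a second or so.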

The last call to ReadProcessMemory copies a chunk of memory from the game into a data structure that looks like:

struct GroundedCam
{
    /* 000 */ struct f3 top;   char _unused0[4];
    /* 010 */ struct f3 front; char _unused1[4];
    /* 020 */ char _unused2[16 * 7];
    /* 090 */ struct f3 pos;
};

The type struct f3 is a vector of three floats, x y z. Each float is four bytes. So a struct f3 is twelve bytes.

To make it easier for readers to map the structure to its memory layout in their head, the struct members are placed on rows of sixteen bytes each, and offsets from the start of the struct are written in hexadecimal in comments before each line. Since the seven rows between the front-facing and position vectors don’t have information we care about, they are collapsed into a single padding member.
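One way to double-check that layout is to let the compiler verify the offsets written in the comments. This sketch repeats the two structs from above (with `struct f3` spelled out as three floats named x, y, z) and adds C11 static assertions. It assumes a compiler that lays out floats and chars without extra padding, which is the usual case.

```c
#include <stddef.h>

struct f3 { float x, y, z; };

struct GroundedCam
{
    /* 000 */ struct f3 top;   char _unused0[4];
    /* 010 */ struct f3 front; char _unused1[4];
    /* 020 */ char _unused2[16 * 7];
    /* 090 */ struct f3 pos;
};

/* Fail the build if the struct stops matching the commented offsets. */
_Static_assert(sizeof(struct f3) == 12, "f3 is three 4-byte floats");
_Static_assert(offsetof(struct GroundedCam, top) == 0x00, "top at 0x00");
_Static_assert(offsetof(struct GroundedCam, front) == 0x10, "front at 0x10");
_Static_assert(offsetof(struct GroundedCam, pos) == 0x90, "pos at 0x90");

/* The read starts at 0x700, so the camera height (pos.z) lands on the
   0x798 offset that the pointer scan found. */
_Static_assert(0x700 + offsetof(struct GroundedCam, pos)
                     + offsetof(struct f3, z) == 0x798,
               "pos.z matches the pointer path's final offset");
```

If someone later adds or resizes a member, the build breaks instead of the program quietly reading the wrong bytes.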

So that’s really all it is. The rest of the program is some busy-work getting a handle to the game’s process and the base module address, which is stuff that Mumble has convenience functions for if you use their API to write a proper plugin. It’s just one C file, public on GitHub.

The part I wanted to share the most here was the pointer scan feature of Cheat Engine. I was surprised that it had this function. The user interface was a bit clunky and, while using it, I wasn’t entirely certain that it was doing what I thought it was doing or that I was using it correctly. But there are a few tutorial videos people have made and posted on YouTube that I found helpful.

I started from mostly guesswork with this – having hundreds of thousands of candidate pointer paths. And, through disturbing the game state enough, you can filter it down to a selection of paths that seem like what the program itself might be using as pointer lookups.

It’s a kind of reverse engineering, but without actually requiring any reading or analysis of disassembly. So it’s great for people like me who are bad at reverse engineering.

Finding the address manually the first time was a bit tricky because I didn’t really know what the coordinate system was like. I assumed that there was some value corresponding to the player’s altitude, that it increased as you went toward the sky, and that it was a positive number if the player was fairly high up, well above the water level. Cheat Engine lets you scan for positive numbers and then filter them based on how they’ve changed since the last scan. So if I went downstairs, I could scan for numbers that decreased. If I went upstairs, I could scan for numbers that increased. This was not the most exciting part.
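The narrowing-down process is simple enough to sketch. This is only the idea behind that kind of filtered scan, not Cheat Engine's code: keep a list of candidate locations, take a snapshot before and after moving, and throw away every candidate that didn't change the way the altitude should have. The names are made up.

```c
#include <stddef.h>

enum change { DECREASED, INCREASED };

/* Filter candidates in place. Each candidate is an index into the two
   snapshots; it survives only if its value changed in the wanted
   direction. Returns how many candidates are left. */
static size_t filter_changed(size_t *candidates, size_t n,
                             const float *before, const float *after,
                             enum change want)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        size_t idx = candidates[i];
        int ok = (want == INCREASED) ? after[idx] > before[idx]
                                     : after[idx] < before[idx];
        if (ok)
            candidates[kept++] = idx;
    }
    return kept;
}
```

Going up and down the stairs a few times and filtering after each trip whittles the list down to a handful of values worth inspecting by hand.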

After finding the address for the player’s position the first time, I could find a landmark in the world and record my position when I’m there. Next time I restart the game, I would go to that landmark and scan for memory that matched the recorded position to find the new address for my player’s position in the game. So finding the new address after restarting the game was easier than it was the first time.
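The landmark trick amounts to searching memory for three consecutive floats that match the recorded position. A rough sketch, with a tolerance since it's hard to stand in exactly the same spot twice. `find_position` and `close_enough` are hypothetical names, and a snapshot of memory is assumed to be available as an array of floats.

```c
#include <stddef.h>

static int close_enough(float a, float b, float tolerance)
{
    float d = a - b;
    if (d < 0)
        d = -d;
    return d <= tolerance;
}

/* Look for three consecutive floats matching the recorded x, y, z.
   Returns the index of the first match, or -1 if there is none. */
static ptrdiff_t find_position(const float *mem, size_t n,
                               const float want[3], float tolerance)
{
    for (size_t i = 0; i + 3 <= n; i++) {
        if (close_enough(mem[i],     want[0], tolerance) &&
            close_enough(mem[i + 1], want[1], tolerance) &&
            close_enough(mem[i + 2], want[2], tolerance))
            return (ptrdiff_t)i;
    }
    return -1;
}
```

Matching on all three components at once rules out most coincidences, which is why this is so much quicker than the stair-climbing approach.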

fox.jpg
Anyway, here’s a photo of a fox from Otwarte Klatki on Flickr.

I used to regularly play a military-simulation first-person shooter game called Arma 3 and there were two big mods (unofficial user-created game modifications) that added this kind of positional audio feature, ACRE and TFAR. Though, both used a voice chat software called Teamspeak, instead of Mumble.

Arma 3 has a larger scale than most other games: bigger maps, more players, and more attention to realism and detail. And, importantly, more complex structures that people organize themselves into. So these mods also implemented some interesting features and complexity.

  • Multiple audio channels, so you can break out into your own “chat room” with your squad.

  • Audio distortion at distance when communicating over a short wave radio. Or from terrain, like mountains.

  • Double the fun by wearing two radios with one in your left ear and another in your right. So you can listen on two different channels at once.

Or just talk without a radio and it sounds like your voice is coming from your character in the game. And only people nearby can hear you.

Something I like, especially in cooperative multiplayer games, is how groups can work together to become more than the sum of their parts. And how communication can make or break that chemistry. Whether it’s as simple as a machine gunner paired with a spotter who can identify targets and give firing corrections, or it’s communicating intel and coordinating dozens of players between squads in different channels across a command net. It’s fun stuff and I think positional audio is pretty cool.