I keep telling people I’ll write up articles about the random crap I end up coding in my free time, so I went ahead and made a Medium account, which is probably just one step below being one of those weird people who write LinkedIn articles (I might become one of those one day).
Ended up wasting 20 minutes trying to style my Medium page to look cool, only to hit the revert button and now here I am.
ANYWAY: This article will be a rambling story time with no real lessons to be learned.
A while ago I got into Game Boy Advance (GBA) homebrew development nonsense via the popular devkitARM toolchain, maintained by WinterMute, a hero who has almost singlehandedly kept the homebrew community for these retro consoles alive and well in the modern age.
To cut a long story short: I made an alternative to the devkitARM toolchain which I cleverly named gba-toolchain. It’s all CMake based, uses the official ARM embedded GNU toolchain, pulls in optional libraries, GBA focused & optimised, licensed to not kill commercial products, yada-yada all the good stuff. Biggest hurdle right now is getting people in the GBA homebrew community to care about it.
Hey this is cool, Medium lets you search for Unsplash images and throw them right into articles with credits and everything. Aren’t many relevant images, so here’s a Raspberry Pi which, like the GBA, uses an ARM processor.
What the CPU Context Switching
Time for a crash course on context switching: the GBA has one CPU core (teeeechnically it also has a LR35902 core on the die, but it’s totally inaccessible in GBA mode, so forget that). Software tends to be multithreaded these days (…maybe not so much Minecraft Java Edition), but back in the 90s/early 2000s multitasking meant one CPU core switching between different “contexts” of execution (threads, more or less) so it looked like several things were running at once.
How is this done on a single core processor? Simple: save the current state of the CPU (registers, stack pointer, program counter, link register) all to memory, then load up the state for the other thread into the CPU and let it continue execution as if nothing ever happened.
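To make that concrete without touching real hardware, here’s a toy model in C++: a pretend CPU whose entire state is a handful of registers, and a switch that copies that state out to memory and copies another one in. `Context`, `ToyCpu` and `switch_context` are names I made up for illustration; a real switch has to be written in assembly, because C++ has no way to name the actual registers.

```cpp
#include <array>
#include <cstdint>

// Everything the toy CPU needs in order to resume a thread of execution.
struct Context {
    std::array<std::uint32_t, 13> r{}; // general purpose registers r0-r12
    std::uint32_t sp = 0;              // stack pointer (r13 on ARM)
    std::uint32_t lr = 0;              // link register (r14 on ARM)
    std::uint32_t pc = 0;              // program counter (r15 on ARM)
};

// The toy CPU is nothing more than its live register state.
struct ToyCpu {
    Context live;
};

// Save the running state into `from`, then load `to` into the CPU.
inline void switch_context(ToyCpu& cpu, Context& from, const Context& to) {
    from = cpu.live; // save: registers -> memory
    cpu.live = to;   // restore: memory -> registers
}
```

Switching from A to B and later from B back to A leaves A exactly where it was, which is the whole trick: neither side can tell it was ever paused.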
Why the CPU Context Switching
My investigation into this came from a misconception about the way the Pokemon GBA games worked. Some random person on the internet claimed Pokemon GBA uses a clever context switching multi-tasking thingy-mo-bob business for the game event scripting (NPC behaviour).
I come from a background in RPG Maker game development, and the event system the RPG Maker engines all use is super simple, so I was like “oh Pokemon does something special? I should look into this, it sounds super cool”. I was expecting to learn something neat, but unfortunately the random internet person was making up rubbish; Pokemon GBA’s event system is very similar to RPG Maker’s (pret/pokeruby/script.c RunScriptCommand, for anyone in-the-know).
This disappointment turned into a few Discord discussions about how to do game event scripting, and at some point someone mentioned coroutines being awesome for modern game scripting. I was familiar with coroutines from Unity C# projects, but had never used them with C/C++.
If I remember correctly, someone demonstrated coroutines on the GBA in some other programming language that had native support and we all did the usual GBA homebrew community thing of saying “nice” and then pretending we didn’t see that demo for the next 6 months whilst we secretly wish we wrote it in whatever programming language we’re fans of.
Ultimately, the technique described lets you write game event scripts in the same compiled language that the rest of the engine is written in. A bit like how Unity C# developers script their whole game in the same C# environment that they write all their technical code in.
Mark Papadakis (oooh Medium tagging) wrote an article back in 2016 about coroutines: https://medium.com/software-development-2/coroutines-and-fibers-why-and-when-5798f08464fd
I read about how coroutines work, which turned into reading about context switching, because the important part of coroutines is being able to bail out of a function part-way-through and return to that point later; to do that you need to save the current context, switch to the target context, and continue execution as if nothing happened.
Here’s a photo of a bunny rabbit.
Do the CPU Context Switching
The goal is coroutines in C++ on the GBA. The road to that goal is context switching.
During all this I’m exploring C++20 and I see that it has coroutine support built in. I also see that the folks over at Boost were publicly crying on the internet because their coroutine solution wasn’t the one being chosen. I thought: “eh, if Boost did it without language support, then I can probably do it too”.
After a lot of digging and reading I found out: if you’re on a *NIX platform you get the context saving/restoring as part of the C library: man7.org/linux/man-pages/man3/getcontext and you can even build up a new context with: man7.org/linux/man-pages/man3/makecontext
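For a feel of what those calls do, here’s a minimal sketch for a desktop Linux box with glibc (so not the GBA): it counts upwards with no loop statement by repeatedly jumping back to a saved context. `count_by_rewinding` is a made-up name for the demo.

```cpp
#include <ucontext.h>

// Count up to `limit` without a loop statement: every setcontext jumps back
// to the spot where getcontext saved the CPU state.
int count_by_rewinding(int limit) {
    ucontext_t checkpoint;
    // volatile keeps the counter in memory, so rewinding the registers
    // does not also rewind the count.
    volatile int visits = 0;
    getcontext(&checkpoint);     // execution lands here again after setcontext
    visits = visits + 1;
    if (visits < limit) {
        setcontext(&checkpoint); // restore the saved state; does not return on success
    }
    return visits;
}
```

Calling `count_by_rewinding(3)` passes the line after `getcontext` three times. It’s basically a `goto` that also restores the registers and the stack pointer.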
Alright, but the GBA is not a *NIX system. We don’t know this. The toolchain doesn’t make it *NIX compliant either.
Then a couple of days later my ‘gba-’ rival JoaoBaptMG adds context switching to their gba-modern project https://github.com/JoaoBaptMG/gba-modern and once again I’m all upset because he beat me to it. I felt like I just took that step onto Nugget Bridge only for the familiar theme music of JoaoBapt to start playing as the dude struts towards me with his swanky code and a stupid Pidgeotto.
I did what all the other honourable members of the GBA homebrew community would do in this situation: I sent him a little message of “nice” on Discord, then proceeded to pretend I didn’t see his work whilst I hastily tried to code it myself on my own toolchain.
I wanted my solution to be as familiar to incoming developers as possible, so I specifically modelled everything I did on the *NIX functions getcontext/setcontext/makecontext/swapcontext, which I believe was the right choice, despite some of the extra memory bloat the context saving structures for these functions require.
Originally I wanted it to be super flexible: I wanted programmers to be able to use these functions just for inspecting the state of their program, but I had to drop that plan because the GBA doesn’t let you pop stack memory into the program counter; I had to waste a register for use as a temporary value to store the program counter before loading it.
r12 is good to trash (according to the AAPCS), so I swallowed my pride and with tears in my eyes continued to write my code to be 1 register less than perfect.
Look at all that wasted memory.
Actually, after a lot of fiddling, crashing, bugs, etc the end result of my getcontext/setcontext ended up being incredibly simple.
The __agbabi_ prefix is there to let it sit among some of the internal __aeabi_ functions without standing out too much. This is all toolchain level code, so no need to worry about reserved underscore naming.
Actually, now that I look at them again, I’m shocked at how tiny these functions are compared to gba-modern.
Okay I cheated a bit: gba-modern does a boatload of common API helper logic because to actually make these context gets/sets useful you need to make a context with a target function and swap to that (also, gba-modern successfully implements IRQ context switching, which I’m still green with envy over).
Most of this function is based on glibc’s ARM implementation of makecontext but with some fluff omitted and tweaks so it would work with whatever the heck I was hacking together. Wtf is
You can see in my makecontext implementation that I save the link context (context to return to on exit) into r5, so r5 is pushed to the stack before the context is entered, popped back when the context returns, and then setcontext is called with r5. Unless r5 is zero, then uh, I guess we got lost so _exit is called.
r4 contains the actual address of the entering function, hence the bx r4 (branching into address stored in r4).
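The same link-context idea exists in the POSIX originals as the `uc_link` field. Here’s a hedged sketch against desktop glibc (not the `__agbabi_` code) showing that returning from the entry function resumes whatever `uc_link` points at. `run_with_link`, `entry` and `trace` are made-up names for the demo.

```cpp
#include <ucontext.h>
#include <string>

namespace {
std::string trace;

// Entry function for the new context. It just returns, which is the
// interesting part: returning hands control to whatever uc_link points at.
void entry() { trace += "entry "; }
} // namespace

std::string run_with_link() {
    static char stack[64 * 1024];   // the new context needs its own stack
    ucontext_t caller, child;
    trace.clear();

    getcontext(&child);             // start from a valid context...
    child.uc_stack.ss_sp = stack;   // ...then point it at the new stack
    child.uc_stack.ss_size = sizeof(stack);
    child.uc_link = &caller;        // the link context: resumed when entry() returns
    makecontext(&child, entry, 0);  // entry takes 0 extra int arguments

    trace += "before ";
    swapcontext(&caller, &child);   // save ourselves into caller, run entry()
    trace += "after ";              // reached via uc_link once entry() returns
    return trace;
}
```

Leaving `uc_link` null is the POSIX version of the “I guess we got lost” case: the man page says the thread exits when the entry function returns.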
There’s enough here to implement fibers, but some extra boilerplate code helps for coroutines.
swapcontext both saves the current context and enters the target context.
The only thing special about this is that the stack pointer and link register are manually recovered from the callee state, rather than whatever the state is inside getcontext.
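For a feel of what swapcontext buys you, here’s a small ping-pong between the main flow and one fiber. It uses the POSIX functions from desktop glibc as a stand-in, so treat it as a sketch of the idea rather than the GBA routines; `ping_pong`, `fiber_body` and the two context variables are made-up names.

```cpp
#include <ucontext.h>
#include <string>

namespace {
ucontext_t main_ctx, fiber_ctx;
std::string log_text;

// Runs on its own stack and hands control back after every step.
void fiber_body() {
    for (int i = 0; i < 3; ++i) {
        log_text += "fiber" + std::to_string(i) + " ";
        swapcontext(&fiber_ctx, &main_ctx); // save the fiber, resume main
    }
}
} // namespace

std::string ping_pong() {
    static char stack[64 * 1024];
    log_text.clear();

    getcontext(&fiber_ctx);
    fiber_ctx.uc_stack.ss_sp = stack;
    fiber_ctx.uc_stack.ss_size = sizeof(stack);
    fiber_ctx.uc_link = &main_ctx;          // where to go if fiber_body returns
    makecontext(&fiber_ctx, fiber_body, 0);

    for (int i = 0; i < 3; ++i) {
        log_text += "main" + std::to_string(i) + " ";
        swapcontext(&main_ctx, &fiber_ctx); // save main, resume the fiber
    }
    return log_text;
}
```

Each side picks up exactly where it left off, loop counter and all, so the log alternates: `main0 fiber0 main1 fiber1 main2 fiber2`.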
I got something like the following as the sample on the toolchain’s wiki page:
And for my final trick…Coroutines
The coroutine section is all under my gba-plusplus project: the goal for this project is to implement a C++17 header library for the GBA (an alternative to libgba). As usual, almost no-one cares.
If __agb_abi is detected (macro defined, which suggests you linked the agbabi) then coroutines are available as an extension.
Here’s a Fibonacci example that’s been adapted from a Boost sample or somewhere (I don’t remember). The entire thing is inspired by Boost coroutines, which want you more than anything to use them in a pull/push consumer/producer paradigm.
I love the little detail of naming the coroutine<>::push_type “yield” so the code can pretend to be all fancy with keywords.
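As a rough idea of the shape of that pattern, here’s a stripped-down pull/push pair built on desktop POSIX ucontext. `IntPuller`, `Push` and `first_fibonacci` are hypothetical names for this sketch, not the gba-plusplus or Boost API; the only thing borrowed is the trick of calling the push object `yield`.

```cpp
#include <ucontext.h>
#include <functional>
#include <utility>
#include <vector>

class IntPuller;
static IntPuller* g_entering = nullptr; // hands `this` to the entry trampoline

// Hypothetical minimal pull-style coroutine that produces ints.
class IntPuller {
public:
    // The "push" side given to the body; naming the parameter `yield`
    // makes the body read as if it used a keyword.
    class Push {
    public:
        explicit Push(IntPuller& owner) : owner_(owner) {}
        void operator()(int value) {
            owner_.value_ = value;
            swapcontext(&owner_.body_ctx_, &owner_.caller_ctx_); // suspend body
        }
    private:
        IntPuller& owner_;
    };

    explicit IntPuller(std::function<void(Push&)> body)
        : body_(std::move(body)), stack_(64 * 1024) {
        getcontext(&body_ctx_);
        body_ctx_.uc_stack.ss_sp = stack_.data();
        body_ctx_.uc_stack.ss_size = stack_.size();
        body_ctx_.uc_link = &caller_ctx_; // resume the caller when the body ends
        makecontext(&body_ctx_, &IntPuller::trampoline, 0);
    }
    IntPuller(const IntPuller&) = delete; // the contexts point into this object
    IntPuller& operator=(const IntPuller&) = delete;

    // Run the body until its next yield; false once the body has finished.
    bool next(int& out) {
        if (done_) return false;
        g_entering = this;
        swapcontext(&caller_ctx_, &body_ctx_);
        if (done_) return false;
        out = value_;
        return true;
    }

private:
    static void trampoline() {
        IntPuller* self = g_entering;
        Push yield(*self);
        self->body_(yield);
        self->done_ = true; // returning from here follows uc_link
    }

    std::function<void(Push&)> body_;
    std::vector<char> stack_;
    ucontext_t caller_ctx_{}, body_ctx_{};
    int value_ = 0;
    bool done_ = false;
};

// Pull the first `count` Fibonacci numbers out of an endless producer.
// Keep `count` small: plain int overflows after a few dozen terms.
std::vector<int> first_fibonacci(int count) {
    IntPuller fib([](IntPuller::Push& yield) {
        int a = 0, b = 1;
        for (;;) {
            yield(a);
            int next = a + b;
            a = b;
            b = next;
        }
    });
    std::vector<int> out;
    int value = 0;
    while (static_cast<int>(out.size()) < count && fib.next(value)) {
        out.push_back(value);
    }
    return out;
}
```

The producer is an infinite loop, yet it only ever runs as far as the consumer pulls; `first_fibonacci(10)` gives 0, 1, 1, 2, 3, 5, 8, 13, 21, 34.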
From here the idea is to make your game environment use some kind of event context as the coroutine push/pull type and then you can do something similar to Unity with the whole ‘return an enumerator that expects to wait N seconds’. Not far off from the ‘wait_until’ thing that quite a few coroutine/fiber implementations have.
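Here’s a hedged sketch of how that could look, again on desktop POSIX ucontext rather than the GBA. `Script`, `wait`, `tick` and `run_npc_script` are all invented for this example, and a real engine would hang more than a frame counter off the event context.

```cpp
#include <ucontext.h>
#include <functional>
#include <string>
#include <utility>
#include <vector>

class Script;
static Script* g_entering = nullptr; // hands `this` to the entry trampoline

// Hypothetical event-script wrapper; not the gba-plusplus API.
class Script {
public:
    explicit Script(std::function<void(Script&)> body)
        : body_(std::move(body)), stack_(64 * 1024) {
        getcontext(&body_ctx_);
        body_ctx_.uc_stack.ss_sp = stack_.data();
        body_ctx_.uc_stack.ss_size = stack_.size();
        body_ctx_.uc_link = &loop_ctx_; // back to the game loop when the body ends
        makecontext(&body_ctx_, &Script::trampoline, 0);
    }
    Script(const Script&) = delete; // the contexts point into this object
    Script& operator=(const Script&) = delete;

    // Called from inside the script: suspend, resume `frames` ticks later.
    void wait(int frames) {
        frames_left_ = frames;
        swapcontext(&body_ctx_, &loop_ctx_);
    }

    // Called by the game loop once per frame; false once the script has ended.
    bool tick() {
        if (done_) return false;
        if (frames_left_ > 0 && --frames_left_ > 0) return true; // still waiting
        g_entering = this;
        swapcontext(&loop_ctx_, &body_ctx_);
        return !done_;
    }

private:
    static void trampoline() {
        Script* self = g_entering;
        self->body_(*self);
        self->done_ = true; // returning from here follows uc_link
    }

    std::function<void(Script&)> body_;
    std::vector<char> stack_;
    ucontext_t loop_ctx_{}, body_ctx_{};
    int frames_left_ = 0;
    bool done_ = false;
};

// A tiny "game loop" driving one NPC script, logging what happened when.
std::vector<std::string> run_npc_script() {
    std::vector<std::string> log;
    int frame = 0;
    Script npc([&](Script& self) {
        log.push_back("frame " + std::to_string(frame) + ": say hello");
        self.wait(2);
        log.push_back("frame " + std::to_string(frame) + ": walk left");
        self.wait(1);
        log.push_back("frame " + std::to_string(frame) + ": done");
    });
    while (npc.tick()) ++frame;
    return log;
}
```

The script body reads top to bottom like a plain function, while the game loop keeps ticking between the waits. Here the three log lines land on frames 0, 2 and 3.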
I messaged JoaoBapt to inform him I’m slapping his name in this article, he kindly linked me to his original implementation of context switching from his repository here: github.com/roboime/BattleBot-STM32 so turns out the guy had a 2 year head start on me regarding this stuff. I am shaking my fist in the air right now. He is always one step ahead of me…
I guess that’s the end of my stupid fucking story. I’ll probably write another one in a year’s time or something. Maybe I’ll interview Byron about his weird obsession with entity component system architecture and write my thoughts on that.