Pattern-Matching, Postprocessing, and a Buggy Bundler
This post is part 7 of the aelith-game series:
- Boy, Have I Been...
- So far ahead, yet so far behind
- One Hell of a Physics Engine
- What's in a game?
- A Fresh Start
- I Just Can't Keep My Mouth Shut
- Pattern-Matching, Postprocessing, and a Buggy Bundler
Welp, I did it again. I made another programming language. It took over two months of late nights and coffee and it somehow felt like only two weeks. And I certainly could have been writing other blog posts about it in the meantime, but I honestly didn’t think it had been that long since my previous post. I guess it goes to show you what happens when you get the seed of a good idea and dedicate all your time to making it.
Ultimately my goal is to finish my game, Aelith, but I can’t do that if I don’t have a way to implement anything. I’ve done some more thinking about what I want to be able to do with Aelith’s implementation, and first and foremost is supporting mods and addons, like Minecraft does.
Minecraft is fairly easy to mod: Java Edition is written in Java, and Java has self-introspection and monkey-patching capabilities since it’s a bytecode-compiled language - Java mods can, and do, just patch the bytecode in order to inject their functionality. Similarly, while Bedrock Edition is compiled to machine code from C++ and can’t be directly self-modifying, Mojang did bake in a Javascript add-on API to hook into various game systems and change their behavior.
In most cases I’ve heard of, Bedrock’s limited add-on API is, well, limiting, since you only have the ability to mod stuff that Mojang allows you to modify, whereas Java mods can patch basically any part of the game. I’m torn on how to implement this for a couple of reasons. Aelith is written primarily in Javascript (transpiled from Typescript) since it is designed to run in a web browser. Javascript is formally specified by TC39, and the spec doesn’t specify any way for a script to inspect or patch the internals of other code it wasn’t handed a reference to. The many Javascript engines in the major browsers take advantage of this, so Javascript winds up being more like Minecraft Bedrock. Mods can only access what I allow them to.
However, there are so many good Minecraft Java mods (even just simple cosmetic or quality-of-life mods – I use almost 70 of them) and plenty of them are just impossible to recreate on Bedrock, because they interface with systems in the game that Mojang just hasn’t yet added APIs for on Bedrock. These are the kind of mods that I want to support in Aelith - ones that add good features that I never even started to think about. I’m going to be making Aelith to mostly fit my own preferences, but I’m only one person and I’m sure that if anyone else plays it, they’ll have their own opinions and suggestions on what should be added. If it seems obvious enough when brought to my attention, I might just add it directly, but if it’s a bit more involved, it should be easy to make a mod to add it.
To that end, I’ve created Backolon. It’s a stupidly simple little programming language that is mostly designed to act like Javascript minus most of the baNaNa nonsense. It’s completely unoptimized and slow as hell, but what it lacks in speed it makes up for with a plethora of metaprogramming features - token-oriented pattern matching being the core mechanic upon which everything else is implemented.
A very difficult beast to tame
Backolon’s extreme reliance on pattern-matching makes its implementation pretty easy to understand in concept. At least on surface level. When you really look at the internals, you’ll find an extremely cursed and tangled web of interconnected modules that are so tightly bound to each other that fixing one thing will almost certainly break something else.
As an example of how cursed that pattern-matching is, I recently tried, and failed, to add in-place augmented assignment operators. An augmented assignment operator is effectively a transformation that takes x $= y and rewrites it into x = x $ y, for all binary operators $ – with the caveat that if x is a compound expression such as (expensive function)->baz, the (expensive function) is evaluated once and only its return value is indexed twice (once to read the initial value that gets combined with y, and once to assign the result).
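To make the "evaluate once, index twice" caveat concrete, here is a minimal TypeScript sketch of that rewrite. The `Expr` node shapes, the `let` node, and the `__tmp` naming are all made up for illustration; this is not how Backolon represents or rewrites its AST.

```typescript
// Hypothetical mini-AST; Backolon's real node types are not shown in this post.
type Expr =
    | { kind: "name"; name: string }
    | { kind: "num"; value: number }
    | { kind: "call"; callee: string }
    | { kind: "index"; base: Expr; key: string }
    | { kind: "binary"; op: string; left: Expr; right: Expr }
    | { kind: "assign"; target: Expr; value: Expr }
    | { kind: "let"; name: string; value: Expr; body: Expr };

let tempCounter = 0;

// Rewrite `target op= value` into `target = target op value`. A compound base
// is hoisted into a temporary so that it is evaluated only once.
function desugarAugmented(target: Expr, op: string, value: Expr): Expr {
    if (target.kind === "index" && target.base.kind !== "name") {
        const temp = `__tmp${tempCounter++}`;
        const hoisted: Expr = { kind: "index", base: { kind: "name", name: temp }, key: target.key };
        return {
            kind: "let", name: temp, value: target.base,
            body: { kind: "assign", target: hoisted, value: { kind: "binary", op, left: hoisted, right: value } },
        };
    }
    return { kind: "assign", target, value: { kind: "binary", op, left: target, right: value } };
}

// Count how many call nodes appear in a tree, to check the base is not duplicated.
function countCalls(e: Expr): number {
    switch (e.kind) {
        case "call": return 1;
        case "index": return countCalls(e.base);
        case "binary": return countCalls(e.left) + countCalls(e.right);
        case "assign": return countCalls(e.target) + countCalls(e.value);
        case "let": return countCalls(e.value) + countCalls(e.body);
        default: return 0;
    }
}

// x += 1 becomes a plain assignment.
const simple = desugarAugmented({ kind: "name", name: "x" }, "+", { kind: "num", value: 1 });

// (expensive function)->baz += 1 hoists the call into a temporary first.
const compound = desugarAugmented(
    { kind: "index", base: { kind: "call", callee: "expensive" }, key: "baz" },
    "+",
    { kind: "num", value: 1 },
);
```

In the compound case the call shows up exactly once in the output, even though the index expression is used on both sides of the assignment.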
As one might expect from other languages that have augmented assignment operators, augmented assignment always has the same precedence as normal assignment, and this is what made adding it tricky. Every syntactic pattern in Backolon has a precedence - x + y (addition) has a precedence of 4, and x = y (assignment) has a precedence of 12, so it comes after addition, which is the expected behavior. The problem arises when I add x += y to also have a precedence of 12, because in Backolon (at least currently) the + and = are actually two separate tokens, and because the alphanumeric wildcards in the patterns can match any token, this happens:
backolon:0> foo := 0
=> 0
backolon:1> foo += 1
repl://1:1:6: error: invalid name: "="
1 | foo += 1
  |      ^
What is going on here? Well, remember that x + y has a higher precedence than x += y. So Backolon tries to match x + y against the token sequence foo + = 1 first, and it actually matches - with x bound to the symbol foo and y bound to the operator =. (Whitespace in pattern definitions translates to matching zero or more whitespace in the actual input.) This gets rewritten into (__add x =) 1, which obviously doesn’t make much sense since = isn’t a valid variable name (as the error message says). But now that it’s rewritten, there’s no possibility of x += y ever having a chance to match. So we get that error.
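The failure is easy to reproduce in miniature. The sketch below is a toy matcher in TypeScript, not Backolon's: tokens are plain strings with whitespace already dropped, single lowercase letters in a pattern are wildcards, and the rule names are invented. Only the precedence numbers 4 and 12 come from the description above.

```typescript
// Hypothetical simplification: tokens are plain strings, whitespace already dropped.
type Rule = { name: string; precedence: number; pattern: string[] };

// Single lowercase letters are wildcards that match any one token (the
// behaviour that causes the bug); everything else must match literally.
const isWildcard = (p: string): boolean => /^[a-z]$/.test(p);

function matchAt(tokens: string[], start: number, pattern: string[]): Record<string, string> | null {
    if (start + pattern.length > tokens.length) return null;
    const bindings: Record<string, string> = {};
    for (let i = 0; i < pattern.length; i++) {
        const p = pattern[i]!;
        const t = tokens[start + i]!;
        if (isWildcard(p)) bindings[p] = t;
        else if (p !== t) return null;
    }
    return bindings;
}

// Try rules in precedence order (lower number first) and return the first hit.
function firstMatch(tokens: string[], rules: Rule[]): { rule: string; bindings: Record<string, string> } | null {
    const ordered = [...rules].sort((a, b) => a.precedence - b.precedence);
    for (const rule of ordered) {
        for (let start = 0; start < tokens.length; start++) {
            const bindings = matchAt(tokens, start, rule.pattern);
            if (bindings) return { rule: rule.name, bindings };
        }
    }
    return null;
}

const rules: Rule[] = [
    { name: "add", precedence: 4, pattern: ["x", "+", "y"] },
    { name: "assign", precedence: 12, pattern: ["x", "=", "y"] },
    { name: "add-assign", precedence: 12, pattern: ["x", "+", "=", "y"] },
];

// "foo += 1" tokenizes as four tokens because + and = are separate.
const result = firstMatch(["foo", "+", "=", "1"], rules);
// result.rule is "add", with x bound to "foo" and y bound to "="
```

Because the addition rule is tried first and its wildcard happily accepts the = token, the add-assign rule never gets a look at the input.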
This could be solved easily if I made += be its own token in the same way that foo isn’t three tokens, but when I wrote the tokenizer I had the pipe dream that I would be able to write patterns to match any operator followed by = and thus I could write one pattern for all compound assignment operators and it would “just work”. However, I forgot about >= and <= which are not compound assignment, and so precedence would be a tricky problem either way. I tried adding compound…
Documentation is (not) easy
I eventually got fed up with adding compound operators and procrastinated fixing that by working on Backolon’s website (https://backolon.js.org). It’s quite primitive right now, and I suck at website design in general so I vibe-coded the majority of the layout and styling of the landing page and then edited in my own text. The documentation page (https://backolon.js.org/docs/) was a bit more of a challenge - I wanted the source for the documentation of Backolon’s built-in syntax, functions, and variables, to be written next to their actual implementation in Backolon’s source code, but I couldn’t figure out how to do that for a while.
I was already using Typedoc for documenting Backolon’s Javascript API, and I found it has a JSON output. I figured that I would just be able to add some custom tags, write the Backolon documentation in the JSDoc comments near the implementation, postprocess the JSON, and then load the postprocessing output along with a little script that displays it in HTML.
As I should have foreseen, it wasn’t that easy. The first problem I encountered is that by default, Typedoc only keeps documentation comments that are attached to something that would make sense to document in a Javascript package - a variable export, a function declaration, an enum, a class, etc. The Backolon builtins are registered using a function call - and so Typedoc completely ignores those comments. I had to write a fairly large Typedoc plugin to re-walk the files’ ASTs and extract all of the comments, and then create dummy exports in order to attach them to something. And to add to the plugin’s cursedness, I even had to directly import Typedoc’s internals from my node_modules folder (usually a no-no) in order to parse the comment. Typedoc didn’t export those functions since they didn’t ever expect Typedoc to be used (or abused) the way I am doing.
And that wasn’t even the end of it. The @param tag - which is used to document a parameter to a function - can accept a type annotation after it in curly brackets, and I decided this was nifty and I would use it for the Backolon doc comments as well. Except once I convinced Typedoc to parse those comments and save them in the JSON, I was suddenly finding the type annotations on the @param tag’s objects to be undefined. Not nonexistent - I console.log()‘ed the comment objects, and it clearly printed typeAnnotation: undefined. Why?
Well, it wasn’t exactly clear from Typedoc’s documentation, so I started digging around in Typedoc’s source code to find out, and it led me to this. It turned out I had to add @param to a whitelist of tags to have their type annotations preserved in the JSON. To be fair, I should have expected this. Typedoc is intended for Typescript codebases, where the type of the parameter is already present in the function declaration itself, and so a type annotation in the comment is completely unnecessary. However, with the way I am abusing Typedoc, there’s no function declaration with types to attach the comments to!
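For a sense of what that type annotation carries, here is a small TypeScript sketch that pulls @param tags, with their optional curly-bracket types, out of a comment body. It is a regex-based stand-in written for this post: the `ParamDoc` shape and `parseParams` function are hypothetical, and this is neither Typedoc's parser nor its JSON schema.

```typescript
// Hypothetical shape; this is not Typedoc's parser or its JSON schema.
interface ParamDoc {
    name: string;
    type?: string;
    description: string;
}

// Parse lines like `@param {number} x - the value` from a doc comment body.
// The type in curly brackets is optional, and so is the dash before the description.
function parseParams(comment: string): ParamDoc[] {
    const params: ParamDoc[] = [];
    const re = /@param[ \t]+(?:\{([^}]*)\}[ \t]+)?(\S+)(?:[ \t]+-?[ \t]*(.*))?/g;
    for (const m of comment.matchAll(re)) {
        params.push({
            name: m[2]!,
            type: m[1], // undefined when no {type} annotation was written
            description: (m[3] ?? "").trim(),
        });
    }
    return params;
}

const docs = parseParams(`
 * Adds two numbers.
 * @param {number} x - the left operand
 * @param y the right operand
`);
// docs[0].type is "number"; docs[1].type is undefined
```

With no function declaration to fall back on, that `type` field is the only place the parameter's type exists, which is why losing it in the JSON mattered.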
Once I figured that out, it was pretty easy to put together the rest of the documentation page. After I realized that my Javascript that constructed the HTML for the page was literally just reading in a static JSON file and producing the same HTML each time, I moved that generation to the build step. I was even able to get rid of my stupid micro-framework vanilla for creating the HTML elements in the webpage - since it was happening (essentially) server-side now, I only needed to concatenate some strings (that just happen to be well-formed XML by the end of it) for the entirety of the documentation!
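The build-step version really can be that dumb. A minimal TypeScript sketch, assuming a made-up `DocEntry` shape (the real postprocessed JSON has more in it) and made-up markup:

```typescript
// Hypothetical entry shape; the real postprocessed JSON is not shown in the post.
interface DocEntry {
    name: string;
    summary: string;
}

// Escape the five characters that are special in XML/HTML text and attributes.
function escapeHtml(text: string): string {
    return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#39;");
}

// Concatenate one <section> per entry. No DOM and no framework, just strings.
function renderDocs(entries: DocEntry[]): string {
    const items = entries.map(e =>
        `<section id="${escapeHtml(e.name)}"><h2>${escapeHtml(e.name)}</h2><p>${escapeHtml(e.summary)}</p></section>`);
    return `<main>${items.join("")}</main>`;
}

const html = renderDocs([{ name: "__add", summary: "Adds x & y." }]);
// html is '<main><section id="__add"><h2>__add</h2><p>Adds x &amp; y.</p></section></main>'
```

The one thing string concatenation does not do for you is escaping, so every interpolated value goes through `escapeHtml` before it lands in the output.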
Did I do that?
Since the bundler I was using to build everything (ESBuild) couldn’t bundle directly from HTML files, or at least would have required a plugin and thus more technical debt to do so, I switched to Bun. Bun can handle HTML as an entrypoint, but otherwise tries its best to be a drop-in ESBuild replacement. For the most part, that was true. But Murphy’s law struck again, and I discovered not one, not two, but three fairly major bugs in Bun. Not in Backolon. In Bun. And one of them even involved Bun segfaulting.
- When Backolon starts up, it needs to load and execute the builtins file that contains the builtins defined in Backolon itself. In the compiled and minified bundle, it does this by embedding the pre-parsed AST in a heavily compressed and serialized format, and then resurrecting it and running it from there.
This is handled by a Bun bundler plugin, but when running the unit tests, Bun doesn’t know anything about that plugin and can’t load it. So I use a build-time constant to dead-code eliminate one of two branches - in the branch that activates in the bundle build, I just import the file (which activates the plugin) and I get the AST straight away; in the test-only branch, I read the file’s contents explicitly, using await Bun.file(...).text(), and then parse it. As you might notice, that involves a top-level await. ES6 modules actually have no problem with top-level await, but I am using jsfuzz to fuzz-test Backolon, and jsfuzz only accepts files to be fuzzed in CommonJS format, due to limitations in how it instruments the code to determine coverage (which it tries to maximize). CommonJS does not allow top-level await, but ESBuild had no problem dead-code eliminating it when I bundled Backolon in CommonJS format for the fuzzer. Bun refused to do that, despite its claims to be ESBuild compatible, and so I opened this bug and then added yet another cursed hack to work around it.
- Issue #29264: Bun segfaults if there’s both nonexistent and external modules in a build
This was mostly an accidental discovery when I shuffled some files around in the website folder after I initially got Bun working. VSCode tries its best to automatically update imports so that they still point to the same files after you reorganize, but if you start moving too many files around too fast like I was doing, it can’t keep up, and I wound up with an import that now pointed nowhere. I would have expected Bun to flag this error, but… it just crashed. And since there’s so many interconnected components in my cursed mess of build scripts for Backolon’s website, it was quite difficult to isolate the problem.
I tracked it down to a combination of the broken import and this kludge I used to forcibly prevent jquery.terminal from importing jQuery twice by marking jquery.terminal’s import as external. When the two things (nonexistent + external) combined, Bun got all mixed up and tried processing one after it had been deallocated and crashed.
I guess this one was fairly important, because of these three, it’s the only one that has had its associated PR merged so far.
- This one I found totally by accident. I was absentmindedly admiring the minified code output and then scrolled down to the part where I defined the core syntax of Backolon and found two absurdly long strings of 100 zeros that I didn’t write. Well, it turns out I did - just not in that form. I just arbitrarily set the precedence of lambdas and implicit blocks at ∓1 googol to give myself some headroom to insert other patterns in the precedence order, but even I was able to write them in the shortest format - 1e100 instead of a 1 followed by 100 zeros. ESBuild can do that too. But no, Bun wants it to be exact…
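Picking the shorter spelling of a number is not hard, which is what makes the last bug so silly. Here is a rough TypeScript sketch of the idea for non-negative integers. `shortestIntegerLiteral` is a name I made up, and this is not ESBuild's or Bun's actual algorithm, just the general technique of generating candidate spellings and keeping the shortest one that round-trips.

```typescript
// Return a short decimal literal that evaluates to exactly `n`.
// Sketch only: handles non-negative integers; not ESBuild's or Bun's algorithm.
function shortestIntegerLiteral(n: number): string {
    if (!Number.isInteger(n) || n < 0) {
        throw new RangeError("non-negative integers only");
    }
    const candidates = [
        String(n).replace("e+", "e"),         // default form: "1000", or "1e100" for huge values
        n.toExponential().replace("e+", "e"), // forced exponent form: "1e3", "1.234e3"
    ];
    return candidates
        .filter(c => Number(c) === n)         // keep only spellings that round-trip exactly
        .reduce((a, b) => (b.length < a.length ? b : a));
}

const googol = shortestIntegerLiteral(1e100);  // "1e100"
const thousand = shortestIntegerLiteral(1000); // "1e3"
const plain = shortestIntegerLiteral(1234);    // "1234"
```

On a tie the sketch keeps the plain form, so 100 stays "100" rather than becoming "1e2".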
Maybe I should have expected all of these problems, too, because Bun is almost entirely vibe coded now. And not just from a deluge of AI bros and OpenClaws eagerly wanting to fix bugs in Bun - no, Bun is now maintained by a vibe coding company. Great job ruining a good dev tool, guys…
At least I have a goal
Backolon is still in its fledgling stage, and while I’m going to be using it in Aelith, I still kind of want Backolon to be used elsewhere, by other people. I can’t make anyone do that, though.
What I can do is to make Backolon accessible to everyone. That’s why I made a website for Backolon where you can easily check the documentation (however terrible it is) and try out snippets in the online REPL. For discussing Backolon, I obtained a channel for it in the proglangdesign.net Discord server (which you may have to join explicitly before the channel link will work). There’s also the Github repository for filing bug reports and suggestions.
And it goes without saying that Backolon will probably end up highly tailored to be used in Aelith, since that’s its primary use case right now. Aelith also has a goal and game plan in mind, just like Backolon does, but they kind of depend on each other. So it wouldn’t be out of the question for Backolon discussion to occur in the main Aelith Discord server - if enough discussion occurs, I might create a dedicated channel for Backolon there as well. And if you’re just interested in Aelith itself, there’s never been a better time to join the server.
Aelith sits in a strange place in terms of software projects. It’s yet another thing that I’ve gotten myself into, but it’s something that I am not the only one interested in. All of my other projects are mostly tools that I can’t find any use for myself, but Aelith is a game, and it’s automatically useful because anyone can play it. That motivation has kept me working on it for close to two years now, and I hope I can continue to work on it for many more years into the future.
