Scripting a Spelunky 2 Speedrun | Part 3: Do You Want to Build a Language?

With an input simulation proof of concept working, I needed to think about how I was going to use it. Do I create a simple library, or do I go the extra mile(s)?

Scripting a Spelunky 2 Speedrun | Part 3: Do You Want to Build a Language?

The fastest way to get to something workable would have been to create a library with a bunch of utility functions that I could call from a Go program. But my goal was bigger than just me being able to script a Spelunky 2 run. Ideally I wanted to create a tool that would allow anyone with very little programming knowledge to script their own runs, and not just for Spelunky 2, but for any game! In order to achieve this, I had an idea: a custom scripting language. And thus, keyScripter was born.

In this post we're going to go into the design of the keyScripter, and have a look at how the compiler works. I've tried to keep everything fairly high-level, so as not to make the post too long. If you want to dig deeper, be sure to check out the keyScripter repo on GitHub!

Language Design

While designing a language it's important to think about what you want to achieve with it, and thus what it should (and shouldn't) be able to do. For the keyScripter its only goal is to send simulated inputs to the OS, so most of the functionality is designed around making that as straightforward as possible. Since it's meant for tool-assisted speedruns, timing is also very important. It's important to note that since the focus of the language is fairly specific, it doesn't have to be general-purpose. This is also why you'll see it doesn't have any loops or conditional statements - though these might be added in the future if there's a need for them. So what does it do?

Basically the whole language revolves around 2 concepts: calling built-in functions, and what I call "timestamp blocks". Besides this there are also some parts of the language less focused on functionality, and more on making the code easier to read and write, like defining variables and custom functions, and adding comments. Finally there's the ever-argued question: static type checking or not? Let's go into all of these concepts one by one.

Built-in Functions

Most functionality of the keyScripter comes from functions that are built into the language, which allow you to send various kinds of inputs to Windows, as seen in part 2 of this series. Besides that there are some functions to print some text, or wait a certain amount of time. We'll quickly go over the functions I used for the initial implementation.

vKeyDown, vKeyUp, and vKeyPress send virtual key events to Windows. They respectively send key down, up, and down followed by up (i.e. press) events. sleep waits a number of milliseconds before resuming the script. print outputs the given argument to stdout. With those functions in mind, we'll have a look at some examples.

A function call looks very similar to how you would run a program from the command line. The first word is the name of the function, then a whitespace, and then all arguments (separated by whitespaces). Since it only accepts a single function call per line, we don't need semicolons, brackets, or commas.

print "Pressing shift and a"
sleep 1000
vKeyDown 0x10
vKeyPress 0x41
vKeyUp 0x10
An example of calling functions.

Another important difference with other languages is that functions never return anything, and they don't have side effects, meaning everything is a pure function. While this may seem restrictive, it helps keep the language simple and easy to understand.

Timestamp Blocks

Soon after running some initial tests with the language, I realised that most of the events would need to be timed fairly precisely - at least to within one frame in game time. Using sleep calls does work, but it also means that changing the timing earlier in the script would affect all other timings. I wanted to have a way to fire events on specific times, without relying on previous sleep calls, so I added timestamps. Within a timestamps block, each line starts with a number: the amount of milliseconds since the start of the block. The rest of the line is a function call, which is to be executed at the given time.

timestamps {
    0       print "Run starting in\n3..."
    1000    print "2..."
    2000    print "1..."
    3000    print "Start!"
}
An example timestamp block.

The above example is functionally the same as:

print "Run starting in\n3..."
sleep 1000
print "2..."
sleep 1000
print "1..."
sleep 1000
print "Start!"
A functionally similar script, without using a timestamp block.

As you can see, the timestamp notation also helps reduce the amount of code required for the same functionality. Nice!

It's good to keep in mind that it's still not a perfect solution though. In a speedrun events that are done earlier will always have an influence on the timing of later events. But at least with the timestamps it's possible to slightly change the timing of events without having to change all subsequent code.

Reusability and Readability

The first thing that comes to mind when trying to make code more readable is comments. These are thrown away by the compiler, but help any humans reading the code to understand what's going on.

# Text after a "#" will be ignored, until the end of the line.
# This allows you to describe what the code is doing.
An example of comments.

Next, there's variables. These allow you to give certain values a name, like a for the number value 65 (0x41, the virtual key code for the letter a). These variables can then be used when calling functions, making the code way easier to understand because you don't have to know what the specific values used are. I can just say that the variable a is 65, and forget all about the virtual key codes.

a = 0x41
b = 0x42

vKeyPress a
vKeyPress b
An example of variables.

Custom functions weren't actually part of the first iteration of the language, but pretty soon after I started experimenting with it I realised they would come in very handy. They allow you to encapsulate bits of logic into a block of code, that's assigned to a variable. The syntax is similar to a lambda function as you would see in JavaScript, just without the arrow (=>) and commas. Calling a custom function works the same as for built-in functions.

vKeyHold = (key hold) {
    vKeyDown key
    sleep hold
    vKeyUp key
}

vKeyHold 0x41 1000
An example of a custom function.

An important thing to note here is that, just like built-in functions, custom functions are always pure. They can't return anything, or modify variables outside of their own scope. For example:

outer = "outer scope"
print outer

setVar = () {
    outer = "function scope!"
    print outer
}
setVar

print outer
An example of the function scope.

This will print:

outer scope
function scope!
outer scope
The output of the function scope example.

The assignment to outer doesn't get applied to the scope outside of the function. What's actually happening is that the variable get shadowed inside of the function.

Types

As you may have noticed in the examples above, there were no special keywords for declaring variables or type information. From that you might have guessed that the language would be dynamically typed, but you'd be wrong! I did want to have a type checking system, because having a runtime error occur because of a mistmatched type halfway through a script that's been running for a while is one of the most annoying things I could think of. On the other hand, I didn't like the idea of having to declare variable and parameter types everywhere either. So the solution was straightforward enough: type inference!

A duck typing, not to be confused with duck typing.

Type inference means that the type of an expression is automatically detected. In the keyScripter this can be either because it's assigned a certain value, or because it's passed to a function that expects a certain type. Let's look at some examples:

# strVal is inferred to be a string, because it's assigned a string value.
strVal = "string literal"

myFunc = (argOne) {
    # sleep expects a number, so argOne is inferred to be a number.
    sleep argOne
}

# This is not valid, because myFunc expects a number but strVal is a string.
myFunc strVal
An example of type inference.

Such a type inference system is definitely more work to implement in the compiler, but I feel it pays off in not having to declare types in the script. It also means you'll get an error on compile time - rather than halfway through the script - if you're using a variable in a way you're not supposed to.

The Full Picture

So with the core concepts in mind, here's an example of a sample script that uses all of those features:

# Define some virtual key codes.
shift = 0x10
a     = 0x41

# Hold a modifier while pressing a key.
modPress = (mod key) {
    vKeyDown mod
    vKeyPress key
    vKeyUp mod
}

# Count down for 3 seconds.
timestamps {
    0    print "3..."
    1000 print "2..."
    2000 print "1..."
    3000 print "Starting!"
}

# Press shift+a.
modPress shift a
A sample keyScripter script that uses all of the language's features.

That looks pretty neat if you ask me! Next, let's take a look at how a compiler actually works.

Anatomy of a Compiler

Before going in-depth on how a compiler works it's important to note that various compilers can have differences in how they work under the hood, but overall the high-level concepts are generally the same.

A high-level overview of the process a compiler goes through.

You start with a plain text code file as the input for a compiler. First this is turned into a list of tokens, which are then turned into a syntax tree. This part of the process is roughly the same for most compilers, but after you have a syntax tree it differs a bit more. Compiled languages (like Go or C) will turn this tree into machine code, interpreted languages (like JavaScript or Python) execute instructions directly, and transpiled languages (like TypeScript) will output code for a different language.

We'll go through each part of this process step by step, and break down how it works in keyScripter.

Turning Text into Tokens

The first step when compiling code is what's called a lexical analysis (or tokenization). This is the process of turning a string of text into a list of tokens, and is done by a lexer (also called a tokenizer).

The lexer is a dumb processor, in that it's only aware of what it's looking at right now. It's not aware of any context around the current bit of text. Unless it's given an invalid character, it'll happily continue to churn out tokens, regardless of if they're valid in that position in the source code.

For example, the following keyScripter script:

a = 0x41
vKeyPress a
print "Done!"
An example script for the lexer.

would be turned into the following tokens:

identifier    (a)
assign        (=)
literalHex    (0x41)
newline       (\n)
identifier    (vKeyPress)
identifier    (a)
newline       (\n)
identifier    (print)
literalString ("Done!")
An example token list output from the lexer.

Note how the lexer also doesn't know the difference between a function or a variable. It can only tell whether something is a valid identifier or not.

Parsing the Tokens

After we get our list of tokens from the lexer, it's the parser's turn. The parser analyzes the list of tokens, and checks to see if everything actually makes sense, turning it into a syntax tree. In keyScripter the tree is less of a tree, and more of a list of instructions, but the concept is the same.

The parser is more complex than the lexer, because it needs to match certain patterns of tokens to valid patterns in the language. For example, a line starting with an identifier can be a variable assignment or a function call. If it's an assignment, the value can be an integer or string literal, a function, or another variable. During each of those steps we also need to do a bunch of error checking and handling, because the last thing we want is useless error messages!

The tokens in the example above would be turned into the following instructions (note that this is a simplified view):

assignment("a", 65)
functionCall("vKeyPress", [variable("a")])
functionCall("print", ["Done!"])
A simplified view of an example instruction list output from the parser.

Semantic Analysis

After we have a syntax tree, the next step is semantic analysis. This is a process of checking whether the entire tree actually makes sense. For example: are all variables we're referencing actually defined, and do all types from values passed to function calls match the parameter types?

In the keyScripter this is not actually a separate process, but it's also done by the parser.

Executing the Instructions

Since keyScripter scripts are interpreted we don't have to worry about generating machine code or other source code. So after we tokenized our input, parsed the tokens, and analysed the tree, we're ready to execute the script! In keyScripter this is fairly straightforward, because all instructions adhere to a certain interface. We can simply call Execute() on each instruction, one by one.

Hold on, Is It Really That Simple?

Alright, you got me. The answer to that is a bit of yes and no. On a high level, yes, it really is that simple. The process of a compiler is not actually that complicated. I did leave a lot of details out of this post though, and in reality I spent a good amount of hours trying to get everything to work nicely.

Similarly the language design went through quite a few iterations, trying to find a nice balance between making it easy to write (and read) and making it easy to parse, while also adding functionality left and right.

The complexity of a compiler also greatly depends on the complexity of the language, especially in the parsing and semantic analysis. Since the keyScripter language is fairly straightforward and not made as a general purpose language, writing the compiler was also relatively easy. But the more features a language gets, the more complex the compiler will become.

The keyScripter in Action

Recall the end of the previous post, where I mentioned I was able to put down a bomb in Spelunky? Well, thanks to the keyScripter, I'm now able to do that in just a couple of lines of code. Note that here we use scKey functions (instead of vKey), because Spelunky 2 uses scan codes as input, not virtual key codes.

sleep 3000
scKeyDown 208
sleep 100 # Wait for the duck animation.
scKeyDown 46
sleep 20 # Ensure the bomb input is registered.
scKeyUp 46
scKeyUp 208
The keyScripter script for putting down a bomb in Spelunky.
I can now make a script to show me the game over screen. Worth it.

You Made It!

If you made it this far into the post, congrats! I hope it wasn't too dense. If you're interested in looking at the keyScripter (be it for the compiler, to use it for scripting, or just for fun), it's fully open source on GitHub:

LucaScorpion/keyScripter
Script keyboard and mouse inputs for Windows. Contribute to LucaScorpion/keyScripter development by creating an account on GitHub.

I guess I should also really answer the question: was it worth my time? This is a tough one, because objectively it cost me way more time to write the entire compiler than it would've cost me to just make an easy-to-use go library for simulating input. However, I do like that I now don't necessarily need the Go compiler to run a script. Besides, the initial goal was to make scripting these kinds of runs more accessible (even if only slightly), which I do really think I accomplished.

With that, I'll leave you with an XKCD comic I stumbled upon:

Source: https://xkcd.com/2309

Now that I know more about how compilers work, and I've succesfully written one, I feel it's a slipperly slope to creating silly esoteric languages...

In the next post we'll dive back into Spelunky 2, my plans for actually scripting a speedrun, and the progress so far!