An Introduction to Low-Latency Scripting for Game Engines, Part 2

fwsGonzo
11 min read · May 22, 2024

--

Advanced examples, guest allocations, RPC

In the previous part we saw how to set everything up and make basic calls both into and out of the sandbox. Using the gamedev example from the examples folder of the libriscv repository, we could immediately run the presented code examples, provided that a RISC-V compiler was found.

In this second part we will go through some more advanced examples.

1. Advanced API design and the host-controlled heap

The guest heap is fully controlled from outside the sandbox. The guest allocates and frees memory by executing known system calls, one for allocation and one for freeing. There’s also calloc and realloc support.

We will create a light abstraction in order to show how this can be used. Just as an example we will create the concept of data attached to a location. Something like this:

struct LocationData {
    int x, y, z;
    std::unique_ptr<uint8_t[]> data = nullptr;
    std::size_t size = 0;
};

The meaning of the location is not important here, just that this is a relevant structure somewhere out there. There are a few things we should know before continuing:

  1. All heap allocations are 64-bit aligned. We control that.
  2. We want the ability to modify the data and commit it only once everything is completely done,
  3. … so we want a copy of the data in our script program.

For the minimal API we will need 2 functions:

// A function that retrieves the contents of a location (x, y, z),
// or returns (nullptr, 0) if the location is not found.
struct LocationGet {
    uint8_t* data;
    size_t size = 0;
};
DEFINE_DYNCALL(10, location_get, LocationGet(int, int, int));
// A function that commits the contents of a location
// It cannot return an error, instead it will throw an exception.
DEFINE_DYNCALL(11, location_commit, void(int, int, int, const void*, size_t));

The ABI says we can return a 2-element struct directly in registers, which is efficient. Hence, we will return data and size directly as a return value from the host. We will also try to keep x, y and z in registers by passing them all as arguments.
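A quick way to guard that assumption is a few compile-time checks next to the struct. This is only a sketch: the static_asserts are my own addition and not part of the libriscv API, and they rely on the RISC-V calling convention returning small aggregates of up to two registers in a0 and a1.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Same shape as the LocationGet struct in the script API above.
struct LocationGet {
    std::uint8_t* data;
    std::size_t   size = 0;
};

// Two pointer-sized members and nothing else: small enough to travel
// back in a register pair instead of through memory.
static_assert(sizeof(LocationGet) == 2 * sizeof(void*),
              "LocationGet must fit in two registers");
static_assert(offsetof(LocationGet, size) == sizeof(void*),
              "size must directly follow data");
static_assert(std::is_trivially_copyable_v<LocationGet>,
              "LocationGet must be plain data");
```

If someone later adds a third member, the build fails instead of the return value silently moving to memory.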

In order for this to be somewhat OK to use, we will create a simple class:

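#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <span>

struct LocationData {
    LocationData(int x, int y, int z)
        : x(x), y(y), z(z)
    {
        // Ask the host for a copy of the data at this location, if any
        auto res = location_get(x, y, z);
        if (res.data) {
            // Take ownership of the allocation the host made on our heap
            m_data.reset(res.data);
            m_size = res.size;
        }
    }
    // Overwrite the engine's data for this location with our copy
    void commit() {
        location_commit(x, y, z, m_data.get(), m_size);
    }

    bool empty() const noexcept {
        return m_data == nullptr || m_size == 0;
    }
    std::span<uint8_t> data() {
        return { m_data.get(), m_size };
    }
    // Replace our local copy; nothing reaches the engine until commit()
    void assign(const uint8_t* data, size_t size) {
        m_data = std::make_unique<uint8_t[]>(size);
        std::copy(data, data + size, m_data.get());
        m_size = size;
    }

    const int x, y, z;
private:
    std::unique_ptr<uint8_t[]> m_data = nullptr;
    std::size_t m_size = 0;
};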

This simple and straightforward class will manage our data for a given location. Constructing it will try to fill in the blanks, and if empty, we can assign data as we want. Committing the data will overwrite the engine’s data with what we have, regardless of size.

On the game engine side we can implement this very quickly:

struct Location {
    int x = 0, y = 0, z = 0;

    bool operator==(const Location& other) const {
        return x == other.x && y == other.y && z == other.z;
    }
};
namespace std {
    template<> struct hash<Location> {
        std::size_t operator()(const Location& loc) const {
            return std::hash<int>()(loc.x) ^ std::hash<int>()(loc.y) ^ std::hash<int>()(loc.z);
        }
    };
}
struct LocationData
{
    std::vector<uint8_t> data;
};
static std::unordered_map<Location, LocationData> locations;

And the callbacks for location_get and location_commit, in the game engine:

// This is the callback for sys_location_get
register_script_function(10, [](Script& script) {
    auto [x, y, z] = script.machine().sysargs<int, int, int>();
    // Brace-initialization works for this aggregate on every C++17 compiler
    auto it = locations.find(Location{x, y, z});
    if (it != locations.end()) {
        auto alloc = script.guest_alloc(it->second.data.size());
        script.machine().copy_to_guest(alloc, it->second.data.data(), it->second.data.size());
        script.machine().set_result(alloc, it->second.data.size());
    } else {
        script.machine().set_result(0, 0);
    }
});
// This is the callback for sys_location_commit
register_script_function(11, [](Script& script) {
    auto [x, y, z, data] = script.machine().sysargs<int, int, int, std::span<uint8_t>>();
    // This will create a new location or update an existing one
    auto& loc = locations[Location{x, y, z}];
    loc.data = std::vector<uint8_t>(data.begin(), data.end());
});

And now we need to stop and explain a bit. The callbacks in the game engine retrieve arguments from the calls made by the program, and then do something useful depending on which function got called. Imagine that the program does this:

LocationData loc(1, 2, 3);
if (!loc.empty()) {
    printf("Location (1, 2, 3) contains %zu bytes\n", loc.data().size());
    location_commit(1, 2, 3, loc.data().data(), loc.data().size());
} else {
    printf("LocationGet(1, 2, 3) was empty!\n");
}

std::vector<uint8_t> data = { 0x01, 0x02, 0x03, 0x04 };
loc.assign(data.data(), data.size());
loc.commit();

LocationData loc2(1, 2, 3);
if (!loc2.empty()) {
    printf("Location (1, 2, 3) contains %zu bytes\n", loc2.data().size());
    location_commit(1, 2, 3, loc2.data().data(), loc2.data().size());
} else {
    printf("LocationGet(1, 2, 3) was empty!\n");
}

And the output is as expected:

LocationGet(1, 2, 3) was empty!
Location (1, 2, 3) contains 4 bytes

So the program is first creating a LocationData at (1, 2, 3) and then checking if it’s empty. It was empty. Then it assigns from a 4-byte vector, commits that, creates a new LocationData at (1, 2, 3) and now we see that it contained 4 bytes.

This is the constructor of LocationData:

auto res = location_get(x, y, z);
if (res.data) {
    m_data.reset(res.data);
    m_size = res.size;
}

It asks the game engine to execute location_get, which we numbered callback 10. Callback 10 then executes, and it immediately retrieves the first 3 arguments as integers, making up x, y and z. Then it tries to find that in the locations map, and if it can’t find it, it will return (0, 0). You can imagine it as setting two registers to zero, one of which is a pointer on the other side, and the other an integer.

After that, LocationData becomes empty, and that is printed. Then we assign data to it and call commit(). Commit will call location_commit, which is callback 11. If you check the callback, it reads out 4 arguments: x, y, z and a span<uint8_t>. It’s a dynamic span so it consumes 2 registers, making the total 5, which matches the program’s definition: void(int, int, int, const void*, size_t). That is, one register for the pointer, and the second for the size of the data. The data is then copied into the map for that location.

Finally, we create a LocationData again for (1, 2, 3), and this time data is found. 4 bytes were stored and retrieved into the program, and the data and size were passed as registers in the 2-member return struct. In the callback on the host-side, there are more things going on now:

  1. A heap allocation for the script program is made using script.guest_alloc(bytes).
  2. We then copy our data into the address from the heap allocation using script.machine().copy_to_guest(…).
  3. Finally, we return the address and the size.
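The three steps above can be sketched without any emulator at all. ToyGuest below is a hypothetical toy with a bump allocator; its guest_alloc and copy_to_guest only borrow the names used in this article, and the real functions do considerably more.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical toy sandbox: flat memory whose heap is managed by the host.
class ToyGuest {
public:
    using gaddr_t = std::uint64_t;

    explicit ToyGuest(std::size_t size) : m_memory(size) {}

    // Step 1: the host picks the address (8-byte aligned bump allocation).
    gaddr_t guest_alloc(std::size_t bytes) {
        const std::size_t aligned = (bytes + 7) & ~std::size_t(7);
        if (aligned < bytes || m_next > m_memory.size()
            || aligned > m_memory.size() - m_next)
            throw std::bad_alloc();
        const gaddr_t addr = m_next;
        m_next += aligned;
        assert(m_next % 8 == 0);  // every allocation stays 64-bit aligned
        return addr;
    }
    // Step 2: copy host data into guest memory, refusing out-of-bounds writes.
    void copy_to_guest(gaddr_t dst, const void* src, std::size_t len) {
        if (dst > m_memory.size() || len > m_memory.size() - dst)
            throw std::out_of_range("guest write is out of bounds");
        if (len != 0)
            std::memcpy(m_memory.data() + dst, src, len);
    }
    const std::uint8_t* data() const { return m_memory.data(); }

private:
    std::vector<std::uint8_t> m_memory;
    std::size_t m_next = 8;  // address 0 stays free so it can mean "not found"
};

// Step 3: hand back (address, size), which the guest sees as a 2-member struct.
inline std::pair<ToyGuest::gaddr_t, std::size_t>
location_get_reply(ToyGuest& guest, const std::vector<std::uint8_t>& stored) {
    const auto addr = guest.guest_alloc(stored.size());
    guest.copy_to_guest(addr, stored.data(), stored.size());
    return { addr, stored.size() };
}
```

A bump allocator never frees anything, so it only illustrates who is in control of the addresses, not how a production heap behaves.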

And that concludes the first advanced scripting example.

One final note about this example API is that it is fairly abuse-resistant. If we expect antagonistic behavior, we should have put a limit on the number of locations that can be created, but other than that, the libriscv API does a good job preventing extreme values. For example, a huge span will immediately fail execution. Invalid memory reads or writes will fail execution. Attempts at stalling for too long will also fail execution. It bears mentioning that there have also been times where I’ve simply managed to muck things up badly enough that I ended up in an infinite loop, but it never affected my game. It runs for a bit, and then it fails by itself, and tells you where it stopped.

2. Remote procedure calls

Executing the same code in two separate places, on two separate systems and experiencing the same behavior is simply correct emulation. Further, if we assume the same program, we may even describe a function call, and have it behave the same way. So let’s try making remote procedure calls happen, but with two sandboxes running the same program, for testing purposes.

First off, we must define how we want to execute a function in another place. For example, we could copy all registers from one virtual machine to the other, and simply say that all arguments that fit in registers will be passed to the remote function. Then we execute the same function on the other side: the programs are the same, so the functions are the same.

Another method is the one we will be doing. Using fixed-size lambda capture, we can copy the things we care about (by value!) into a lambda function, transport that to the remote location and pass it as an argument to a trampoline. From there the trampoline can call the lambda with the capture, and voila. Remote procedure calls.

Let’s create a simple API like this:

int x = 42;
rpc([x] {
    printf("Hello from a remote virtual machine!\n");
    printf("x = %d\n", x);
    fflush(stdout);
});

In a real setting, the RPC function would have many modes, including the ability to broadcast it to nearby or region-wide players, for example. But this is good enough for our example. In order to implement this, we need a fixed-size capture function implementation. Using that, we can create a helper function that uses a callable as a trampoline, like so:

DEFINE_DYNCALL(12, remote_lambda, void(void(*)(void*), const void *, size_t));

static void rpc(riscv::Function<void()> func)
{
    remote_lambda(
        [](void* data) {
            auto func = reinterpret_cast<riscv::Function<void()>*>(data);
            (*func)();
        },
        &func, sizeof(func));
}

What’s happening here is that we call remote_lambda with an inline function that can be converted to a function pointer. It takes a pointer as its argument, converts it back to our fixed-size capture function, and then calls it. The second and third arguments are the function object and its size. So, 3 arguments: a function pointer, a Function<void()> pointer, and its size.
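If you have never seen a fixed-size capture function, the idea can be sketched in a few dozen lines. FixedFunction below is a hypothetical minimal version that I wrote for illustration and is not the riscv::Function implementation. It assumes the capture is plain data, which the common compilers report for lambdas that capture trivially copyable values, and it returns an int only so the result is easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical minimal fixed-size capture function; riscv::Function differs.
template <typename Sig, std::size_t Capacity = 24> class FixedFunction;

template <typename R, typename... Args, std::size_t Capacity>
class FixedFunction<R(Args...), Capacity> {
public:
    FixedFunction() = default;

    template <typename F>
    FixedFunction(F f) {
        static_assert(sizeof(F) <= Capacity, "capture does not fit");
        static_assert(alignof(F) <= alignof(std::max_align_t), "capture is over-aligned");
        // Only plain bytes survive being copied into another machine.
        static_assert(std::is_trivially_copyable_v<F>, "capture must be plain data");
        ::new (static_cast<void*>(m_storage)) F(std::move(f));
        m_invoke = [](const void* storage, Args... args) -> R {
            return (*static_cast<const F*>(storage))(std::forward<Args>(args)...);
        };
    }

    R operator()(Args... args) const {
        assert(m_invoke != nullptr);
        return m_invoke(m_storage, std::forward<Args>(args)...);
    }

private:
    R (*m_invoke)(const void*, Args...) = nullptr;
    alignas(std::max_align_t) unsigned char m_storage[Capacity];
};

using Callback = FixedFunction<int()>;

// The trampoline has no captures, so it converts to a plain function pointer.
inline int trampoline(void* data) {
    return (*static_cast<Callback*>(data))();
}

// Simulates shipping the capture bytes elsewhere and calling them there.
inline int call_remotely(int (*entry)(void*), const void* capture, std::size_t size) {
    Callback remote;  // the "other side" owns its own copy of the bytes
    assert(size == sizeof(remote));
    std::memcpy(static_cast<void*>(&remote), capture, size);
    return entry(&remote);
}
```

Because the object has no heap pointers inside, copying its bytes is all the transport there is. The function pointer stored inside only stays meaningful because both sides run the same program.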

On the host we read out the capture storage, and retrieve the address as an integer:

static Script::gaddr_t         remote_addr;
static std::array<uint8_t, 32> remote_capture;
...
register_script_function(12, [](Script& script) {
    auto [addr, capture] = script.machine().sysargs<Script::gaddr_t, std::array<uint8_t, 32>*>();

    remote_addr = addr;
    remote_capture = *capture;
});

Here we retrieve the function pointer address, and then a zero-copy pointer to a 32-byte std::array. Internally, a fixed-size 1-element span is created in order to verify that this memory is in-bounds. Then it’s converted to our array after alignment checks.

With this, we now have the ability to call the function in our program with the capture storage we retrieved. So, we create another Script instance to call the function:

auto script2 = script.clone("myscript2");

// Call the remote function, with the capture pushed to stack
script2.call(remote_addr, remote_capture);

Calling remote_addr with the capture storage pushed on the stack results in a successful remote procedure call:

Hello from a remote virtual machine!
x = 42

This shows that when two programs are the same, you need only a function address and capture storage to implement safe and reliable RPC. This is a feature that I use extensively in my game development.

3. Implementing callbacks

There are three ways to create callbacks into the script that you want to call later, based on some event.

First, simply looking up a function by name and calling it with agreed-upon arguments is the easiest, but also slightly error-prone:

myscript.call("my_function", 1, 2, 3, "four");

For this method you don’t need to do anything but implement the above function my_function as extern "C" in the script. As long as it is visible (has a symbol table entry), it can be called. What we don’t know is if the function actually takes those arguments, but this method is very easy to use from e.g. JSON.

The second method is to get a function pointer from the script itself, but that requires creating a callback-registration function. Like so:

DEFINE_DYNCALL(13, my_callback, void(const char*, void(*)(int)));

Just as an example we take a string first, to identify e.g. an entity in the game, then a function pointer. The function pointer will be called when the event happens and we will pass in an integer that is the ID of the entity. Implementing the handler on the host:

register_script_function(13, [](Script& script) {
    auto [name, func] = script.machine().sysargs<std::string, Script::gaddr_t>();

    // Find the entity by name
    auto& ent = entities.at(name);
    // Register some event handler for the entity
    ent.on_event(
        [func, &script] (auto& ent) {
            // Call the function with the entity ID as an argument
            script.call(func, ent.getID());
        });
});

Script::gaddr_t here is an unsigned integer the size of a pointer inside the sandbox. Of course, we prefer to use matching pointer sizes in order to avoid problems when passing structs (e.g. size_t differences).
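When pointer sizes cannot match, one way around the struct problem is to share only fixed-width fields. The sketch below assumes a 64-bit guest, and SharedBuffer is a hypothetical struct of my own, not something libriscv defines.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical struct shared between the host and a 64-bit guest.
struct SharedBuffer {
    std::uint64_t addr;  // guest pointer, stored as a plain integer
    std::uint64_t size;  // not size_t, whose width depends on the platform
};

// Both sides can verify the layout at compile time.
static_assert(sizeof(SharedBuffer) == 16, "unexpected struct size");
static_assert(offsetof(SharedBuffer, size) == 8, "unexpected field offset");
static_assert(std::is_standard_layout_v<SharedBuffer>, "layout must be predictable");
```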

So, with this callback on the host, once called it will find an entity, set its on_event callback to make a script function call with the provided function pointer from before, and then we pass the entity ID as argument.

Inside the script we can now create an event handler like this:

my_callback("entity1", [] (int id) {
    printf("Callback from entity %s\n", Entity{id}.getName().c_str());
});

For this example we pretend we have a wrapper for Entity that just holds the ID, and then with that we could for example get the entity name from that wrapper.

The third and final method for callbacks is the same as the above, but we use a Function<void()> in the script. We pass the capture storage to the event handler by copying it into the lambda on the host, and when the event is triggered we push the capture storage on the stack during the function call, similar to the RPC example. With that, you can have events with capture storage. Very handy!

Let’s finish this off by implementing that step by step. First, modify the definition in the script to take an extra const void*, size_t which will be the capture storage, and then modify the function pointer by appending a void* argument which is how we will take the capture storage back in again later:

DEFINE_DYNCALL(13, my_callback, void(const char*, void(*)(int, void*), const void*, size_t));

It got complex, but if you take it in two steps, know that the second step is always the same every time. You add const void*, size_t to the callable and append void* to the callback function pointer. In order to be able to use this callable we should create a helper function:

static void entity_on_event(const char* name, riscv::Function<void(int)> callback)
{
    my_callback(name,
        [] (int id, void* data) {
            auto callback = reinterpret_cast<riscv::Function<void(int)>*>(data);
            (*callback)(id);
        },
        &callback, sizeof(callback));
}

The helper function makes the complex call simple, by adding an intermediate function that casts the function back to itself, and calls it with the regular arguments. It could also pass back return values.

int x = 42;
entity_on_event("entity1",
    [x] (int id) {
        printf("Callback from entity %s\n", Entity{id}.getName().c_str());
        printf("x = %d\n", x);
    });

Just like in the RPC example we print x = 42. In order for this to happen we now need to expand the host-side too:

register_script_function(13, [](Script& script) {
    auto [name, func, capture] = script.machine().sysargs<std::string, Script::gaddr_t, std::array<uint8_t, 32>*>();

    // Find the entity by name
    auto& ent = entities.at(name);
    // Register some event handler for the entity
    ent.on_event(
        [func, &script, capture = *capture] (auto& ent) {
            // Call the function with the entity ID as an argument
            script.call(func, ent.getID(), capture);
        });
});

The only change now is that we copy the capture storage by value into the on_event lambda, and then place it as the last argument, where it will be pushed onto the stack. It is the void* argument. In this example I actually retrieve the array as a pointer, std::array<uint8_t, 32>*, which gives us a zero-copy pointer to that data. But we have to capture it by value.
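Why by value? In entity_on_event the callback is a by-value function parameter, so the zero-copy pointer refers to guest stack memory that gets reused once that call returns. The sketch below simulates this with plain host memory. ToyEntity and register_handler are hypothetical stand-ins, and the handler returns an int only so the outcome can be checked.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <utility>

using Capture = std::array<std::uint8_t, 32>;

// Hypothetical entity with a single event slot, standing in for the game's own.
struct ToyEntity {
    std::function<int()> handler;
    void on_event(std::function<int()> h) { handler = std::move(h); }
    int trigger() const { return handler(); }
};

// Registers a handler the way callback 13 does: `view` points into guest
// memory, so its contents are copied into the lambda before we return.
inline void register_handler(ToyEntity& ent, const Capture* view) {
    assert(view != nullptr);
    ent.on_event([capture = *view] {
        int x = 0;
        std::memcpy(&x, capture.data(), sizeof(x));  // decode the captured int
        return x;
    });
}

// Usage: the "guest" overwrites its stack after registering the handler.
inline int capture_example() {
    Capture guest_stack{};
    const int x = 42;
    std::memcpy(guest_stack.data(), &x, sizeof(x));

    ToyEntity ent;
    register_handler(ent, &guest_stack);
    guest_stack.fill(0xFF);  // the memory behind the pointer is reused
    return ent.trigger();    // still sees the value captured at registration
}
```

Had the lambda stored the pointer instead of the array, the handler would decode whatever the guest wrote there afterwards.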

Thanks for reading. I will go through more examples in later parts.

-gonzo
