libriscv is a C++20 RISC-V userspace emulator. It runs many kinds of RISC-V userspace programs. And DOOM, of course.
I just finished dynamic ELF support for libriscv, and with that I pressed the button on version 1.0, after many years of development!
libriscv can be used in many situations now, one of them being as a CLI for running RISC-V programs.
Command-line interface
The CLI is in the emulator folder. Running build.sh should be enough to get access to ./rvlinux, which can run both static and dynamic RISC-V ELF programs.
The CLI has some hidden options through environment variables:
DEBUG=1 ./rvlinux myprogram
With DEBUG=1 one can run a program step by step, instruction by instruction. It’s possible to set breakpoints and trap on reads and writes. If you run it with a RISC-V executable that has a visible main function, it will actually run to main and break there:
$ DEBUG=1 ./rvlinux ../binaries/STREAM/stream
*
* Entered main() @ 0x10118
*
>>> Breakpoint [0x10118] df010113 ADDI SP, SP-528 (0xF606D90)
If you don’t want that behavior, use FROM_START=1 DEBUG=1 ./rvlinux instead! This debugger is a wrapper on top of a riscv::Machine that you can modify for your own needs. See: https://github.com/fwsGonzo/libriscv/blob/master/lib/libriscv/debug.cpp
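To make the stepping and breakpoint idea concrete, here is a minimal sketch of a step loop that stops on breakpoints. ToyMachine, step() and run_until_break() are hypothetical names made up for this illustration and are not the libriscv API; the real debugger is the debug.cpp file linked above.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical toy machine: every "instruction" is 4 bytes long and only
// advances the program counter. This is NOT riscv::Machine.
struct ToyMachine {
    uint64_t pc = 0;        // current program counter
    uint64_t end = 0;       // address one past the last instruction
    uint64_t executed = 0;  // number of instructions stepped so far

    bool stopped() const { return pc >= end; }

    // Execute exactly one instruction.
    void step() {
        assert(!stopped());
        pc += 4;
        executed++;
    }
};

// Step one instruction at a time until the PC lands on a breakpoint or the
// program ends. Returns true if execution stopped on a breakpoint.
inline bool run_until_break(ToyMachine& m, const std::set<uint64_t>& breakpoints)
{
    while (!m.stopped()) {
        m.step();
        if (breakpoints.count(m.pc) != 0)
            return true;
    }
    return false;
}
```

Stepping before checking means that resuming from a breakpoint does not immediately hit the same breakpoint again, which is a design choice of this sketch.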
Debugging with GDB
Running the CLI with GDB=1 will open TCP port 2159 so that GDB can attach using target remote :2159. The exact sequence in GDB is pretty straightforward:
gdb-multiarch zig/http
target remote :2159
...
(gdb) break os.exit
Breakpoint 1 at 0x5f952: file /zig/lib/std/os.zig, line 666.
(gdb) c
Continuing.
Breakpoint 1, os.exit (status=0 '\000')
at /zig/lib/std/os.zig:666
666 linux.exit_group(status);
(gdb) bt
#0 os.exit (status=0 '\000') at /zig/lib/std/os.zig:666
#1 0x000000000005908e in start.posixCallMainAndExit ()
at /zig/lib/std/start.zig:425
#2 0x0000000000000000 in ?? ()
The commands shown above are normal GDB commands. Many languages embed pretty printers in the executable, allowing natural debugging in GDB.
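For the curious: GDB talks to a remote target over that TCP port using the GDB Remote Serial Protocol, where each packet is framed as $payload#xx and xx is the sum of the payload bytes modulo 256, written as two hex digits. Below is a small sketch of that framing. The function names are made up for this example and are not taken from libriscv, and escaping of special characters inside the payload is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Checksum as defined by the GDB Remote Serial Protocol:
// the sum of all payload bytes, modulo 256.
inline uint8_t rsp_checksum(const std::string& payload)
{
    unsigned sum = 0;
    for (unsigned char c : payload)
        sum += c;
    return static_cast<uint8_t>(sum & 0xFF);
}

// Frame a payload as "$payload#xx" (no escaping, illustration only).
inline std::string rsp_frame(const std::string& payload)
{
    char suffix[4]; // '#', two hex digits, terminating zero
    const int n = std::snprintf(suffix, sizeof(suffix), "#%02x",
        static_cast<unsigned>(rsp_checksum(payload)));
    assert(n == 3);
    (void)n;
    return "$" + payload + suffix;
}
```

With this, the reply "OK" is sent on the wire as $OK#9a.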
While the program was being executed under GDB, it fetched example.com via HTTPS:
$ GDB=1 ./rvlinux ../binaries/zig/http
GDB server is listening on localhost:2159
GDB is connected
<!doctype html>
<html>
<head>
<title>Example Domain</title>
I was using the HTTPS example from the Zig tests:
const std = @import("std");
var gpa = std.heap.GeneralPurposeAllocator(.{ .stack_trace_frames = 12 }){};
const alloc = gpa.allocator();
pub fn main() !void {
    defer _ = gpa.deinit();
    // our http client, this can make multiple requests (and is even threadsafe, although individual requests are not).
    var client = std.http.Client{
        .allocator = alloc,
    };
    // we can `catch unreachable` here because we can guarantee that this is a valid url.
    const uri = std.Uri.parse("https://example.com/") catch unreachable;
    // these are the headers we'll be sending to the server
    var headers = std.http.Headers{ .allocator = alloc };
    defer headers.deinit();
    try headers.append("accept", "*/*"); // tell the server we'll accept anything
    // make the connection and set up the request
    var req = try client.open(.GET, uri, headers, .{});
    defer req.deinit();
    // send the request and headers to the server.
    try req.send(.{});
    try req.wait();
    // read the content-type header from the server, or default to text/plain
    const content_type = req.response.headers.getFirstValue("content-type") orelse "text/plain";
    _ = content_type;
    // read the entire response body
    const body = req.reader().readAllAlloc(alloc, 1024 * 1024) catch unreachable;
    defer alloc.free(body);
    std.debug.print("{s}", .{body});
}
Compiled with:
zig build-exe -O ReleaseSafe -target riscv64-linux http.zig
That’s a nice one-liner! Funnily, the program leaks memory at the end, both natively and on RISC-V. I thought I had a bug at first!
Windows support using Clang-cl
With Clang-cl I was able to build the libriscv emulator on Windows, and run quite a few programs, including Go RISC-V programs.
libriscv is a CMake project, so it made sense to use the CMake features of Visual Studio to do this.
To support the full RISC-V B-extension on Windows I had to upgrade to C++20, largely due to the <bit> header. It’s fine though, C++20 is everywhere now. Unless you’re paying for extended support.
Not too surprising, but a data point nonetheless: it performed about as well on Windows as on Linux, and I was running in a VM!
Introductory C API
There is now a C API that can be used to more easily integrate libriscv into projects using other languages. The API header is here. It uses CMake to build a new library called libriscv_capi, which embeds libriscv into itself and presents a public C header.
There is a test project that uses the C API as part of a CI job.
RISCVOptions options;
libriscv_set_defaults(&options);
options.max_memory = 4ULL << 30; // 4 GiB
/* A RISC-V machine */
RISCVMachine *m = libriscv_new(buffer, size, &options);
if (!m) {
    fprintf(stderr, "Failed to initialize the RISC-V machine!\n");
    exit(1);
}
/* Program execution */
libriscv_run(m, UINT64_MAX);
libriscv_delete(m);
The C API will become more fleshed out over time, but for now it does the job and it has the simplicity that one expects. If you check out the example project you will find callbacks for error handling, stdout output and so on.
add_subdirectory(.. riscv)
add_executable(test test.c)
target_link_libraries(test riscv_capi)
The test project is also CMake, and works well in-tree, but it lacks installation procedures. It’s possible to add the C API into libriscv itself, but then I believe projects using CXX only will start failing.
Dynamic ELF loading
Running dynamically linked executables has always been part of the plan, but I never wanted to write a dynamic loader. Thankfully, I found a way to support ld-linux.so (for RISC-V) instead: I implemented many advanced mmap features and focused on supporting dependency-less dynamic executables, which ld-linux.so is. This would also support static PIEs, which are becoming more common.
In fact, the solution is pretty stupid. I check if the ELF is dynamic, then swap out the loaded program with the dynamic linker .so (loading it from the filesystem), and add the real program as an argument to the dynamic linker!
if (binary[EI_CLASS] == ELFCLASS64 && is_dynamic) {
    // Load the dynamic linker shared object
    binary = load_file(DYNAMIC_LINKER);
    // Insert real program name as argv[1]
    args.insert(args.begin() + 1, args.at(0));
    // Set dynamic linker path to argv[0]
    args.at(0) = DYNAMIC_LINKER;
}
This, it turns out, works really well! The dynamic linker will do all the hard work, while I sit back and do nothing. If you’re interested in this solution for other projects, keep in mind that the dynamic linker will use file-based memory map calls to load the program and its dependencies.
I decided against emulating the file-based aspect, and instead I just copy the file contents straight into guest memory in order to avoid a lot of state-keeping. It wasn’t out of fear for sandbox integrity, rather just to avoid having to manage real memory mappings.
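A rough sketch of what copying instead of mapping can look like is shown below. It assumes guest memory is one flat byte buffer, which is a simplification and not how libriscv manages memory, and map_file_by_copy is a hypothetical helper name. The requested range is filled from the file, and whatever lies past the end of the file is zeroed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Emulate a file-backed mapping by copying: file bytes [offset, offset + length)
// are copied to guest address dst, and anything past the end of the file is
// zero-filled. Returns the number of bytes that came from the file.
inline size_t map_file_by_copy(std::vector<uint8_t>& guest_memory, size_t dst,
    const std::vector<uint8_t>& file, size_t offset, size_t length)
{
    // The destination range must lie inside guest memory.
    assert(dst <= guest_memory.size() && length <= guest_memory.size() - dst);

    const size_t available = (offset < file.size()) ? file.size() - offset : 0;
    const size_t to_copy = std::min(length, available);
    if (to_copy > 0)
        std::memcpy(guest_memory.data() + dst, file.data() + offset, to_copy);
    // Zero the remainder so the guest never sees stale data there.
    if (length > to_copy)
        std::memset(guest_memory.data() + dst + to_copy, 0, length - to_copy);
    return to_copy;
}
```

The trade-off is that later changes to the file are not visible to the guest, but there is no host-side mapping to keep track of.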
As a bonus, this method should work on Windows too. So yes, you can now in theory run dynamically linked RISC-V binaries on Windows, but you will need to get the dependencies (including the dynamic linker) with the binary.
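The snippet earlier in this section does not show how is_dynamic is computed. One common way to decide whether an ELF needs a dynamic linker is to look for a PT_INTERP program header, and the sketch below does that for little-endian 64-bit ELF files. This is an assumption about how such a check could be done, not necessarily what libriscv does, and elf64_wants_interpreter is a made-up name. The field offsets come from the ELF64 header layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Values from the ELF specification (named differently from the <elf.h>
// macros to avoid clashes).
constexpr size_t   ELF_CLASS_INDEX = 4;  // index of the class byte in e_ident
constexpr uint8_t  ELF_CLASS_64    = 2;  // 64-bit objects
constexpr uint32_t ELF_PT_INTERP   = 3;  // program header naming the interpreter

// Read a little-endian integer of n bytes at the given offset.
inline uint64_t read_le(const std::vector<uint8_t>& b, size_t off, size_t n)
{
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++)
        v |= static_cast<uint64_t>(b[off + i]) << (8 * i);
    return v;
}

// True if the image is a 64-bit ELF with a PT_INTERP program header,
// meaning that it asks for a dynamic linker.
inline bool elf64_wants_interpreter(const std::vector<uint8_t>& b)
{
    if (b.size() < 64) return false; // smaller than an ELF64 header
    if (b[0] != 0x7F || b[1] != 'E' || b[2] != 'L' || b[3] != 'F') return false;
    if (b[ELF_CLASS_INDEX] != ELF_CLASS_64) return false;

    const uint64_t phoff     = read_le(b, 0x20, 8); // e_phoff
    const uint64_t phentsize = read_le(b, 0x36, 2); // e_phentsize
    const uint64_t phnum     = read_le(b, 0x38, 2); // e_phnum
    if (phoff > b.size()) return false;

    for (uint64_t i = 0; i < phnum; i++) {
        const uint64_t off = phoff + i * phentsize;
        if (off > b.size() || b.size() - off < 4) return false; // truncated
        if (read_le(b, off, 4) == ELF_PT_INTERP) return true;   // p_type
    }
    return false;
}
```

Note that the dynamic linker itself and static PIEs have no PT_INTERP header, so this check does not flag them.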
Embedding with CMake
Embedding libriscv in a CMake project should be really straightforward. I support both FetchContent and add_subdirectory().
include(FetchContent)
FetchContent_Declare(libriscv
    GIT_REPOSITORY https://github.com/fwsGonzo/libriscv
    GIT_TAG master
)
FetchContent_MakeAvailable(libriscv)
add_executable(example example.cpp)
target_link_libraries(example riscv)
See the example project that embeds the libriscv master branch from git. I have always liked this way of quickly testing a library.
Debian Packaging
There is an attempt at Debian (and Ubuntu) packaging by using CPack in the root CMakeLists.txt. I made sure to write the current CMake configuration to a generated header file, which is then packaged with the compiled library. I also created a .pc file in which I ended up hard-coding the install paths, so I think I need some help in doing that correctly, especially for other distributions.
So far though, the generated .deb file seems to be working really well, and there is a CI job that tests this by installing it and referring to it using a simple CMake project:
find_package(PkgConfig REQUIRED)
pkg_check_modules(libriscv REQUIRED IMPORTED_TARGET libriscv)
add_executable(example example.cpp)
target_link_libraries(example PkgConfig::libriscv)
I hope I did that bit correctly, as it looks pretty nice! The example project can be found here.
There is also the hope that CPack can be used to generate packages for other distros as well.
Lowest latency
The most well-trodden path is still running static executables for low-latency emulation, as I am using libriscv as a scripting backend for a game I am making. Performance should be quite good for an interpreter right now, and I am quite happy with where things are. Hence v1.0!
So, if anything, expect improvements and cool solutions that target latency, and not necessarily performance!
Ever called a script function a billion times? If the answer is yes, then you know how painful it is to use sandboxes that have some kind of hidden cost (e.g. signals, stack switching, even timers) before being able to enter the code. Completely hopeless when it matters. Calling a guest function (a VM call) in libriscv is routinely ~4ns in practice, and pause/resume (the equivalent of entering and leaving a coroutine) is ~3ns. In my last blog post I measured a practical VM function call (one that is actually used in the game), with around 17 instructions, to run in 8–12ns in a concurrent part of the engine. What I mean by that is that I had to select a thread-local emulator and then invoke the function to (safely) manipulate some engine state.
Among sandboxes, it is probably the lowest latency by a large margin. With sufficiently small script functions, it is a competitive feature. Even natively running coroutines, which are the fastest we can imagine, will use 10ns to context switch. However, this is also about sandboxing, and there is no competitive low-latency sandbox that I know of.
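If you want to measure call latency in your own project, a harness along these lines is one way to start. It is a sketch and not how the numbers above were produced: ToyEmulator is a hypothetical stand-in for a per-thread emulator instance, thread_emulator() shows the thread-local selection step, and average_ns_per_call() averages the wall-clock time over many calls.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for a per-thread emulator instance.
struct ToyEmulator {
    uint64_t state = 0;
    // Stand-in for a short guest function that updates some state.
    uint64_t call(uint64_t arg) { state += arg; return state; }
};

// One instance per thread, created the first time that thread asks for it.
inline ToyEmulator& thread_emulator()
{
    thread_local ToyEmulator emu;
    return emu;
}

// Average wall-clock nanoseconds per call over the given number of calls.
template <typename F>
double average_ns_per_call(F&& fn, uint64_t iterations)
{
    assert(iterations > 0);
    const auto t0 = std::chrono::steady_clock::now();
    for (uint64_t i = 0; i < iterations; i++)
        fn(i);
    const auto t1 = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = t1 - t0;
    return elapsed.count() / static_cast<double>(iterations);
}
```

A call such as average_ns_per_call([](uint64_t i) { thread_emulator().call(i); }, 1000000) returns the average in nanoseconds. With a toy body like this the compiler may optimize most of the work away, so treat the result as a smoke test of the harness and put a real guest call in its place for meaningful numbers.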
Anyways, that’s it for me. Thanks for reading.
-gonzo