magicfoodhand

Rigz is Live

#rust #rigz

TL;DR: Check out docs.rigz-lang.org or repl.rigz-lang.org to get started with it.

Rigz is by far the biggest personal project I've ever undertaken, and there is still a lot to do. It's over 10,000 lines of Rust spread across six crates (recently merged back into a monorepo). There is a lot to talk about, so strap in.

First Takeaway

If you're building a heavily coupled project, just start with a monorepo. Moving to one later without losing commits was annoying, and before the move I revved through a lot of VM versions (the VM was at 0.34.0 by the time the language reached 0.4.0). Once I get to 1.0 I might move away from the monorepo, but for now it's here to stay.

What is Rigz?

Rigz is a scripting language heavily inspired by Ruby, Kotlin, and Rust. Like a static language, it needs to do a lot of checks before the program can run (mainly because of the scope-based design, more on this later). I had three goals with this language:

  1. Learn a ton of Rust
  2. Create a language that works the way I want it to
  3. Use Rust-based modules for functionality instead of relying on FFI

Learning

The best way to learn is to do, so I needed a substantial project in Rust. I wanted to learn macros and understand how a project changes as it gets larger. As a consequence of how I wanted to build the VM, I also had to learn about RefCell. RefCell essentially moves borrow checking from compile time to runtime, which let me get around the "only one mutable reference" and "cannot borrow as immutable since it is borrowed as mutable" errors. Getting past both was crucial to making the VM behave the way I wanted. I'm sure there is another way I could've handled this, ideally at compile time, but this is the option I've chosen for now.
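To make the runtime borrow checking concrete, here is a minimal sketch using only the standard library. The Shared alias and both function names are made up for illustration and are not taken from the Rigz VM.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical stand-in for VM state that has more than one owner.
type Shared = Rc<RefCell<Vec<i64>>>;

/// Pushes through one handle and reads through another; both point at the same Vec.
fn push_and_sum(value: i64) -> i64 {
    let registers: Shared = Rc::new(RefCell::new(vec![1, 2]));
    let alias = Rc::clone(&registers);

    // The mutable borrow is checked at runtime and released at the end of this statement.
    alias.borrow_mut().push(value);

    // An immutable borrow is fine here because no mutable borrow is still alive.
    let sum: i64 = registers.borrow().iter().sum();
    sum
}

/// Returns true when a second mutable borrow is refused while the first is alive.
fn second_mut_borrow_is_refused() -> bool {
    let cell = RefCell::new(0_i64);
    let _first = cell.borrow_mut();
    // try_borrow_mut returns Err here; plain borrow_mut would panic instead.
    let refused = cell.try_borrow_mut().is_err();
    refused
}
```

The trade-off is that a borrow rule violation becomes a panic (or an Err from the try_ variants) at runtime rather than a compile error.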

My Ideal Language

Rigz isn't my ideal language yet, but it's getting closer. I wanted a language that could be used for Terraform-like runs (plan and apply stages, more on this in lifecycles), for policy as code (like Sentinel or Rego/OPA), and for programs that can be resumed after a timeout. I also knew the syntax that I wanted: it should feel like Ruby as much as it can while leveraging Kotlin's extension functions. It looks similar to Rust at times, but that's mostly because I wanted the shortest keywords that made sense in the situation. That means fn for functions, fn mut Type.foo for a mutable extension function, one line scope definitions combined with Kotlin's elvis operator ?: (for example, fn foo = none ?: "hello world"), and immutable variables by default following Ruby's syntax, with let to be explicit and just mut instead of Rust's let mut. The language is definitely quirky, but as it comes along I think it'll become clearer how it's intended to be used, and docs.rigz-lang.org will continue to be updated.

Resumable Programs

Currently there are two main ways to handle errors: exceptions and errors as values. What if there were a third option? It would be ideal if every error could be resolved (either through an automated process or manually) and the program could be resumed from that point on. Nothing drives me crazier than writing a long running script that fails on the last line, so I need to rerun the full thing or set up a debugger to help it along. To allow most of a program to run when it can, I've decided to use errors as values for Rigz, with the exception of one specific error, TimeoutError. There is still a lot of work to do here. The goal is that when I offer a hosted version of Rigz that times your function out after a set duration (free vs premium tiers), you'll be able to resume from the exact moment the timeout occurred (sometimes you may have to change your code to fix the error, and that will require starting from the beginning).
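As a rough illustration of the idea (not how Rigz implements it), a timeout can be modeled as a value that carries enough state to pick up where the run stopped. Checkpoint, run, and the step budget below are all hypothetical names, and the "program" is just a list of numbers to add.

```rust
/// Hypothetical snapshot of a paused run: where to continue and the state so far.
#[derive(Debug, Clone, Copy, PartialEq, Default)]
struct Checkpoint {
    next_step: usize,
    total: i64,
}

/// Adds up `steps` starting from `from`, executing at most `budget` steps.
/// Err(checkpoint) stands in for a timeout; pass it back in to resume.
fn run(steps: &[i64], from: Checkpoint, budget: usize) -> Result<i64, Checkpoint> {
    let mut state = from;
    let mut used = 0;
    while state.next_step < steps.len() {
        if used == budget {
            // Out of budget: hand the caller everything needed to continue later.
            return Err(state);
        }
        state.total += steps[state.next_step];
        state.next_step += 1;
        used += 1;
    }
    Ok(state.total)
}
```

Calling run again with the returned Checkpoint continues from the paused step instead of starting over, which is the behavior described above in miniature.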

Macros, Macros, and more Macros

One of my favorite things about software development is meta-programming; my simple definition will always be code that writes code. Rigz will support this eventually. I really like the way Ruby handles it, but Rust macros are nice too, so maybe there's a middle ground here. Liking macros has led to a ton of macro usage within the crates. I use them for parameterized testing in Rust, for generating From/Into definitions, and even the VM builder is backed by a macro, but my favorite has to be the ast_derive crate.
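For readers who haven't used macros this way, here is a small sketch of generating From implementations with macro_rules!. The Value enum and the impl_from macro are invented for this example; the real Rigz types and macros may look quite different.

```rust
// Hypothetical value type; the real Rigz Value enum is not shown in this post.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(i64),
    Bool(bool),
    Text(String),
}

// Generates `impl From<$ty> for Value` for each `type => Variant` pair.
macro_rules! impl_from {
    ($($ty:ty => $variant:ident),+ $(,)?) => {
        $(
            impl From<$ty> for Value {
                fn from(v: $ty) -> Self {
                    Value::$variant(v)
                }
            }
        )+
    };
}

impl_from! {
    i64 => Number,
    bool => Bool,
    String => Text,
}
```

Because the standard library provides a blanket Into for every From, each generated impl also makes calls like true.into() work wherever a Value is expected.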

Procedural Macros

Rust procedural macros let you do almost whatever you'd like, so ast_derive uses this to generate a Rust trait definition and parse your Rigz trait definition (Rigz traits are interfaces) at compile time. The obvious benefit is that the standard modules must be valid before I can run anything, but it also makes startup much faster. Originally I had to parse every module before I could start parsing the input. For a simple 2 + 2 program this added 20 microseconds on my machine, which isn't a ton, but considering module usage will only increase as time goes on this was untenable (and mlua is nowhere near that slow for a simple operation). There was also another unfortunate issue with modules: if your trait definition got out of sync with the Rust implementation, weird things would happen. Luckily both of these issues are solved with the macro. Let me show you how it works:

use rigz_ast::*;
use rigz_ast_derive::derive_module;

derive_module!(
    r#"trait JSON
        fn Any.to_json -> String!
        fn parse(input: String) -> Any!
    end"#
);

impl RigzJSON for JSONModule {
    fn any_to_json(&self, value: Value) -> Result<String, VMError> {
        match serde_json::to_string(&value) {
            Ok(s) => Ok(s),
            Err(e) => Err(VMError::RuntimeError(format!("Failed to write json - {e}"))),
        }
    }

    fn parse(&self, input: String) -> Result<Value, VMError> {
        match serde_json::from_str(input.as_str()) {
            Ok(v) => Ok(v),
            Err(e) => Err(VMError::RuntimeError(format!("Failed to parse json - {e}"))),
        }
    }
}

The derive macro generates a struct named JSONModule, a trait named RigzJSON that needs to be implemented, and an implementation of ParsedModule that includes the definition exactly as parsed. It also handles type conversion, error handling, variable arguments, and mutable extensions correctly. As time goes on I'm sure it will do more; the goal is that no one wants to implement Module manually and everyone just relies on this macro.

Scope based Register VM

If you've ever looked into creating your own language, or language VM, you know that most of them store the instructions in one list-like structure. Rigz is different: all of your instructions are stored in scopes. This gives some obvious benefits, like knowing when a value is no longer used and can be cleaned up, but it's also the main source of complexity. Every if/else expression requires two scopes (one for each branch). Every function has its own scope, and because Rigz supports polymorphic functions (like Java or Elixir, there can be multiple functions with the same name), the function that is meant to be called has to be figured out before the program can run.
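A heavily simplified sketch of the "instructions live in scopes" idea follows. Every name here is hypothetical, and it uses a single accumulator instead of registers to stay short, so it should not be read as the actual Rigz VM layout.

```rust
// All names are hypothetical; this is not the Rigz VM's real instruction set.
#[derive(Debug, Clone)]
enum Instruction {
    Load(i64),                              // replace the accumulator with a constant
    Add(i64),                               // add a constant to the accumulator
    IfElse { truthy: usize, falsy: usize }, // run one of two scopes, chosen by index
}

#[derive(Debug, Default)]
struct Scope {
    instructions: Vec<Instruction>,
}

#[derive(Debug, Default)]
struct Vm {
    scopes: Vec<Scope>,
}

impl Vm {
    /// Runs the scope at `index`, threading one accumulator value through it.
    fn run_scope(&self, index: usize, mut acc: i64) -> i64 {
        for instruction in &self.scopes[index].instructions {
            acc = match instruction {
                Instruction::Load(v) => *v,
                Instruction::Add(v) => acc + *v,
                Instruction::IfElse { truthy, falsy } => {
                    // Each branch is its own scope rather than a jump target.
                    let next = if acc != 0 { *truthy } else { *falsy };
                    self.run_scope(next, acc)
                }
            };
        }
        acc
    }
}

/// Builds a three-scope program: scope 0 branches into scope 1 or scope 2.
fn branch_example(condition: i64) -> i64 {
    let vm = Vm {
        scopes: vec![
            Scope {
                instructions: vec![
                    Instruction::Load(condition),
                    Instruction::IfElse { truthy: 1, falsy: 2 },
                ],
            },
            Scope { instructions: vec![Instruction::Load(10)] },
            Scope { instructions: vec![Instruction::Load(20), Instruction::Add(1)] },
        ],
    };
    vm.run_scope(0, 0)
}
```

Here branch_example(1) runs scope 1 and branch_example(0) runs scope 2. Since a branch is a scope index and not a jump offset, the set of scopes a program can reach is known before it runs, which is where both the benefits and the complexity come from.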

Takeaways

When building your own VM, I'd recommend starting with a stack-based VM that follows the traditional pattern. I'd be a bit further along if I had, but I really wanted a register-based VM. For another version of this idea, check out fn_vm.

Upcoming Features

There are a lot of things I plan to add but here's a small snapshot.

Multi File

Currently you're only allowed to have one Rigz file. This will be fixed soon by expanding imports, implementing exports, and creating a Rigz registry for sharing code.

Dynamic Modules

Currently there are five built-in modules (Rust code that extends Rigz through a trait definition): Std (which provides most functionality and is auto-imported), File, JSON, Log, and VM. docs.rigz-lang.org will be updated soon to show how to use them. I mentioned that I didn't want FFI to be required; well, that was sort of true. The core library won't use FFI at all (unless I decide to follow the namesake of the project; Rigz = Rust + Zig), but that shouldn't stop people from using it if they want it. Once the registry is created, later versions will allow you to depend on custom modules too.

Optimized VM Output

There are two aspects to this. First, a lot of move and load instructions are currently generated to handle values. Second, to fix an issue with recursion I had to move the registers to the CallFrame instead of storing them exclusively in the VM. The parser currently uses a simple last + 1 to determine the next register, but it could be optimized to use a set number of registers and push values to the stack if needed. Fewer instructions lead to a program that is simpler to debug and to a performance improvement, since most of the register values are references to other registers that need to be resolved once used.
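One possible shape for that optimization is sketched below, under the assumption of a fixed register count with released registers being reused and everything else spilling to the stack. Slot and PooledAllocator are hypothetical names and this is not the current Rigz parser.

```rust
/// Where a value ends up: in a register or spilled to the stack.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Slot {
    Register(usize),
    Stack(usize),
}

/// Hypothetical allocator that reuses released registers instead of always taking last + 1.
struct PooledAllocator {
    limit: usize,       // number of registers available
    next_fresh: usize,  // next register that has never been handed out
    free: Vec<usize>,   // registers released for reuse
    stack_depth: usize, // number of values spilled so far
}

impl PooledAllocator {
    fn new(limit: usize) -> Self {
        Self { limit, next_fresh: 0, free: Vec::new(), stack_depth: 0 }
    }

    /// Prefers a released register, then a fresh one, then spills to the stack.
    fn allocate(&mut self) -> Slot {
        if let Some(register) = self.free.pop() {
            Slot::Register(register)
        } else if self.next_fresh < self.limit {
            let register = self.next_fresh;
            self.next_fresh += 1;
            Slot::Register(register)
        } else {
            let position = self.stack_depth;
            self.stack_depth += 1;
            Slot::Stack(position)
        }
    }

    /// Marks a register as reusable; stack slots are ignored in this sketch.
    fn release(&mut self, slot: Slot) {
        if let Slot::Register(register) = slot {
            self.free.push(register);
        }
    }
}
```

Knowing when a register can be released is the hard part in practice; the scope-based design described earlier is what would supply that information.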

Bug Fixes

There is one glaring bug I've found with recursive functions, and my hope is that fixing the instruction output will mitigate it. Consider the following Rigz code for a Fibonacci function:

fn fib(n: Number) -> Number
  if n <= 1
    n
  else
    b = n - 2
    (fib n - 1) + fib b
  end
end

fib 6 # 8

Why do I need to assign b = n - 2 before the second recursive fib call? It appears that the (fib n - 1) call changes the value of n in the currently executing frame, since writing (fib n - 1) + (fib n - 1) gives the correct answer, which suggests n has already been decremented by the time the second call is evaluated. This will be fixed in a later version, but it's more important to me that the language starts moving instead of getting hung up on this.

There are almost certainly other bugs or unexpected edge cases in Rigz. The biggest one I'm sure of is around type matching at parse time while generating the VM instructions, but I haven't written enough Rigz to find it yet.

Developer Tooling

While there is a syntax highlighter written with tree-sitter, this mainly targets people using vim (or more likely Neovim) or the command line REPL. I'd like to add support for VSCode and IntelliJ, as well as markdown highlighting so I don't need to use Ruby's highlighter. Solving for VSCode will also likely solve the online REPL's lack of highlighting as I figure out how to incorporate Monaco. The two aspects that I'll be working on first are the LSP server and the debug command, then I'll circle back to improving the experience all around. Currently the main feature I'll need here is source maps, so that it's easy to map between what the VM knows about and the source code.

Lifecycles

Lifecycles are annotations that change the behavior of a program or extend functionality. Currently the only implemented lifecycle is @test, which will be expanded even more, but there are three other types that I want to support out of the box:

@on

The @on lifecycle will be for event handlers. The current debate is whether these are synchronous or async and how that will impact the rest of the VM. It's much simpler not to support async at all, but I'd prefer an actor model (message passing) for asynchronous code.

Custom Lifecycles Leveraging @after

This is how I will support plan and apply type runs. By default all functions are available across all lifecycles, but if you define the same function in a lifecycle it will use the one in that lifecycle. I'm still working through what I want the syntax to be here, but it will probably be something like the snippet below. Two details are still up in the air: command line arguments (I'm leaning towards Ruby or Python for this, but it might end up more bash-like) and built-ins for function arguments (like JavaScript's arguments variable).

@plan = @after @parse
@apply = @after @plan && ARGS[:option].is :apply

@plan
fn AWS.s3_buckets(prefix: String) -> [S3Bucket]
  ...
end

The goal is that the standard library is full featured, so you can write whatever you'd like without the feeling that a 3rd party library is required, although you'll have that option.

@memo

This is by far the simplest lifecycle to support: memoize the arguments and results to support dynamic programming out of the box. The open question is how configurable this cache should be.
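For anyone unfamiliar with memoization, this is roughly what such a lifecycle would do behind the scenes, written by hand in Rust. The Memo struct and its calls counter are illustrative only; nothing here reflects how @memo will actually be implemented or configured.

```rust
use std::collections::HashMap;

/// Hypothetical hand-rolled cache mapping an argument to its result.
struct Memo {
    cache: HashMap<u64, u64>,
    calls: usize, // how many times the function body actually ran
}

impl Memo {
    fn new() -> Self {
        Self { cache: HashMap::new(), calls: 0 }
    }

    /// Fibonacci with memoized results, so each n is computed only once.
    fn fib(&mut self, n: u64) -> u64 {
        if let Some(&hit) = self.cache.get(&n) {
            return hit;
        }
        self.calls += 1;
        let result = if n <= 1 {
            n
        } else {
            self.fib(n - 1) + self.fib(n - 2)
        };
        self.cache.insert(n, result);
        result
    }
}
```

With the cache in place, fib 6 runs the body once for each n from 0 to 6. The configuration questions show up even in this tiny version: the cache here is unbounded and lives as long as the Memo value does.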

Final Thoughts on Lifecycles

Lifecycles are the feature I'm most excited about in the language, but I'll need to write a lot more Rigz to figure out exactly how they work. Sometimes they're event handlers, other times state machines, and more; ultimately they will unlock the super powered features in Rigz.

Next Steps

I'm excited by where the language is going and where it's going to end up. This won't be just a hobby project anymore; I've got to see it through to the end. There are four immediate areas where this language will be used: Advent of Code (if this language can't do Advent of Code, I've still got work to do), Polc (my policy as code tool), Migrations (a tool for Active Record like database migrations without Ruby or Rails), and a hosted version offered on In a Pinch. Follow along on GitLab, GitHub, or rigz-lang.org (this is currently empty but provides links to useful resources). Happy coding!