magicfoodhand

Building a Language in 10 Days

challenge, rigz

By definition the challenge was a success, but it fell short of what I had planned. I streamed all 10 days and you can "use" rigz now, but I wouldn't recommend it. Let's talk through what went well, what didn't, and how I'd recommend doing the challenge yourself.

How to make a language

As I mentioned in Rigging Things Together, there are 3 main parts to an interpreted language: a lexer, a parser, and a runtime. I used a few resources before starting this challenge: rust-lang-dev, Crafting Interpreters, the Rust book, and Structure and Interpretation of Computer Programs. From the first link I decided to use logos to simplify my lexer since "Logos has two goals: To make it easy to create a Lexer, so you can focus on more complex problems and to make the generated Lexer faster than anything you'd write by hand". Well, sign me up. For the runtime I knew I wanted a register-based VM where each instruction is a variant of an enum: values are loaded into registers and every operation works on those registers. I think there are faster ways to do this, but it's what I wanted. The parser then acts as the glue between these two layers. It was relatively simple but suffered from my attitude of "I'll figure it out as I go".
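To make the register-based idea concrete, here is a minimal sketch of that kind of VM. The instruction names, the fixed count of 8 registers, and the `Vm` type are all assumptions made for illustration; none of this is the actual rigz VM.

```rust
// Hypothetical instruction set for illustration only; not rigz's real one.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Instruction {
    // Put a constant into a register.
    Load { register: usize, value: i64 },
    // registers[out] = registers[lhs] + registers[rhs]
    Add { lhs: usize, rhs: usize, out: usize },
    // registers[out] = registers[lhs] * registers[rhs]
    Mul { lhs: usize, rhs: usize, out: usize },
    // Stop and hand back the value held in a register.
    Halt { register: usize },
}

struct Vm {
    registers: [i64; 8], // assumed fixed register count
    pc: usize,           // index of the next instruction to run
}

impl Vm {
    fn new() -> Self {
        Vm { registers: [0; 8], pc: 0 }
    }

    // Runs until a Halt instruction and returns the halted register's value.
    // Returns None if the program runs off the end without halting.
    fn run(&mut self, program: &[Instruction]) -> Option<i64> {
        while let Some(&instruction) = program.get(self.pc) {
            self.pc += 1;
            match instruction {
                Instruction::Load { register, value } => {
                    self.registers[register] = value;
                }
                Instruction::Add { lhs, rhs, out } => {
                    self.registers[out] = self.registers[lhs] + self.registers[rhs];
                }
                Instruction::Mul { lhs, rhs, out } => {
                    self.registers[out] = self.registers[lhs] * self.registers[rhs];
                }
                Instruction::Halt { register } => {
                    return Some(self.registers[register]);
                }
            }
        }
        None
    }
}
```

Every operation reads and writes registers only, so all the parser has to do is pick register numbers and push enum variants onto a list.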

Rewriting Rigz

Of the 10 days I'd say half were well spent, and at least 3 were wasted chasing a specific implementation instead of something that worked. That implementation was the parser: a state machine that took the lexed tokens as input, with the valid elements of the language as its states (in the final version either a statement or an expression, but the first version tried to handle every valid element in the grammar, i.e. let, mut, function calls, definitions, etc.). The resulting parser is similar. I'd call it an inline AST parser (Abstract Syntax Node parser doesn't sound as nice) since it operates on specific elements and writes out VM instructions as soon as an element is valid. The state machine parser failed for two reasons. The first was a combinatorial explosion of possibilities: because I didn't have the states or tokens well defined, any change was a nightmare to think through and rewrite. The second is a variation of the first: I was viewing the problem incorrectly, so every state and token required hundreds of combinations (30 tokens and 10 states is already 300 pairs, before even peeking at the next token). The solution was to start breaking out components, handling values and identifiers specifically, but it was simpler to switch to a standard parser (the current parser only supports two element types, and those types handle the specifics underneath).
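As a rough illustration of "two element types that write out instructions as soon as an element is valid", here is a minimal sketch. The token set, the instruction set, and every name in it are hypothetical, and it only handles `let name = a + b` style input; it is not the rigz parser.

```rust
// Hypothetical tokens and instructions, chosen only to keep the sketch small.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Let,
    Ident(String),
    Equals,
    Number(i64),
    Plus,
}

#[derive(Debug, Clone, PartialEq)]
enum Instruction {
    Load { register: usize, value: i64 },
    Add { lhs: usize, rhs: usize, out: usize },
    Store { name: String, register: usize },
}

struct Parser {
    tokens: Vec<Token>,
    position: usize,
    next_register: usize,
    output: Vec<Instruction>,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        Parser { tokens, position: 0, next_register: 0, output: Vec::new() }
    }

    fn peek(&self) -> Option<&Token> {
        self.tokens.get(self.position)
    }

    fn next(&mut self) -> Option<Token> {
        let token = self.tokens.get(self.position).cloned();
        self.position += 1;
        token
    }

    fn fresh_register(&mut self) -> usize {
        let register = self.next_register;
        self.next_register += 1;
        register
    }

    // A value is valid as soon as we see a number, so emit its Load immediately.
    fn value(&mut self) -> Result<usize, String> {
        match self.next() {
            Some(Token::Number(value)) => {
                let register = self.fresh_register();
                self.output.push(Instruction::Load { register, value });
                Ok(register)
            }
            other => Err(format!("expected a number, found {:?}", other)),
        }
    }

    // Element type 1: an expression (a value, optionally one unchained `+`).
    fn expression(&mut self) -> Result<usize, String> {
        let lhs = self.value()?;
        if self.peek() == Some(&Token::Plus) {
            self.next();
            let rhs = self.value()?;
            let out = self.fresh_register();
            self.output.push(Instruction::Add { lhs, rhs, out });
            return Ok(out);
        }
        Ok(lhs)
    }

    // Element type 2: a statement (`let name = expression` or a bare expression).
    fn statement(&mut self) -> Result<(), String> {
        if self.peek() != Some(&Token::Let) {
            self.expression()?;
            return Ok(());
        }
        self.next();
        let name = match self.next() {
            Some(Token::Ident(name)) => name,
            other => return Err(format!("expected an identifier, found {:?}", other)),
        };
        match self.next() {
            Some(Token::Equals) => {}
            other => return Err(format!("expected '=', found {:?}", other)),
        }
        let register = self.expression()?;
        self.output.push(Instruction::Store { name, register });
        Ok(())
    }
}

// Parses every statement in order and returns the emitted instructions.
fn parse(tokens: Vec<Token>) -> Result<Vec<Instruction>, String> {
    let mut parser = Parser::new(tokens);
    while parser.peek().is_some() {
        parser.statement()?;
    }
    Ok(parser.output)
}
```

There is no tree to walk afterwards: each function either fails or has already pushed its instructions by the time it returns, which is the "inline" part.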

Takeaway: Get it working

When working on a tight time crunch the only thing that matters is that it works; it doesn't need to be optimal or even good code. If you're going to rewrite things later this is doubly true, but if you know a rewrite isn't going to happen, spend a bit more time planning up front (there's nothing wrong with throwing away prototypes, it's much better than shipping one). As always the trick in software is simplicity. I should've used the first working thing I could get, and the language could've gotten much further. Who cares if it's not the fastest parser ever written? As a side note, every time I got stuck I started a new project from scratch to test out various ideas. Once I figured it out I'd apply that logic on stream the next day, never planning to review the test code again. Try not to refer to the prototype; the act of writing it out is all you need. Hacking out a solution helps the project more than any amount of thinking or whiteboarding will.

Takeaway: Refactor

I know it's a 10 day challenge and there's only time to sprint towards the goal, but this week proved once again that "moving slow to move fast" is the best way to go. Had I refactored earlier, it would've been easier to keep the 3,500+ lines of code in my head and know what I needed to do next and what was missing, especially once I started to see some of the higher level patterns in the parser. I do know around 2,500 lines of it well, specifically the VM, but that last 1k is what kept me on my toes.

Missing Features

If you try to use rigz today, cargo install rigz && rigz run --main hello.rigz, you'll find that you can do basic binary operations (not chained) and print to stdout. How is this considered meeting the challenge? Because you can use it. Brendan Eich created a prototype of JS in 10 days; how much of that code do you think is still in use? I'd wager almost none of it, so I'm excited for where the language goes from here. My sights are set high: you will be able to use this language soon (polc, rigz-db, and rigz-scraper are all in the works once the language is stable). I'll have to polish the VM, simplify the parser, write a few modules for stdlib-like features, improve the CLI and error reporting (errors are values but parse errors hide the line), then finally work on documentation and embedding the runtime in a webpage. There's a lot of work to go, but it should go much faster from here on out.

Streaming

My favorite part of the challenge was interacting with people in chat; answering questions or just talking about programming was by far the highlight. However, streaming the challenge is the one thing I'd change if I did this again. On stream I mentioned a few times that context switching between focusing and chatting was difficult, and a timed challenge works against you here. It's difficult for the audience to jump into the middle of a coding project, and if you're staring at the screen thinking about the next change they might not stick around to hear your next explanation. The other downside is that I had to change my environment to work better on stream: a zoomed-in IDE and only one monitor for code so I could check OBS/chat on the other. I'm very excited to keep streaming, but I won't be streaming full projects anymore, just components, while I work on making a better Twitch stream and hopefully get to affiliate status.

I'm still an amateur here and have lots to learn, but originally I had viewed streaming code like a coding interview: I'd explain what I was working on and cruise through it. This only worked when I planned the changes the night before (the 5 good streams out of 10), and usually around the 2-3 hour mark I'd find I had missed a step or hadn't planned that far ahead. Lucky for me, most coding interviews are a bit simpler than "create a custom programming language using a VM in 4 hour chunks".

Stream Advice

Again, I have 10 days of experience here so I'm by no means the expert, but there were two things that had a direct influence on my viewers. The first, obvious one is a good title: you won't get anyone stopping by if the title isn't interesting. This wasn't a guarantee they'd interact (I'm still working through that piece), but it got them in the door. The second is that getting frustrated on stream pushed people out. Towards the end, when my new keyboard and my self-imposed use of Windows for streaming were wearing on me, it was hard not to remember that I could be way more productive in my ideal setting. But voicing that opinion, or getting mad at the borrow checker for the nth time because you missed a lifetime when you added that second parameter to the previously elided function, drives viewers away.

Rust Rant

There were two specific issues in Rust that kept slowing my progress, the first being un-eliding lifetime parameters to make the compiler happy. Here's a snippet showing the problem. I'd love to hear from rustaceans why this isn't the way that Rust works.

struct VM<'vm> {} // fields omitted

struct Foo<'vm> {
    field: &'vm str,
}

struct Bar<'vm> {
    field: &'vm str,
}

impl<'vm> VM<'vm> {
    // this compiles: with &mut self as the only input, the elided lifetime on Bar is taken from self
    fn foo(&mut self) -> Bar {
        //
    }

    // I want Foo and Bar to mean Foo<'vm> and Bar<'vm> here, but elision gives Foo a fresh
    // lifetime and ties Bar to &mut self, so returning anything borrowed from foo is a
    // compile error. If I leave a lifetime out, use the one from my impl.
    fn fubar(&mut self, foo: Foo) -> Bar {
        //
    }
}
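For comparison, here is a minimal sketch of the un-elided version that compiles. The `source` field and the method bodies are assumptions added so the example is complete; the point is only the explicit `'vm` on `fubar`.

```rust
// Hypothetical stand-in for the VM; the `source` field exists only so 'vm is used.
struct Vm<'vm> {
    source: &'vm str,
}

struct Foo<'vm> {
    field: &'vm str,
}

struct Bar<'vm> {
    field: &'vm str,
}

impl<'vm> Vm<'vm> {
    // Elided form: the returned Bar borrows for as long as `&mut self` does.
    fn foo(&mut self) -> Bar<'_> {
        Bar { field: self.source }
    }

    // Explicit form: both the argument and the result are tied to 'vm,
    // so the result can be built from `foo` and outlives the borrow of self.
    fn fubar(&mut self, foo: Foo<'vm>) -> Bar<'vm> {
        Bar { field: foo.field }
    }
}
```

Because the `Bar` returned by `fubar` is tied to `'vm` rather than to `&mut self`, the VM can be borrowed again while that value is still alive.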

The second issue is much more obvious: you cannot borrow something as mutable while it's borrowed as immutable, and vice versa. I disagree with the reasoning when it comes to nested fields, though. Another snippet.

struct VM<'vm> {
    current: CallFrame,
    scopes: Vec<Scope<'vm>>,
}

struct CallFrame {
    scope_id: usize,
    pc: usize, // default to 0
}

impl<'vm> VM<'vm> {
    // elided lifetimes should work here, but again they don't for the result in an Option
    fn instruction(&self) -> Option<&'vm Instruction<'vm>> {
        let pc = self.current.pc;
        match self.scopes.get(self.current.scope_id) {
            None => None,
            Some(s) => s.instructions.get(pc),
        }
    }

    fn run(&mut self) -> Result<(), Error> {
        let instruction = match self.instruction() {
            None => return Err(Error()),
            Some(s) => s, // immutable reference to self through current
        };
        self.current.pc += 1; // cannot borrow self as mutable
    }
}

Now as I'm reading this code it's quite obvious that borrowing the first as immutable and the second as mutable have absolutely no impact on each other (nothing runs async), but Rust can't prove that because the borrow goes through self. The workaround is to clone the instruction, releasing the immutable borrow so that I can perform the mutable operation on the pc. This is one thing I'll be revisiting, but it's a huge source of consternation with Rust; part of me wishes I had written the VM in Zig just to get around this issue. I'd love to know an escape hatch so I can tell Rust, no, I know what I'm doing even if you don't like it. The solution is probably moving the run instruction to CallFrame, but maybe I need a larger refactor.
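One possible direction, sketched under assumptions: a method call like `self.instruction()` borrows all of `self`, but inside a single function body the borrow checker tracks struct fields separately. Reading `self.scopes` and mutating `self.current` directly in the same function is accepted without a clone. The types below are simplified, hypothetical stand-ins (no lifetimes, a made-up `registers` field), not the real rigz types.

```rust
// Hypothetical, simplified stand-ins for the types in the snippet above.
#[derive(Debug, Clone, PartialEq)]
enum Instruction {
    Load(usize, i64), // (register, value)
    Halt,
}

struct Scope {
    instructions: Vec<Instruction>,
}

struct CallFrame {
    scope_id: usize,
    pc: usize,
}

struct VM {
    current: CallFrame,
    scopes: Vec<Scope>,
    registers: Vec<i64>, // assumed field, only here to give Load something to do
}

impl VM {
    // Executes one instruction; returns false on Halt or when nothing is left to run.
    fn step(&mut self) -> bool {
        // Access the fields directly instead of going through a `&self` helper.
        let scope = match self.scopes.get(self.current.scope_id) {
            Some(scope) => scope,
            None => return false,
        };
        let instruction = match scope.instructions.get(self.current.pc) {
            Some(instruction) => instruction,
            None => return false,
        };
        // `instruction` borrows only `self.scopes`, so mutating `self.current` is fine.
        self.current.pc += 1;
        match instruction {
            Instruction::Load(register, value) => {
                // `self.registers` is a third, disjoint field.
                self.registers[*register] = *value;
                true
            }
            Instruction::Halt => false,
        }
    }
}
```

The trade-off is that the lookup can't live in its own `&self` method. A free function that takes `&[Scope]` and `&CallFrame` as separate arguments would keep the helper while still borrowing only the fields it needs.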

Takeaway: Skip YouTube Streams

OBS makes it incredibly simple to stream to Twitch and YouTube at the same time, but I only streamed one day to YouTube and made the video private because music I was listening to got picked up on my mic. No one joined on the YouTube side, and checking on it threw a wrench in my workflow. There are two problems with YouTube streaming. The first is that if you don't have any YouTube content, how will anyone find you? The second is that the integration of a second stream into OBS isn't ideal: you'll need to click go live and end stream in YouTube, not just OBS. Additionally, there are two things anyone looking to stream on YouTube should know. First, you need to request the ability to stream 24 hours before your first stream; not a big deal but something to keep in mind. Second, if you want clickable links in your video description you'll need to verify your ID, so now Google has a copy of that for up to the next 2 years. If you want to get into streaming, build an audience before starting on YouTube; there are way more people on Twitch.

Replicating the 10 day challenge

Anyone can do the 10 day challenge; there are so many tools out there to make creating your language easy (ANTLR, yacc, bison, LLVM, and that langdev link has tons of options just for Rust). Unfortunately, making it easy makes it complicated: there are a lot of components you'll be learning if you want to take the "easy" path (obligatory Simple Made Easy mention; while I don't love Clojure, this is one of my favorite talks of all time).

Instead I'd focus on the simple path: read through Crafting Interpreters and use a language you know for the first attempt. Then set clear goals and milestones for your challenge. It doesn't matter if you hit them or not; your primary goal is to learn and see how far you can get. Failure teaches so much more than success, but I'm so excited to see other attempts at this challenge blow my language out of the water.

What's Next?

The goal for In a Pinch is to build tools to help developers and small businesses. I'm still deciding on what the first product will be, but odds are very high it will be a web scraping platform that uses rigz as the scripting language. Until a product is available, my main focus will be rigz off-stream. In the meantime please join me on stream as we build random stuff; I need viewers and followers to hit affiliate. The focus for the next two days is a chatbot in Elixir, then next week I'll start rewriting River Me This to be a daily game that isn't hosted as a Cloudflare Worker. Stealing inspiration from ThePrimeagen, I'm hoping to try Twitch plays River Me This vs OpenAI.