Two little interpreters

March 26, 2024

Late last year, I read a few blog posts that said something like “everyone knows that bytecode interpreters are faster than tree-walking interpreters”. And then I saw the paper AST vs. Bytecode: Interpreters in the Age of Meta-Compilation” when Stefan Marr shared a draft on Twitter.

I realized that although I’d written a number of tree-walking interpreters, I’d never actually built a bytecode interpreter before. So, I thought it would be a fun exercise to build two small interpreters in the same language and compare the performance.

The result is Pegboard: Two little interpreters. It’s around 600 lines of TypeScript code: ~180 for the tree-walking interpreter, ~370 for the bytecode interpreter, and a bit of shared code.

I won’t bury the lede: the bytecode interpreter is not faster! In fact, the tree-walking interpreter is about 2–3x faster on my benchmarks. Read on for more details.

The language: parsing expression grammars

Maybe surprising: the “language” I chose to implement isn’t a general-purpose purpose programming language. Instead, I wrote interpreters for parsing expression grammars, aka PEGs. The language of PEGs is almost like simple a programming language — but one whose expressions implicitly operate on some hidden data structures.

It even has something like function calls: rule applications. For example, in the grammar snippet below¹, there are two rules defined: hexDigit and hexIntegerLiteral.

  hexDigit = "0".."9" | "a".."f" | "A".."F"

  hexIntegerLiteral = "0x" hexDigit+
                    | "0X" hexDigit+

The hexDigit rule is defined on line 1, and used (or applied) on lines 3 and 4, in the body of hexIntegeralLiteral.

Rule applications can be arbitrarily deep, and — like function calls — mutually recursive. So an interpreter for PEGs needs to maintain a rule application stack, which is very much like a call stack.

One more detail: for simplicity, I’m not using a custom syntax (like the one above) to specify the grammars. Instead, I use parser combinators. So the rules above are actually defined like this:

{
  // ...
  hexDigit: choice(range("0", "9"), range("a", "f"), range("A", "F")),
  hexIntegerLiteral: choice(
    seq(_("0x"), seq(app("hexDigit"), rep(app("hexDigit")))),
    seq(_("0X"), seq(app("hexDigit"), rep(app("hexDigit")))),
  ),
  // ...
};

Wait, what?

Okay, I realize this could all be kind of confusing, since we don’t usually talk about interpreting grammars. Here’s another explanation:

You can think of a parsing expression grammar as a program, in a very limited programming language, that describes a parser.
You could compile that program to another language. Tools that do this are known as parser generators.
You can also interpret that program. That involves matching the grammar against a particular input string.
Compilers and interpreters typically convert a program from a source language to an abstract syntax tree (AST). With parser combinators, you skip that step and construct the AST directly.

Performance

To measure the performance of my two interpreters, I ported the ES5 grammar from Ohm to the parser combinator style. This allowed me to use some real-world examples: using each interpreter to parse the source code for jQuery, React, and Underscore.

My bytecode interpreter turned out to be slower!

As you can see, the tree-walking interpreter (or, “AST interp” here) is a bit more than 2x faster on V8, and more than 3x faster on JSC. Though I wouldn’t suggest drawing any big conclusions from these numbers.

If you’re interested, you can check out the benchmark source on GitHub.

Node (V8)

cpu: Apple M1
runtime: node v21.2.0 (arm64-darwin)

summary for jquery
  AST interp
   2.86x faster than bytecode interp (switch)

summary for react
  AST interp
   2.24x faster than bytecode interp (switch)

summary for underscore
  AST interp
   2.04x faster than bytecode interp (switch)

Bun (JSC)

cpu: Apple M1
runtime: bun 1.0.26 (arm64-darwin)

summary for jquery
  AST interp
   3.69x faster than bytecode interp (switch)

summary for react
  AST interp
   3.2x faster than bytecode interp (switch)

summary for underscore
  AST interp
   3.17x faster than bytecode interp (switch)

Again, I wouldn’t suggest drawing any big conclusions from this experiment. My main motivation was to get some experience building a traditional, switch-style bytecode interpreter. The fact that it turned out to be so much slower was an interesting surprise!

Some caveats:

There’s a big difference between interpreting a PEG, and interpreting a real programming language. And even though it’s a good real-world use case, the “program” here (the ES5 grammar) is less than 500 lines of code.
I didn’t spend much time trying to optimize the performance of either interpreter. I did use Deopt Explorer to fix some obvious performance problems, but didn’t go much deeper than that. It would be interesting to dig into why the bytecode interpreter is slower — but that’s a project for another day.

You can find the full source of both interpreters on GitHub. If you have any suggestions for improvements, or notice anything major that I’ve missed, please let me know!

Thanks to David Albert for pairing with me to finish up the bytecode interpreter, and to Kevin Lynagh for proofreading this post.

The example is taken from Ohm, the PEG-based parsing toolkit that I co-authored and maintain. Specifically, it’s from the ES5 grammar. ↩