Techniques for writing interpreter loops #3
In this irlo forum discussion stocklund explained how clang and LLVM are able to optimize a plain `switch`-based dispatch loop. It seems possible that rustc could be modified to do the same. I'm not sure about some of the reasoning in the post as to why this shouldn't be an issue anymore (namely that modern CPUs are good enough at branch prediction): my testing did show that the branch predictors of low-power CPUs are still a way behind, and that computed gotos continue to measurably outperform other methods there.
@pliniker I imagine this being less of an issue on newer architectures, but I wouldn't be surprised if the problem persists on mobile/embedded devices (e.g. those running ARM). Even if it's no longer a problem, I think it's worth documenting, including data to show how things behave.
My dispatch experiments did show that ARM processors do much better with computed gotos. I'm also curious to see how Spectre mitigation microcode updates will affect branch prediction. I had read suggestions that there would be a decrease in the performance of high-end Intel CPUs' branch predictors, but I have no data on that yet.
A small comment regarding the available techniques: if one uses Rust as a language backend, its "safety" is presumably one of the main features the developer had in mind; therefore any solution that "breaks" this safety, especially for such a core component of a language VM, practically nullifies the Rust advantage. As such, the only variants that make sense are those that leverage "native Rust". BTW: is there an example of how such a loop based on TCO would look in Rust? And what is the performance advantage over the plain `loop` + `match`? (Update: @pliniker's article https://pliniker.github.io/post/dispatchers/, also linked in the initial comment, provides such an example.)

Regarding the `match` approaches: without knowing the internals of how Rust compiles `match` down to assembler, I would assume that the first approach (i.e. one single `match`) […]. For example, in my interpreter I decided to go with the second approach due to readability. However, after doing that refactoring (i.e. transforming the 100 […]) […].

As related to writing interpreter loops, there are two additional topics: […]
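On the TCO question above: Rust doesn't guarantee tail-call optimisation, so one hedged approximation is a trampoline, where each opcode handler returns the next program counter (or `None` to halt) and a small driver follows the chain of indirect calls. This is only a sketch with made-up names, not code from any project mentioned in this thread:

```rust
// Trampoline-style dispatch: handlers return the next program counter
// instead of tail-calling each other, since Rust has no guaranteed TCO.
type Handler = fn(&mut Vm, usize) -> Option<usize>;

struct Vm {
    acc: i64,
    code: Vec<u8>,
}

fn op_inc(vm: &mut Vm, pc: usize) -> Option<usize> {
    vm.acc += 1;
    Some(pc + 1) // continue with the next instruction
}

fn op_halt(_vm: &mut Vm, _pc: usize) -> Option<usize> {
    None // stop the trampoline
}

// Hypothetical encoding: opcode 0 = inc, anything else = halt.
fn handler_for(opcode: u8) -> Handler {
    match opcode {
        0 => op_inc,
        _ => op_halt,
    }
}

fn run(vm: &mut Vm) {
    let mut pc = 0;
    loop {
        let op = vm.code[pc];
        match handler_for(op)(vm, pc) {
            Some(next) => pc = next,
            None => break,
        }
    }
}

fn main() {
    let mut vm = Vm { acc: 0, code: vec![0, 0, 0, 1] };
    run(&mut vm);
    println!("{}", vm.acc); // three `inc` instructions, then halt: prints 3
}
```

Note that the driver still contains a loop and an indirect call, so this mainly restructures the code rather than eliminating dispatch overhead.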
The interpreter loop may be a core component, but it's also quite self-contained and easily testable. IMO it'd be a prime example where a bit of unsafe code is justified.
I don't debate that the "loop" might be easily tested and formally checked; however, you are taking one thing for granted: that the input (i.e. the "bytecode") is "correct" (i.e. conforms to the "loop" specifications). Thus you are ignoring one major component which might not be as simple to test and check: the "compiler" and then the "optimizer" (or any other component that creates the "bytecode"). Think about this: I bet someone has formally proved that, say, a subset of a given real assembly language is "correct", and therefore one might expect that this property translates to the "machine code" being run. In practice this doesn't happen, as the excessive number of buffer over- and under-runs shows...
Maybe I overlook something, but AFAIK this is not really a problem. There is a lookup from bytecode to function pointer. If this dispatch is correct and the functions that are jumped to are correctly implemented, we're back to safe operations. The only "attack" that would be possible from the bytecode is sending an invalid opcode for which there is no function pointer in the table. But that is not hard or expensive to verify beforehand. It would also be possible (if we use only a byte range for the opcode) to just fill up the table with halt commands. I've downloaded @pliniker's dispatch crate and played with it a little bit. The simple match loop variant is... special. With the right amount of useless match arms it really performs quite well. But I only have a Haswell laptop to test on.
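The halt-padded table idea above can be sketched as follows. All names here are hypothetical: a 256-entry dispatch table indexed directly by the opcode byte, with every undefined slot pointing at a `halt` handler, so the hot loop needs no validity check:

```rust
// Function-pointer dispatch: each handler returns `false` to stop the VM.
type OpFn = fn(&mut State) -> bool;

struct State {
    acc: i64,
}

fn op_add_one(state: &mut State) -> bool {
    state.acc += 1;
    true
}

fn op_halt(_state: &mut State) -> bool {
    false
}

fn build_table() -> [OpFn; 256] {
    // Fill every slot with `halt`, then install the real handlers.
    let mut table: [OpFn; 256] = [op_halt as OpFn; 256];
    table[0] = op_add_one;
    table
}

fn main() {
    let table = build_table();
    let bytecode = [0u8, 0, 0xFF]; // two increments, then an undefined opcode
    let mut state = State { acc: 0 };
    for &byte in &bytecode {
        // A u8 index can never go out of bounds in a 256-entry table.
        if !table[byte as usize](&mut state) {
            break;
        }
    }
    println!("{}", state.acc); // prints 2
}
```

Because the table covers the whole `u8` range, invalid bytecode degrades into an early halt instead of undefined behaviour, which matches the "fill up the table with halt commands" suggestion.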
If it helps: Inko's VM loop uses something along these lines:

```rust
let instructions = X;
let mut index = 0;

loop {
    let instruction = instructions[index];

    match instruction.instruction_type {
        ...
    }

    index += 1;
}
```

In case of Inko I went with the assumption that the bytecode is valid by the time we run it, and there's some basic validation going on when parsing bytecode (enough that we at least can't end up with out-of-bounds array access). I have been meaning to add a more in-depth bytecode verifier at some point, but its priority is super low. A slightly safer alternative would be:

```rust
let instructions = X;
let mut index = 0;

while index < instructions.len() {
    let instruction = instructions[index];

    match instruction.instruction_type {
        ...
    }

    index += 1;
}
```

However, I found that in Inko this added some unnecessary overhead, so I went with the first snippet. When documenting interpreter techniques I think it's better if we err on the side of safety / less performance, but clearly state how one might be able to further optimise the code (possibly at the cost of safety). This way, if somebody blindly copy-pastes the code they don't expose themselves to all kinds of nasty issues.
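A possible middle ground between the two snippets above (this is a hedged sketch with invented instruction names, not Inko's actual code) is to fold the bounds check into the loop condition with `slice::get`, so there is exactly one checked access per iteration and the loop still cannot read out of bounds:

```rust
#[derive(Copy, Clone)]
enum InstructionType {
    IncrementCounter,
    Halt,
}

struct Instruction {
    instruction_type: InstructionType,
}

fn run(instructions: &[Instruction]) -> u64 {
    let mut counter = 0;
    let mut index = 0;
    // `get` returns None past the end, so loop control and the
    // bounds check are the same branch.
    while let Some(instruction) = instructions.get(index) {
        match instruction.instruction_type {
            InstructionType::IncrementCounter => counter += 1,
            InstructionType::Halt => break,
        }
        index += 1;
    }
    counter
}

fn main() {
    let program = vec![
        Instruction { instruction_type: InstructionType::IncrementCounter },
        Instruction { instruction_type: InstructionType::IncrementCounter },
        Instruction { instruction_type: InstructionType::Halt },
    ];
    println!("{}", run(&program)); // prints 2
}
```

Whether this is faster than the `while index < instructions.len()` form depends on what the optimiser does with each; it would need benchmarking on the target VM.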
I have a question regarding JIT compilation: my (probably wrong and/or simplistic) understanding is that one builds bits of assembly associated with each opcode and then unrolls the interpreter loop by copying these bits of assembly for each opcode into a buffer. If that is the case, would it be possible to unify that with the "loop + inline assembly" approach? If we have to hardcode each operation in asm anyway, why not reuse it for the interpreter loop?
For interpreters one of the core components is the interpreter/dispatcher loop. In its most basic form this is essentially a `loop` combined with a (giant) `match`, but depending on the language there are other techniques available. I think it's worth collecting the various approaches available to Rust, their benefits, drawbacks, etc. Over time we should probably extend this into a document outlining this more clearly, including examples and whatnot. @pliniker wrote a bit about this in https://pliniker.github.io/post/dispatchers/, and I discussed things a bit in https://www.reddit.com/r/rust/comments/66h3t2/alternatives_to_dispatching_using_loop_match/.
Available Techniques

- `loop` + `match`: […] as long as the `match` isn't 5 000 lines long
- `loop` plus some form of inline assembly: […] `goto` […]
- Tail calls: […] (a `break` in a function won't work). This means that to control the loop you'd still need some kind of `match`.
There are probably more, but these are the ones I can think of at the moment.
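For reference, the first technique (`loop` + `match`) can be sketched as a minimal stack machine; the opcode set here is made up purely for illustration:

```rust
// Minimal `loop` + `match` interpreter for a tiny stack machine.
#[derive(Copy, Clone)]
enum Op {
    Push(i64),
    Add,
    Halt,
}

fn interpret(code: &[Op]) -> Vec<i64> {
    let mut stack = Vec::new();
    let mut pc = 0;
    loop {
        match code[pc] {
            Op::Push(value) => stack.push(value),
            Op::Add => {
                // Pop two operands and push their sum.
                let b = stack.pop().expect("stack underflow");
                let a = stack.pop().expect("stack underflow");
                stack.push(a + b);
            }
            Op::Halt => break,
        }
        pc += 1;
    }
    stack
}

fn main() {
    let code = [Op::Push(2), Op::Push(3), Op::Add, Op::Halt];
    println!("{:?}", interpret(&code)); // prints [5]
}
```

The giant-`match` versions discussed in this thread are the same shape, just with many more arms and a real instruction encoding instead of an enum of three opcodes.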