Lineiform

Lineiform is a meta-JIT library for Rust interpreters. Given an interpreter that uses closure generation, an author can add minimal annotations and calls into Lineiform to automagically get an optimizing method JIT, via runtime function inlining and constant propagation that resolve dynamic calls to closed-over environment members. Internally, it lifts closure bodies from x86 assembly to Cranelift IR for dynamic recompilation.

The Big Idea

(There's a blog post now).

A normal tree-walking interpreter in Rust looks like this (pseudocode, but not by much):

fn eval(ast: Ast) -> usize {
    match ast {
        Literal(l) => l,
        Add(left, right) => eval(left) + eval(right),
    }
}

let val = eval(Add(Literal(1), Literal(2)));
val // 3

You can instead write a closure-generation interpreter for the same language like this (also pseudocode, but not by much):

type Environment = ();
type EvalAst = Box<dyn Fn(Environment) -> usize>;
fn eval(ast: Ast) -> EvalAst {
    match ast {
        Literal(l) => Box::new(move |e: Environment| l),
        Add(_left, _right) => {
            let left = eval(_left);
            let right = eval(_right);
            Box::new(move |e: Environment| left(e) + right(e))
        }
    }
}

let compiled_eval = eval(Add(Literal(1), Literal(2)));
let env: Environment = ();
compiled_eval(env) // 3

compiled_eval looks like [ return [ return 1 ] + [ return 2 ] ], where each [] is its own closure. The Add closure in particular has a closed environment of left and right, each a closure specialized for the AST node it was built from - it's just that you still need a dynamic call to evaluate them. Lineiform is a library that you'd call (probably only in a Call language construct, for method-level optimization) to optimize an EvalAst closure: for example, inlining those left and right dynamic calls into the single function body of compiled_eval, since the closed environment of an EvalAst is statically known at that point. compiled_eval then becomes [ return 1 + 2 ], which you can optimize further to just [ return 3 ] - it has "unrolled" multiple interpreter loops into one function, exposing them to a recompiler for global optimization of the actual evaluation of the closure. Note that we don't fix the Environment that is actually passed to the closure, only the closed environment.
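Conceptually, the rewrite looks like this (a sketch of the transformation, not actual Lineiform output):

// before: two dynamic calls through the closed-over function pointers
compiled_eval = |e: Environment| left(e) + right(e) // left = |e| 1, right = |e| 2
// after inlining the statically-known closed environment:
compiled_eval = |e: Environment| 1 + 2
// after constant folding:
compiled_eval = |e: Environment| 3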

You can think of this as analogous to how a Forth compiler works: 1 is a word that pushes the value 1, 2 is a word that pushes the value 2. A naive word for 1 2 + is then three dynamic function calls, to the function body addresses stored in its "contents" list (1, 2, +). If you have a concretely built word at runtime, however, you can inline the calls to all the function pointers you have and recover the actual execution path of the function, allowing you to do global optimizations. If you squint, the Rust closure compiled_eval looks much like this Forth word, except that it doesn't use a threaded-code execution model and stores its function pointers in the closure's closed environment instead of in a word body array.

Lifting x86

Lineiform uses a technique called dynamic recompilation, used extensively by binary instrumentation frameworks like DynamoRIO and emulators like QEMU: by re-encoding the semantics of x86 assembly in some internal IR ("lifting"), you can treat that IR much like the IR a normal compiler frontend would build. This lets you run normal optimization passes over the IR and then compile it back down to assembly ("lowering").

Lineiform uses Cranelift's IR as its lifting target, which lets us reuse Cranelift's existing infrastructure to reallocate registers after our function inlining and to JIT compile back down to x86. This isn't really what Cranelift was built for, to be honest, so it doesn't do all that great of a job currently.
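To give a flavor of lifting, here is a minimal sketch (not Lineiform's actual lifter) of how a single add rax, rbx could be translated into Cranelift IR using cranelift-frontend, modeling each guest register as an SSA variable. The function name and the variable setup are assumptions for illustration:

use cranelift::prelude::*;

// Hypothetical sketch: assumes `rax` and `rbx` were declared earlier with
// builder.declare_var(var, types::I64) and given initial definitions.
fn lift_add_rax_rbx(builder: &mut FunctionBuilder, rax: Variable, rbx: Variable) {
    let lhs = builder.use_var(rax);         // current SSA value of rax
    let rhs = builder.use_var(rbx);         // current SSA value of rbx
    let sum = builder.ins().iadd(lhs, rhs); // emit the IR add
    builder.def_var(rax, sum);              // rax now names the sum
    // A real lifter would also model the condition flags (CF, OF, ZF, ...)
    // that x86 `add` sets; that bookkeeping is elided here.
}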

What works

You can create a JIT and feed it a closure; it does some constant propagation from the closure's environment in order to turn tail calls and dynamic function calls into inlined basic blocks (and inlines any constants that were closed over, for example), and you can call the resulting "optimized" closure. The x86 lifting isn't very featureful - I've implemented register aliasing, memory load/store widths, condition flags, and some branching - but I haven't done any verification that the operations are correct w.r.t. actual instruction execution.
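End to end, usage looks roughly like this. Note that this is a hypothetical sketch: the names Lineiform::new and speedup are illustrative assumptions, not a documented API, so check the crate's actual exports.

use lineiform::Lineiform; // assumed import path

// Hypothetical sketch: API names are illustrative, not authoritative.
let mut jit = Lineiform::new();
let closure = eval(Add(Literal(1), Literal(2))); // from the example above
// Ask the JIT to inline the closure's dynamic calls and fold constants:
let optimized = jit.speedup(move |e: Environment| closure(e));
assert_eq!(optimized(()), 3);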

What doesn't

The main missing lifting features are: 1) it never merges divergent control flow back together; 2) it doesn't handle loops at all; 3) all operations are (currently) zero-extended, instead of zero- or sign-extended depending on the instruction. And a whole bunch of instructions are missing too, of course, since I've mostly only been implementing exactly as much as I need to pass each test I add. C'est la vie.

Warning

This is extremely unsafe and under construction. It will silently miscompile closures so that they return the wrong values and/or eat your socks. The entire idea of using dynamic recompilation for a JIT is ill-informed and foolish; the author is the epitome of hubris. There are dragons contained within.

No license or warranty is provided. If you want to use it and care about licensing things, get in touch on Twitter.

TODO list

  • Disassembling functions
  • Lifting x86 to Cranelift IR
    • Enough instructions that anything non-trivial doesn't crash
    • Tear out Cranelift and use our home-rolled Tangle IR instead
  • Branches
    • Merging divergent control flow back together
  • Loops
  • Calls
    • External calls (we need to "concretize" the SSA value map, so that an external call has the correct registers/stack, and restore them afterwards)
    • Lifting bailouts: if we hit an instruction we can't lift, we can instead bail out and emit a call back into the original version of the function, so that we can, for example, optimize only half of a complex function.
    • Optimistic external calls: try to use DWARF to figure out calling convention + argument count for external functions, to avoid having to spill more registers/stack than we need!
  • Correct input/output function prototype detection: currently we just hardcode parameter/return value types, which is Bad.
  • Freezing allocator scheme
    • Freeing from the frozen allocator (probably using crossbeam_epoch)
  • Smarter function size fetching (currently we only search /proc/self/exe for them! Which doesn't work if you have any dynamic linking lol)
  • Performance counters for the "optimized" Fn trait, so we only optimize over some threshold.
  • A specialize function: a variant of optimize that also fixes a value from the closure's arguments, to for example pin the type tag of an object and erase type checks. We can compile a variant for each concrete pinned value, and emit a jump table that dispatches to either the non-compiled generic version or the specialized variant (see the sketch below).
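As a rough illustration of that dispatch scheme (a hypothetical sketch, not code Lineiform emits today; TypeTag and the eval_* functions are made up for the example):

// Hypothetical sketch of value-specialized dispatch.
enum TypeTag { Int, Float, Other }

fn eval_generic(_tag: TypeTag, _obj: u64) -> u64 { /* unspecialized path */ 0 }
fn eval_int(obj: u64) -> u64 { /* variant with the Int type checks erased */ obj + 1 }

fn dispatch(tag: TypeTag, obj: u64) -> u64 {
    // The emitted jump table: a concrete pinned value hits its specialized
    // variant; everything else falls back to the generic version.
    match tag {
        TypeTag::Int => eval_int(obj),
        _ => eval_generic(tag, obj),
    }
}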