
Poor compilation of dependent pattern-matching in bytecode #18933

Open
silene opened this issue Apr 15, 2024 · 2 comments
Labels
kind: performance Improvements to performance and efficiency. kind: wish Feature or enhancement requests. part: VM Virtual machine.

Comments


silene commented Apr 15, 2024

Is your feature request related to a problem?

A commonly encountered idiom in Coq is

match x return (P x -> _) with
| C1 => fun H: P C1 => ...
| C2 => fun H: P C2 => ...
| ...
end H (* <- note the application *)
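
For instance (a minimal hypothetical example, not taken verbatim from any library), the usual "convoy pattern" for a safe predecessor has exactly this shape:

(* Hypothetical example: the proof H is "convoyed" through the match
   so that each branch sees it at a refined type. *)
Definition pred_pos (n : nat) (H : n <> 0) : nat :=
  match n return (n <> 0 -> nat) with
  | O => fun H0 : O <> 0 => match H0 eq_refl return nat with end
  | S m => fun _ : S m <> 0 => m
  end H. (* <- the application *)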

The way it is compiled to bytecode is rather inefficient. First, H is pushed on the stack, then a continuation label is pushed too. Then the pattern matching is performed: every branch allocates a closure and fills it with the needed free variables, then pops the label and jumps to it. Finally, the closure is applied to the element that was pushed on the stack.
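
Schematically, using the same kind of notation as the bytecode listing further down (opcode names and frame details are simplified, and the accumulator branch is elided), the generated code looks roughly like:

PUSH                 (* H, the future argument of the application *)
PUSHRETADDR lbl_k    (* continuation of the match *)
SWITCH[table];
lbl_C1: CLOSURE nfv1, lbl_body1; RETURN 0;
lbl_C2: CLOSURE nfv2, lbl_body2; RETURN 0;
...
lbl_k: APPLY 1       (* immediately consume the fresh closure *)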

To summarize, during evaluation, a closure is allocated and immediately consumed.

Proposed solution

Ideally, no closure would be allocated or applied at all: the branches can be executed directly, which would bring a large speedup.

There is a snag with open terms, though. During strong normalization, the branches actually need to return closures. In other words, for normalization we still need the bytecode as it currently is, which means that different bytecode would be executed during evaluation and during normalization.

Moreover, there would be no sharing between the two bytecodes, since the one for evaluation would access the free variables directly on the stack (ACC opcode), while the one for normalization would access them from the closure environment (ENVACC opcode). So this would cause some bytecode duplication (and an exponential blowup in the case of nested matches). It is not obvious whether a well-thought-out prefix that copied the closure environment onto the stack would make it possible to delegate part of the normalization bytecode to the evaluation bytecode.
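
To see why no sharing is possible, consider a branch body that reads some free variable y; the two bytecodes would reach it through different access opcodes (schematic sketch, offsets i and j hypothetical):

(* evaluation bytecode: y sits directly on the stack *)
lbl_C1_eval: ... ACC i ...     (* i = stack offset of y *)
(* normalization bytecode: y sits in the closure environment *)
lbl_C1_norm: ... ENVACC j ...  (* j = environment slot of y *)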

Alternative solutions

No response

Additional context

This is remotely related to #14109.

@silene silene added kind: performance Improvements to performance and efficiency. part: VM Virtual machine. kind: wish Feature or enhancement requests. labels Apr 15, 2024

ppedrot commented Apr 17, 2024

The problem is not limited to this kind of commuting closure. More generally, in the Coq VM, any non-tailcall pattern matching has to push a dummy function frame because of the MAKESWITCHBLOCK instruction, and the branches perform a RETURN to that frame instead of jumping to the continuation. In other words, the bytecode of a non-tailcall match looks like

PUSHRETADDR; SWITCH[table];
lbl_accu: MAKESWITCHBLOCK; RETURN 0;
lbl_arg1: (* branch1 *); RETURN nargs1;
...
lbl_argn: (* branchn *); RETURN nargsn

This is in stark contrast to the OCaml VM, where there is no PUSHRETADDR and thus no function frame, and the RETURN operations are just unconditional jumps to the continuation. Once again, this scheme is designed to handle accumulators but has a non-zero cost for all users.
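
For comparison, a rough sketch of the corresponding OCaml VM scheme (same notation; each branch ends in a plain jump to the join point):

SWITCH[table];
lbl_arg1: (* branch1 *); BRANCH lbl_k;
...
lbl_argn: (* branchn *); BRANCH lbl_k;
lbl_k: (* continuation *)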


silene commented Apr 17, 2024

Note that PUSHRETADDR and RETURN are opcodes that do not perform any allocation. So, yes, Coq's VM executes two opcodes where the original OCaml VM executes just one, but I consider the overhead sufficiently low that it is not worth spending time on it. In fact, if the selected branch contains a let-in, then the OCaml VM also executes two opcodes, in which case there is not even an overhead for Coq's VM.
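
To make the opcode count concrete, here is a rough sketch (schematic, not actual OCaml compiler output) of a non-tail branch ending in a let-in on the OCaml VM:

(* branch "let x = e in body": the binding must be popped before the jump *)
lbl_arg1: (* e *); PUSH; (* body *); POP 1; BRANCH lbl_k
(* POP 1 + BRANCH: two opcodes, matching the PUSHRETADDR + RETURN pair *)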

In contrast, the issue I mention is a case where a bunch of costly opcodes are being executed for naught.
