Rewrite get_unboxed_type_representation using a terminating algorithm instead of fuel. #10479
base: trunk
Conversation
This looks promising! I had a look a while ago at tracking the set of paths already expanded, but gave up after discovering the …
Thanks @stedolan for the excellent comments!
Yes, "we properly write down the termination argument" was left on our TODO list for a few weeks and still is. I plan to steal this task from my intern @nchataing's busy sub-list, and I will compare with your own notes. We did find it tricky. (Note: the tree structure we mention in the termination proof is related to how we initially found this idea of call-by-name call stacks. The tree structure was more apparent when our call stack was formed of pairs of a head path and its static location in the source program, but then we simplified the definition/implementation by dropping static locations, which are not necessary at runtime. It's interesting that your equi-recursive example below is one where the notion of static location is hard to define.)
This is an excellent find: our algorithm diverges here, which is unpleasant. (Note: I think there is no problem with those equi-recursive types that are accepted without …) Do you have an idea for how to avoid this issue in a nice way? ("Let it diverge" and "reintroduce fuel for this corner case" are not nice.) One idea would be, when we annotate type expressions with their call stack, to detect equi-recursive cycles and annotate them with a special "poison" stack that gets handled as a cycle as soon as it gets to the head of the type. Ideally we would want to allow expanding once, but not twice; right now I don't know how to express this refinement.
Maybe? (Should we be proud or ashamed of ourselves?) Here is what StackOverflow quotes of the specification:
Apparently there is additional discussion on this point. I hope that we can at least end up with a clearer description of what's going on. This relation is definitely worth mentioning in our abstract anyway, thanks for the pointer!
I don't know the OCaml type system well enough to feel confident that it does. We avoided making any claims of completeness in the paper. (Of course we also accept definitions that the previous check rejects because they run out of fuel, but admittedly those are probably fairly artificial.) The answer to this question is also not stable under extensions of the type system. For example, we could envision a form of parametrized types / ad-hoc type families that would allow one to define …
In your example, the algorithm running on a type of the form … Some context: annotating type nodes is how we implement annotating type subexpressions in our paper description of the algorithm. We found it tricky in some ways (several small bugs caused by traversal operations changing type identities in an unexpected way; for example we are not completely sure yet about the use of …). The behavior is clear when all syntactic subexpressions in a type correspond to distinct type nodes (then the notions of "type node" and "subexpression" coincide). If there is more sharing in the type, my intuition is that there is no problem if the sharing is between type expressions that are non-overlapping in the syntactic presentation. If they are nested in one another, then we have a problem, as your example shows.
I think maybe this can be solved simply: after expanding a given type_expr, update its head annotation to include the path that was expanded. That way, the set of type_exprs not annotated with a given path decreases every time that path is expanded. I think this will only make a difference in the case of equirecursive types.
This makes sense, thanks! If I understand correctly, it's also equivalent to simply recording a list/set of … The "callstack" information for a type expression is introduced at the point where this type expression first appears as a subterm. This new information grows differently: it is introduced at the place where the type expression is called -- appears in head position. (The fact that the update flow is different for both aspects suggests that separating them into two data structures is sensible.)
Thinking about it, it looks like the data structures used here could be simplified. This implementation annotates each … So, it looks like the appropriate data structure might be a table mapping type_exprs to booleans, indicating whether they may be expanded further.
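For concreteness, the table suggested here could look something like the following sketch. This is not the compiler's code: `node` and its integer `id` field are hypothetical stand-ins for `type_expr` and its node identity.

```ocaml
(* Hypothetical sketch of a table mapping type nodes to a flag saying
   whether they may be expanded further. [node] and its [id] field are
   stand-ins for the compiler's type_expr and its node identity. *)
type node = { id : int; desc : string }

let expandable : (int, bool) Hashtbl.t = Hashtbl.create 16

(* A node may be expanded unless it has been explicitly frozen. *)
let may_expand n =
  match Hashtbl.find_opt expandable n.id with
  | Some b -> b
  | None -> true

(* After expanding a node, freeze it: the set of expandable nodes then
   strictly decreases at every expansion step. *)
let freeze n = Hashtbl.replace expandable n.id false
```

The point of the booleans is exactly the termination argument above: each expansion removes one node from the expandable set, which is finite.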
Would this be faster, though? If one expansion introduces … (Note: this is not what the paragraph above is about, but I think that performance in practice is in fact quite important for …)
I was thinking more about simplicity of data structures than performance! I think it's usually best to first find the simplest data structure for the problem, and come up with optimised versions afterwards. If we are thinking about optimisation, I think there are other changes that would make a bigger difference. It might also be worth considering caching, perhaps by integrating with the abbrev mechanism already present in …
I need to convince myself that the change you propose makes it as easy, or easier, to think about the system (in particular its termination properties). In terms of implementation complexity, I think that it is comparable (we are changing the place where a check is made). Re. performance, I agree with your remarks and in fact I was thinking in the same direction:
I think that we want to have (1) as part of this PR, and will probably give up on (2) because it's more work, more invasive, and we are unlikely to see a noticeable performance difference. We had an actually-noticeable performance concern related to get_unboxed_type_representation in the past, namely #8776. The problem was not with the computation itself, but the fact that it introduced calls to … For our work on head shapes (from which the present PR is derived as a "baby" version), we have tried using …
By fiddling a bit with GADTs we were able to find an interesting example where the current algorithm would be incomplete (it cuts the cycle before a precise result can be reached). Consider the computation of the unboxed head representation of:

```ocaml
type _ t =
  | A : bool t -> int t [@unboxed]
  | B : string t -> bool t [@unboxed]
  | C : string -> string t [@unboxed]
```

Some notes first:
The desired outcome when we call … We have an idea to fix this: at each step we store not only the head type path, but also the set of compatible constructors that were considered for this instance. We reject with a cycle only when we stumble upon the same type head with the same constructors. The algorithm would get the correct answer with this modified criterion. Remarks: …
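The refined criterion (a stack entry pairing the head type path with the set of compatible constructors, cycles reported only on an exact repeat of both) might be sketched as follows. Names are hypothetical, with strings standing in for the compiler's `Path.t` and constructor names:

```ocaml
(* Sketch of the modified cycle criterion: an entry records a head type
   path together with the set of compatible constructors considered for
   that instance; a cycle is reported only on an exact repeat of both. *)
module SSet = Set.Make (String)

type entry = { head : string; constrs : SSet.t }

let same_entry e1 e2 =
  e1.head = e2.head && SSet.equal e1.constrs e2.constrs

(* Either reject with a cycle, or push the new entry on the stack. *)
let check_and_push entry stack =
  if List.exists (same_entry entry) stack then Error `Cycle
  else Ok (entry :: stack)
```

Under this check, revisiting the same head with a strictly smaller constructor set is still allowed, which is what lets the GADT example above reach a precise answer before the cut.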
@stedolan wrote:
This argument is not completely correct, because … In the meantime I thought of a simpler and more general termination argument; feedback warmly welcome. Consider n-ary trees whose nodes (at any position in the tree) can be "expanded", giving a new tree, where the expansion of a node turns its subtree into a different subtree which contains:
OCaml types equipped with the expansion of abbreviations and unboxed types are examples of such trees.

Remark: allowing expansion in arbitrary positions (rather than just head position) does not seem necessary for get_unboxed_type_repr, but it makes sense for computing head shapes for constructor unboxing, as it allows for example to reduce the parameter type of each unboxed constructor in a variant type declaration. (Here the syntactic trees considered are a mix of type expressions and instantiated type declarations.)

Remark: when a node ('a t) expands, one may think that the only possible "old" subtree is the subtree immediately under (t), but in fact with type constraints and unboxed GADTs we can have arbitrary descendants; consider type _ t = Foo : ('a * 'a) -> (int -> 'a) t [@@unboxed]

Claim: if nodes are equipped with a well-founded order such that a "new" node after expansion is strictly smaller than its expanded node, then the tree expansion process is strongly normalizing. (Induction argument: if all the descendant subtrees of a node are strongly normalizing, and all nodes strictly below it in the node order are strongly normalizing, then it is strongly normalizing.)

Our algorithm annotates tree nodes with a "call stack", with new nodes having a larger call stack, and nodes becoming neutral / un-expandable if their head constructor is in their call stack. Callstack extension forms a well-founded order, as callstacks are bounded in size (at most the number of type declarations). As a consequence of the claim, expansion is strongly normalizing.
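The callstack discipline in the last paragraph can be illustrated in a few lines (strings standing in for type paths; this is an illustration of the criterion, not the PR's implementation):

```ocaml
(* Sketch of the callstack criterion: expanding under head [head]
   extends the stack, and a node whose head path is already on its
   stack becomes neutral (un-expandable). Since a stack never repeats
   a path and there are finitely many declared paths, stacks are
   bounded, so stack extension is well-founded. *)
type path = string

type status = Neutral | Expandable of path list

let extend_stack (head : path) (stack : path list) : status =
  if List.mem head stack then Neutral
  else Expandable (head :: stack)
```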
My argument was about the representation used in the implementation, where type expressions are graphs, not trees, and expansion preserves sharing. I don't think expansion can ever duplicate a type parameter, and I think annotations are applied to a
This is believable but not obviously true to me! Do you have a more fleshed-out version of this argument? (In particular, the fact that an expansion may generate no new nodes seems to make things tricky.) It reminds me of Goodstein's theorem and the hydra game.
Ah! It may indeed be easier to control termination on graphs (or explicit substitutions); I need to think about it. If there was a sensible proof that also worked with (unshared) trees, that would still have my preference. (In particular it makes it easier to explain to other people/communities.)
Let me try. I summarized the argument here:
The proof of this "induction argument" is by induction on the tree. (Note: I'm not considering cyclic graphs here at all, but I believe the argument can be adapted to incorporate the cyclic-graph-detection countermeasure.) Consider an expandable node in the tree. I defined the possible expansions of a node/subtree as a new subtree whose nodes are either copies of a strict subtree of the expanded node, or "new nodes" that (by assumption) are strictly below it in the node order. By the assumptions of the induction argument, all nodes in the expanded subtree are strongly normalizing, which means that the expanded subtree is strongly normalizing, so the expandable node is itself strongly normalizing (any expansion step gives a strongly normalizing expanded tree). (I don't see the difficulty with the case where the node expands only to one of its descendants, instead of creating new nodes; I assumed that its descendants were all strongly normalizing.) Corollary: any tree is strongly normalizing. The proof goes from the terminal nodes of the tree (which have no children, and therefore are strongly normalizing) rootwards to the root of the tree.
Thanks for the explanation! I more or less believe this now. I also agree that it would be nice to have a version of the argument that worked on trees. To nitpick the proof a bit, your base case doesn't quite work:
The proof at the terminal nodes is not trivial. You need to show that the property is true not just of their subtrees (easy, vacuously true), but also of all nodes strictly below them in the well-founded order. The nodes below terminal nodes need not be terminal: indeed, you can have an arbitrarily large node below a terminal node (e.g. …). So I don't think you can do this just by induction on the tree, but I suspect the argument you provided works if used with a better induction principle (something like a nested induction, first on the well-founded order and then on trees?)
You are correct, the "corollary" part does not work as-is. I thought that this was a simple nested induction, but I didn't find an easy way to nest the inductions. In fact, the idea of trying to do tree induction is fragile in general (if a tree T expands through a subnode S rather than a head expansion, it's hard to control the effect of the expansion on the whole tree). Below is a different argument, considering reduction sequences directly. What do you think?

First a remark: when I say that there is a well-founded ordering of nodes, I mean that there exists a well-founded measure on the nodes that decreases for the new nodes of an expansion. Technically it's a well-quasi-order on nodes, as it is not antisymmetric. We can get a (well-quasi-ordered) measure on trees, rather than nodes, by considering for each tree the set of measures of its nodes, using the standard lifting of well-quasi-orders to sets: … When a tree …

Strong normalization proof: consider a reduction sequence of trees. If a reduction from …
I don't think that proof fully works either!
Every node in T' is either new (and therefore below the expanded node in node order), or part of T. This is enough to establish T' <= T. But to get T' < T we also need … Concretely, suppose that T consists of a node …
Ouch! Back to the drawing board...
I'm back with a powered-up variant of my first proof attempt (considering the set of node measures as a tree measure). The proof is not really/only mine: I complained about this problem to my partner over lunchtime and she suggested this. (If we could invite Nicolas and Stephen over for dinner we would solve this real quick!)

The idea is to measure a node within a tree by its "node-path measure", which is the multiset of measures of the nodes on its path to the root (including the node itself). Then we measure a tree by the multiset of the node-path measures of its nodes. Multisets of well-quasi-ordered elements are well-quasi-ordered by the multiset order. (The order is traced back to Dershowitz–Manna in 1979 and Huet–Oppen in 1980. It's fitting to use a notion co-invented by Gérard Huet for an OCaml problem.) It is the extension of the set ordering that is stable under multiset addition: …

Now if T expands to T', we want to prove that … In general a path to a node in the expanded subtree looks like … For example (using simple lists as trees of one-child nodes):
We have … Note that the argument works even if there is duplication of subtrees. For example, if … In @stedolan's previous example, … Bonus remark: from a medium distance this looks related to the Kruskal tree theorem, which defines a well-quasi-ordering on trees …
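As a side illustration (not part of the proof above): when the node measures come from a total order such as the integers, the Dershowitz–Manna multiset order can be computed by sorting each multiset in descending order and comparing lexicographically, a strict prefix counting as smaller. A small self-contained sketch, independent of the compiler:

```ocaml
(* Dershowitz–Manna comparison for finite multisets over a total order
   (here, int lists as multisets). Replacing an element by any finite
   number of strictly smaller elements, or dropping an element, yields
   a strictly smaller multiset in this order. *)
let dm_compare m n =
  let sort_desc l = List.sort (fun a b -> compare b a) l in
  let rec lex a b =
    match a, b with
    | [], [] -> 0
    | _ :: _, [] -> 1          (* strict superset of a common prefix *)
    | [], _ :: _ -> -1
    | x :: a', y :: b' -> if x <> y then compare x y else lex a' b'
  in
  lex (sort_desc m) (sort_desc n)
```

For instance, {3} dominates {2, 2, 2, 2}: one element was replaced by finitely many strictly smaller ones, which is exactly the decrease that each expansion step performs on the multiset of node-path measures.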
@stedolan would you by chance have an opinion on the last version of the proof?
I read it, went off to read some more about wqos, then got distracted. I've now actually read and understood the proof, and it looks correct to me! For anyone else following along, here's my understanding of the proof: in the Dershowitz–Manna ordering, you can make a multiset smaller by removing an element and replacing it with any finite collection of elements, as long as each new element is smaller than the one you removed. This ordering is well-founded if the ordering on elements is. This process describes both the action of expansion on individual paths and on trees viewed as multisets of paths, so expansion is decreasing in the well-founded order of multisets of multisets of node labels. Pointless nitpick: the references I could find described the multiset order as well-founded. Do you know whether it is a wqo as well? I think it probably is, but I couldn't find a proof of this anywhere. (The argument here does not depend on wqo vs. wf, this is just for my own curiosity.)
Thanks! Regarding wo/wqo, I'm not an order theoretician but my intuition is that a well-quasi-order structure on a set A, …
I think we may be using different definitions for things! Here's what I mean by various terms: Suppose A is a set equipped with a preorder (reflexive/transitive relation) <=, and we define a < b as a <= b & b </=a. Then the ordering is:
The wqo condition is a weaker version of being a total order: with just two elements, you're not guaranteed that a <= b or b <= a, but with infinitely many elements you're guaranteed at least one related pair. Componentwise ordering on ℕ * ℕ is wqo but not a well-order, and equality on ℕ is well-founded but not a wqo. After re-reading earlier comments, I think what you're calling a wqo might be what I call a well-founded preorder. The Dershowitz–Manna construction builds a well-founded preorder on multisets out of a well-founded preorder on elements, but I'm not sure that it preserves the no-infinite-antichains property, so by the definition here I'm not sure it maps wqos to wqos. (However, for the argument here well-foundedness is enough, so whether or not it is actually a wqo doesn't affect the proof.)
I used "well-ordered" by mistake above, I meant "with a well-founded order". (We never work with total orders.) I used quasi-order and preorder interchangeably (I didn't know of quasi-order before this PR, but found the suggestion on wikipedia), so to me they meant "not necessarily antisymmetric", so indeed that corresponds to your notion of well-founded preorder. (It looks like this was a mistake, with a difference between wqo and wpo coming from subtle differences between descending chains and antichains; oh well...)
My general impression is that the termination question is now fully sorted out. I think that several important questions remain for this PR:
Points (2) and (3) are prerequisites to reach a "mergeable state" for this PR. But I think that (1) is the more important question, and I would expect that it can already be discussed and possibly decided, before @nchataing and myself invest non-trivial amounts of additional work in (2) and (3). Do we believe that getting rid of fuel is worth the complexity of a new termination-control mechanism? I wonder if people who already looked at this PR (@lpw25, @stedolan) have an opinion, and others as well. (Maybe it was not strategic from this feedback-attracting perspective to discuss at length the termination proof first, but oh well, it was fun to crowd-implement a proof instead of a program for once.)
Personally, I don't think removing fuel is worth the effort that has been expended on creating and justifying this implementation, since I suspect that in practice no one will ever actually observe the difference. However, the code itself is pretty simple and doesn't seem hard to maintain, so we might as well have it now it's written.
```ocaml
type t =
  | Unavailable
  | This of type_expr
  | Only_on_64_bits of type_expr
```
```ocaml
let check_annotated ty callstacks =
```
Is this function just checking that `extend_callstacks` works, using logic almost identical to `extend_callstacks`? If so, I'm not sure it is gaining you anything.
Yes, this function is used for debugging only. It proved tricky to get an implementation right because of the reliance on type node identity, interleaved with expansions (in particular the call to `expand_head_opt`, which transforms the type graph). It will be possible to remove it if people agree on a final implementation.
We discussed this PR at a maintainer meeting today. Here is what I understood of the consensus:
Some news on this topic -- rather exploratory in nature, I'm afraid.
The termination algorithm we presented above is "online", in the sense that it monitors the computation, the unfolding of definitions, and is guaranteed to stop it after a finite number of steps if the unfolding process would loop indefinitely. In contrast, the simple termination proofs I am receiving pointers to (for example this one) have a rather different computational content: they perform a more "global" analysis/computation on the whole set of recursive definitions at once to decide which type definitions "introduce non-termination". (They still need to unfold definitions / perform beta-reductions, so they are "dynamic" in nature as well.) There are two issues with this for the question at hand:
I am surprised by the complexity gap I perceive between the "properties of type definitions", for which termination is rather simple (the subtlety is in defining the property we want), and "online monitoring during expansion", for which termination required some hard thinking.
This PR proposes to rewrite `get_unboxed_type_representation` in `Typedecl_unboxed`, based on the algorithm presented in http://gallium.inria.fr/~scherer/research/constructor-unboxing/constructor-unboxing-ml-workshop-2021.pdf.

Following #10337, we were obliged to redefine the `mem` function from `Btype.TypeMap` and `Btype.TypeHash` to incorporate the `repr` normalisation. This is maybe something that we want to generalize from said modules?

Also we were unsure how to test this function properly. The function's principal use is to later compute the immediacy of a type. Maybe the testsuite related to immediacy suffices in this case?
With @gasche.