Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extend Lambda.subst on bound variables, add Lambda.duplicate #9216

Merged
merged 1 commit into from Jul 2, 2020

Conversation

gasche
Copy link
Member

@gasche gasche commented Dec 28, 2019

It is invalid to reuse a Lambda.t term twice, because bound variables
may be used non-uniquely. If we want to perform a code transformation
may duplicate subterms in some cases, we have to refresh all bound
variables of the copied subterm.

The present PR implements a function

Lambda.duplicate : lambda -> lambda

that does exactly this. It is implemented by making Lambda.subst
parametrized over a transformation on bound variables.

@gasche
Copy link
Member Author

gasche commented Dec 28, 2019

(cc @chambart @trefis, who worked on similar questions somewhat recently)

The particular optimization whose study resulted in this PR is TRMC (#181), which generates several copies of a trmc-ed function (typically: one with the normal interface, and one or several in destination-passing style). In absence of such a function, this creates tricky bugs.

@gasche
Copy link
Member Author

gasche commented Jan 4, 2020

(Fixed check-typo.)

lambda/lambda.ml Outdated
let lev_env =
Ident.Map.fold (fun id _ env ->
match Env.find_value (Path.Pident id) evt.lev_env with
| exception Not_found -> env
| vd -> update_env id vd env
) s evt.lev_env
) s lev_env
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chambart just asked me why there are two separate environments s (for free variables of the original term) and l (for bound variables traversed so far); the two environments are handled differently here: for free variables substituted by the user, we use the user-provided function update_env; for bound variables substituted in l, I do the correct thing which is to change the binding to the new bound variable name. (Ideally I would also remove the old binding to the substituted-away variable, but there is no API to remove a binding from debug environments.)

lambda/lambda.ml Outdated Show resolved Hide resolved
lambda/lambda.ml Outdated Show resolved Hide resolved
| Lfunction lf ->
let s =
List.fold_right
(fun (id, _) s -> Ident.Map.remove id s)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove that ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The remove are removing the bound variables from the substitution, so that only free variables remain in the domain of the substitution. My idea was that having two environments of free and bound variables obviates this: I can always tell when something is a bound variable (would, before, have been removed from the map) by looking in l.

In the Tvar case I lookup in the bound-variable environment first, and only then in the free-variable environment. This means that there is no lookup in s for bound variables, so the remove would make no difference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with omitting the removal, but I'm worried that the code was present in the first place. Could you maybe assert that the parameters do not occur in s ? I would consider it an error if subst was called with a substitution for variables that are bound in the term, and also a problem if a variable is both free and bound in the term.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, supporting the odd situations you mention was an explicit addition of #1606, supposedly for the purpose of "a (later) patch about let-rec compilation". I have cold feet about restricting the specification of the function again; I don't need it for my implementation, and I don't want to be responsible for potential breakages that could occur.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I can see how you could end up in those odd situations if you inline a recursive definition within itself without freshening its bound identifiers. Interesting, the present PR provides an easy way to freshen those bound identifiers. In any case, I think that any such considerations could be left for another PR, possibly by someone else.)

@gasche
Copy link
Member Author

gasche commented Mar 31, 2020

I rebased the PR to address @chambart's comments. (If he or others would like to review, please do!)

@gasche
Copy link
Member Author

gasche commented Apr 29, 2020

I'm still looking for someone interested in reviewing or discussing this. (cc @chambart, but other people are warmly welcome.)

@gasche
Copy link
Member Author

gasche commented Jun 8, 2020

Still looking for a review.

Copy link
Contributor

@lthls lthls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this PR, which does what it is advertising to do in a generally pleasing way.

I've made some remarks, all of those that start with "Style:" can be completely ignored if you want. For the others I wouldn't mind having some kind of answer, but they do not affect the correctness of the code.

lambda/lambda.ml Outdated Show resolved Hide resolved
lambda/lambda.ml Outdated Show resolved Hide resolved
| Lfunction lf ->
let s =
List.fold_right
(fun (id, _) s -> Ident.Map.remove id s)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with omitting the removal, but I'm worried that the code was present in the first place. Could you maybe assert that the parameters do not occur in s ? I would consider it an error if subst was called with a substitution for variables that are bound in the term, and also a problem if a variable is both free and bound in the term.

lambda/lambda.ml Outdated Show resolved Hide resolved
lambda/lambda.ml Outdated Show resolved Hide resolved
lambda/lambda.ml Show resolved Hide resolved
lambda/lambda.ml Outdated Show resolved Hide resolved
lambda/lambda.ml Outdated Show resolved Hide resolved
lambda/lambda.mli Outdated Show resolved Hide resolved
lambda/lambda.mli Outdated Show resolved Hide resolved
@gasche
Copy link
Member Author

gasche commented Jun 8, 2020

Thanks for the reviews! I pushed some commits to address most of @lthls points (I will squash them later). I still have to look at some things, and take into account @maranget's point on the environment. (I resolved all comments on which I am confident that no further discussion is useful.)

It is invalid to reuse a Lambda.t term twice, because bound variables
may be used non-uniquely. If we want to perform a code transformation
may duplicate subterms in some cases, we have to refresh all bound
variables of the copied subterm.

The present PR implements a function

    Lambda.duplicate : lambda -> lambda

that does exactly this. It is implemented by making Lambda.subst
parametrized over a transformation on bound variables.
@gasche
Copy link
Member Author

gasche commented Jun 10, 2020

@lthls I don't know if you would like to do another round of feedback on this one, or if I should get in the mindset of merging this eventually.

@lthls
Copy link
Contributor

lthls commented Jun 10, 2020

Apart from the Levent code that I haven't followed closely, every change you've made is in the right direction for me, so my approval still stands.

@gasche
Copy link
Member Author

gasche commented Jun 10, 2020

Let's talk about Levent just a bit.

Before this PR, the code would traverse the summary, and call update_env for each variable that would be part of the free-variable substitution. (In practice this is used by Lambda.rename which performs variable->variable renaming, and then the binding in the summary would be duplicated for the renamed variable as well (we don't know how to remove the old one, and we don't try to shadow it).)

What the PR does now (it did something less correct before, which I noticed and fixed after the question by @maranget) is to do a merge on the free-variable substitution and the bound-variable map, to decide for each binding in the union how to update the environment:

  • if the variable is bound, we use the same strategy as in rename (duplicate the binding to the new identifier)
  • if the variable is part of the free substitution (and not bound), we call update_env
  • otherwise we change nothing

@gasche gasche merged commit 50806ce into ocaml:trunk Jul 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants