Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

At tailcall #111

Closed
wants to merge 4 commits into from
Closed

At tailcall #111

wants to merge 4 commits into from

Conversation

c-cube
Copy link
Contributor

@c-cube c-cube commented Oct 29, 2014

Add an attribute [@tailcall] such that f args [@tailcall] warns (with the new warning 50) if the call is not tail-recursive. See the ocaml hacking item.

The patch is twofold:

  • A Lapply expression (in lambda code) is annotated with a record, rather than a location,
    with a boolean within this record that specifies whether the call is expected to be in tail position. This flag
    is set if the corresponding Typedtree.expr has a "tailcall" attribute.
  • In Simplif, if the warning is enabled, tail calls are annotated, and if a call is expected to be in tail position
    but isnt't, the warning is emitted.

Also added two tests in testsuite/tests/warnings/.

@yallop
Copy link
Member

yallop commented Oct 29, 2014

/cc @euanh

@euanh
Copy link

euanh commented Oct 30, 2014

/cc @simonjbeaumont

@simonjbeaumont
Copy link

Cool! I've also had a stab at the same problem. I also found that doing this in the compiler would mean threading through the annotations into the Lambda layer (passed the Transl stage).

It seemed like that was a clear separation from the Typedtree to the Lambda so instead I wrote a cmt parser:

https://github.com/simonjbeaumont/ocaml-tailrec

It can be compiled with make and has two more targets: make test_pass and make test_fail. It is a bit of a proof of concept at the moment. Sadly the extension point ppx strategy didn't suffice since it only gives you a hook at the untyped AST stage.

Cool to see this in the compiler though. I'd hacked up a similar approach but not quite made it.

@c-cube
Copy link
Contributor Author

c-cube commented Oct 30, 2014

A ppx strategy would be useful to implement [@tailrec] in terms of [@tailcall] (as suggested there).

It's interesting to notice that this PR uses the function Simplif.emit_tail_infos, that used to be called only when generating .cmt files, to check whether annotated call sites are indeed in tail position.

@@ -516,7 +516,9 @@ let rec emit_tail_infos is_tail lambda =
match lambda with
| Lvar _ -> ()
| Lconst _ -> ()
| Lapply (func, l, loc) ->
| Lapply (func, l, ({apply_loc=loc} as info)) ->
if info.apply_should_be_tailcall && not is_tail
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should test here if the warning is enabled. emit_tail_info, is allso called when -annot is used.

@chambart
Copy link
Contributor

You should enable this warning by default since it won't warn if you didn't add annotations explicitely.

| Lapply (func, l, loc) ->
| Lapply (func, l, ({apply_loc=loc} as info)) ->
if info.apply_should_be_tailcall && not is_tail
then Location.prerr_warning loc Warnings.Expect_tailcall;
list_emit_tail_infos false l;
Stypes.record (Stypes.An_call (loc, call_kind l))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should test if -annot is set before emitting informations

@simonjbeaumont
Copy link

First off, hats off to @c-cube! I'm pleased I wasn't the only one who thought it might be useful.

I think @thomassa makes a good point. I found that the natural way of annotating the final line of with [@tailcall] you end up capturing a tail-call that is not the application of the function you want to assert is tail-recursive.

That was partly why I abandoned that approach and went instead for annotating a function definition as [@tailrec] (cf. annotating the call-site). Then I checked that all applications of this function within the body were tail-calls which felt a little better.

Is it not also a little suspect to thread through the annotation from the TypedTree to Lambda for just the Lapply. It kind of feels like annotations should exist in the Lambda stage or not (not just for one of the variants).

@lpw25
Copy link
Contributor

lpw25 commented Oct 30, 2014

Why not just add support for f [@tailcall] x y? Being able to accurately annotate function calls seems useful for quite a few things, and it looks quite simple to add this case. Note that, for consistency, the attribute should be attached to f rather than f x y in the AST since that is what f [@foo] means in other contexts.

@c-cube
Copy link
Contributor Author

c-cube commented Oct 30, 2014

@lpw25 I guess a Lapply (f, args, _) should also be tailcall if f has been translated from a [@tailcall]'ed expression, so that f [@tailcall] a b c works. I think this version is more readable too, especially w.r.t. parenthesing.

Otoh I don't think ditching [@tailcall] in favor of [@tailrec] is enough. Both can be useful. Most of the time [@tailrec] is handy, but it can be expressed (with a ppx?) in terms of [@tailcall]. For some other uses -- I have in mind continuation-passing style -- more flexibility is needed.

This CPS example makes me wonder which syntax would be nice for expressing the constraint that in

let f x y z cont = ...

the continuation cont should always be called in tail-position, precisely not to eat stack space. This can also be expressed in terms of [@tailcall] but I don't know how to express the higher-level abstraction.

@lpw25
Copy link
Contributor

lpw25 commented Oct 30, 2014

@lpw25 I guess a Lapply (f, args, _) should also be tailcall if f has been translated from a [@tailcall]'ed expression, so that f [@tailcall] a b c works. I think this version is more readable too, especially w.r.t. parenthesing.

Yes, and you need to add support for f [@foo] x y to the parser. This probably just means changing | simple_expr simple_labeled_expr_list to | simple_expr attribute simple_labeled_expr_list in expr, but I haven't checked it properly for conflicts.

@gasche
Copy link
Member

gasche commented Oct 30, 2014

I think we should make a choice between either fact (n-1) [@tailcall] or (fact [@tailcall]) (n-1), rather than support both. I'm personally in favor of (fact [@tailcall]) (n-1), which I find more natural and robust, and would support enabling the id [@attribute] syntax to be used as applicative-left-hand-side (but that should be a separate patch).

@yallop
Copy link
Member

yallop commented Oct 30, 2014

Ideally I think we'd have a prefix attribute: [^tailcall] fact (n - 1), since tailcall behaves rather like a built-in operation. Perhaps fact [@tailcall] (n-1) is the next best thing.

@c-cube
Copy link
Contributor Author

c-cube commented Oct 30, 2014

It looks like supporting fact [@tailcall] (n-1) does introduce conflicts, sadly. For now could we settle on the uglier (fact [@tailcall]) (n-1), which at least is robust, and deal with parser issues separatly?

@pw374
Copy link

pw374 commented Oct 30, 2014

To me it'd make much more sense if [@tailcall] had to be prefix. o_O

@c-cube
Copy link
Contributor Author

c-cube commented Oct 30, 2014

@pw374 that would make sense, I agree. A solution could be to allow [%tailcall fact (n-1)] and then rewrite this, internally, into (fact (n-1) [@tailcall])... At least it looks like "tailcall" is a primitive operation.

@lpw25
Copy link
Contributor

lpw25 commented Oct 30, 2014

It looks like supporting fact @tailcall does introduce conflicts, sadly.

For the record, I believe this conflict is artificial and could be avoided by replacing the rule:

| expr attribute

in the definition of expr by an attributes on the end of every rule in expr. However, this is quite a bit of work, so may not be worth the effort. Also the use of various precedences in the expression grammar makes it difficult to judge the correctness of this transformation.

@c-cube
Copy link
Contributor Author

c-cube commented Oct 30, 2014

Just added a preprocessing phase to compilation so that [%tailcall foo] becomes foo [@tailcall]. This phase will also be useful for [@tailrec] or other annotations that can rely on [@tailcall].

I had issues with the build system, so there are almost surely lots of mistakes.

@c-cube c-cube force-pushed the at_tailcall branch 2 times, most recently from 9186116 to 594d444 Compare October 30, 2014 20:36
@gasche
Copy link
Member

gasche commented Oct 31, 2014

I'm not happy with [%tailcall ]. Extensions and attributes have a clear semantics: extensions is for what is not valid OCaml code, attributes are to add cross-cutting concerns to valid OCaml code. tailcall is an attribute, not an extension.

Please:

  • stop trying to abuse the extension mechanism to your particular aesthetic taste. The very point of extensions is to have a more robust structure than Camlp4 at the cost of a heavier syntax, and repeatedly turning this into a steaming pile of hacks will not help in the long-term
  • pick one syntax for the tailcall annotation that is relatively consensual and use it. Not two syntaxes that rewrite to each other, or to avoid making a choice.

Syntax design is hard. The solution is not to allow everything.

(We can discuss the conflict issue later. Maybe careful massaging of the grammar would let us have the nice things. We can try that later if the chosen syntax for the attribute leds itself to this extension without introducing two separate syntaxes.)

@c-cube
Copy link
Contributor Author

c-cube commented Oct 31, 2014

@gasche: ok, fine. I'll get back to f x y z [@tailcall] then, it's less invasive to the compiler and parser, and is quite intuitive. The preprocessing phase could still be useful for encoding [@tailrec] and the likes (annotate some function argument with [@continuation]? Would be quite useful in my code actually!) in terms of [@tailcall], but that's your choice.

About f [@tailcall] x y z, my opinion is that it reads a bit weird. What should be the meaning of f x [@tailcall] y z? Would it be (f x) [@tailcall] y z? It does look like a syntactic hack to me. People can write parenthesis when required.

@Drup
Copy link
Contributor

Drup commented Oct 31, 2014

I think it reads much less weird if you format it f[@tailcall] x y z. The attributes is attached to the previous expression, and that's pretty much it. I think it's quite simple. We all know whitespace insensitiveness can be weird at times, it's not an especially surprising example.

@thomassa
Copy link

@c-cube wrote:

What should be the meaning of f x [@tailcall] y z? Would it be (f x) [@tailcall] y z?

If it were valid then that's how I would read it, certainly: as saying that there is a tail-call of the function that results from applying f to x. I think it amounts to the same thing though: if we think of f a1 a2 a3 a4 as (((f a1) a2) a3) a4 then it starts looking as if the last call is not to f (which just acts on a1 to return a function which is then applied to a2) but rather to the anonymous function returned from (((f a1) a2) a3)... but it would be silly to have to write f a1 a2 a3 [@tailcall] a4.

@yallop: I agree.

@c-cube wrote:

This CPS example makes me wonder which syntax would be nice for expressing the constraint that in
let f x y z cont = ...
the continuation cont should always be called in tail-position

I agree that it would be nice to have some way to express that sort of constraint.
What exactly should the constraint allow though? Should function f be allowed to pass cont to some other function g instead of calling cont within f? If so, should g have to be in tail-position when called from f? When g receives cont, should the constraint apply there as well? I start to speculate vaguely about the notion of a special type or variety of function: one that can be called only as a tail-call (and passed as an argument only in a tail-call? Tricky...) (And perhaps a way to obtain a tail-only function from any function? Something with the feel of let tc_cont = tailonly cont in ...)

@gasche
Copy link
Member

gasche commented Oct 31, 2014

The CPS example is interesting because it does not seem to differ much from the proposed semantics of [@tailrec], despite being attached to a function argument rather than the function itself. It suggests that we may want to allow [@tailcall] to be attached to binding sites (rather than just call-sites) with the following semantics: all use-sites for this variable must be tail-calls.

This immediately suggests having an annotation [@nontail] (do we want to use [@tail] rather than [@tailcall]?) to explicitly mark some call-sites as "not tail-call but this is intentional and don't warn here despite the variable is bound with [@tail(call)]". This happens in practice with partially-CPS code on non-recursive data structures.

Note that there is still a difference between this proposed semantics of "[@tail] on binders" and the semantics of [@tailrec], as [@tailrec] would be applied to a definition-binding (let, with both a definition-body and a use-body) rater than abstraction-bindings (fun, only a use-body). The natural extension for [@tailcall] in definition-bindings is to warn about non-tail calls in the definition body and in the use body, and [@tailrec] behaves differently (it scopes exactly on the mutual recursion block but does not warn about non-tail uses outside it). I considered suggesting that this should always be the semantics of [@tailcall] on definition bindings, but keeping the warning on both definiton and use-sites is also a useful feature, consider code such as the following (not unusual):

let rec (f [@tailrec]) env param =
  let (f' [@tail]) = f env param in
  function
  | ... -> ... (f' x)
  | ... -> ... (f' y) ...
  | ... -> ... (f env' param z)

The call to f inside f' is in tail-position, but that doesn't help much if f' itself isn't also called in tail-position, which the [@tail(call)] annotation on its binding site concisely ensures -- with a semantics different from [@tailrec].

The preprocessing phase could still be useful for encoding [@tailrec] and the likes.

On the side of the implementation, it is indeed a reasonable idea to "implement" the [@tailrec] semantics by weaving additional [@tailcall] annotations through the code (on all call-sites in the mutual recursion block that are not already marked [@nontail]). (But it should be done as a post-preprocessing phase rather than at the same time as -ppx to avoid polluting the -dsource output). This weaving can be done using Ast_mapper, that is the same machinery used by ppx extensions. I'm not opposed to that, but this is strictly an implementation device (and maybe not the only one¹) rather than a user-exposed additional syntax.

¹: one problem with implementing [@tailrec] or [@tail(call)]-on-binders with Ast_mapper is that you would have to maintain an environment to avoid variable capture (a local let could hide the recursive function name and its own use should not be marked [@tailcall]), duplicating some work that (1) is done by the type-checker later on (2) could be done directly in simplif.ml to avoid having an additional traversal pass (done more efficiently as you already have the shadowing-avoiding unique names generated by the type-checker).

@gasche
Copy link
Member

gasche commented Oct 31, 2014

@yallop: indeed, prefix attributes are a reasonable idea -- and it can generalize to other use-case. That said, I'm not convinced it is necessary in this case, and I'm afraid the extension/attribute syntax is quite complex already (@, @@, @@@...), so I'm not sure we want to add yet another dimension of complexity. In any case, if people agree on (f [@tailcall]), it means we can always switch to ([^tailcall] f) -- they mean the same AST.

@alainfrisch
Copy link
Contributor

In the same vein as:

begin[@foo] f end

it had been discussed to support:

([@foo] f)

but this was rejected during the discussion. I'd still prefer this form over using yet another different symbol.

@thomassa
Copy link

@alainfrisch: Oh, I hadn't realised there was anywhere begin ... end could not be written as ( ... ).

@gasche: your idea of attaching the attribute to the binding-site is interesting, but I'm not clear on whether or not it would allow something like this:

let g t =
  ...
  t args

let f (t [@tail]) more_args =
  ...
  h (g t)

The significant point there is that although when t is called (in g) it is in tail-position, f's call to g is not a tail-call.
I think ideally we'd like it to forbid that, but allow f to end with g t since this has no stack-space cost.

@damiendoligez
Copy link
Member

Looks good to me.

@c-cube
Copy link
Contributor Author

c-cube commented Mar 16, 2015

ping @gasche for merging? ;)

@gasche
Copy link
Member

gasche commented Apr 12, 2015

Merged in trunk. Thanks for the patch and the compromises!

I remember at the OCaml coding sprint when we discussed the code, I was unconvinced by your choice of extending the loc parameter into a "record of non-important stuff", and discussed turning the whole payload of the Lapply constructor into a record instead. Now that #133 has been merged, I even more strongly think that this would have been the right choice. That said, the code works as is, and I don't have the time to rework it with the other representation. Anyone with the same impression should feel free to propose a patch to use a record for all parameters of the Lapply case.

@gasche gasche closed this Apr 12, 2015
@Drup
Copy link
Contributor

Drup commented Apr 14, 2015

@gasche I don't see this in trunk.

@Drup
Copy link
Contributor

Drup commented Apr 15, 2015

Never mind. Apparently the git mirror is lagging behind.

@bobzhang
Copy link
Member

bobzhang commented Jul 9, 2015

I have a dumb question, what is the use case for tailcall, is not it obvious by reading the code to see if it's tailcall. this seems to compilicate lambda intermediate representation for not too much benefit?

@pw374
Copy link

pw374 commented Jul 9, 2015

Sometimes one may write a tail call, then their code evolves and the call isn't a tail one any more. Typically one write a tail-recursive function, and then adds something to handle some exception, which wrecks the tail position.

On 09 Jul 2015, at 23:25, Hongbo Zhang notifications@github.com wrote:

I have a dumb question, what is the use case for tailcall, is not it obvious by reading the code to see if it's tailcall. this seems to compilicate lambda intermediate representation for not too much benefit?


Reply to this email directly or view it on GitHub.

@thomassa
Copy link

@bobzhang In practice, such mistakes are common enough that many people feel the feature is worthwhile.

@bobzhang
Copy link
Member

I agree it can be useful in some cases. but the implementation is not very modular, shall we design what kind of information should be passed down to lambda IL, if tomorrow we have another useful feature needed, shall we bloat lambda IL more?

btw, for this feature, I think [@tailrec] is more meaningful, I just annotate functions starting with let rec, instead of annotating every call site.

@gasche
Copy link
Member

gasche commented Jul 10, 2015

I don't see a problem with the implementation: this information is stored in a record of information about the application node. If we want to add a new feature of this kind, we can add a new field to the record, and most of the code (which uses the record by pattern-matching or projecting only a subset of its fields) will not need to be modified. (I would rather use a slightly different representation with a record for all informations instead of just the auxiliary information, as explained in this message, but the maintenance aspects discussed here are equivalent with both representations).

@pw374
Copy link

pw374 commented Jul 10, 2015

On 10 Jul 2015, at 15:15, Hongbo Zhang notifications@github.com wrote:
btw, for this feature, I think [@tailrec] is more meaningful, I just annotate functions starting with let rec, instead of annotating every call site.

A tailcall is not necessarily a tailrec call.
And a recursive tail call is not necessarily in a let rec.
The most general is "tailcall".
I could find tailrec useful if tailcall didn't exist, but that scenario wouldn't make much sense.

@bobzhang
Copy link
Member

@gasche see my comments in #181

stedolan added a commit to stedolan/ocaml that referenced this pull request Mar 14, 2017
chambart pushed a commit to chambart/ocaml-1 that referenced this pull request Aug 4, 2021
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Oct 5, 2021
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Mar 21, 2023
c703f5f777 Incorporate upstream comments into type-variable refactor (ocaml#121)
362ba2349f Constrain curry modes to increase along applications (ocaml#108)
b1f0cf9f91 Simplify the extension handling (ocaml#114)
4fd53a1f6f Remove pat_mode from typedtree (ocaml#105)
cf6fcbc129 Handle attributes on lambdas with locally abstract types (ocaml#120)
5fa80fe23f Don't track attributes inside attributes for warning 53 (ocaml#115)
8a69777a3c Handle unclosed `[: ... :]` patterns (via `Generic_array` machinery) (ocaml#117)
b0737f46c4 Add promote-one Makefile target (ocaml#118)
c6ad684608 Refactoring and fixes around module lookup (ocaml#107)
b0a649516b Add documentation for global constructor arguments (ocaml#69)
dd79aeca91 Print `nlocal` in the `-d(raw)lambda` output (ocaml#112)
8035026661 Fix `nlocal` in the generated Lambda for list comprehensions (ocaml#113)
afbcdf0642 Immutable arrays (ocaml#47)
bfe1490dfb fix several issues when removing exp_mode (ocaml#110)
8f46060dc5 Better error message for under-applied functions (ocaml#74)
27331d848d Consistently use Lmutvar or Lvar in comprehensions (ocaml#111)
01e965b549 Skip failing test for now
0131357265 Fix test case to use comprehensions_experimental
22a73684b7 Temporarily disable list comprehensions tests due to locals bug
e08377d2d1 Make `comprehensions` into `comprehensions_experimental` for now (ocaml#109)
947cf892b5 List and array comprehensions (ocaml#46)
bd9e051 remove exp_mode from typedtree (ocaml#100)
a9268d2 Fix misplaced attribute warning when using external parser (and some cleanup) (ocaml#101)
2b33f24 Refactor toplevel local escape check (ocaml#104)
ed2aec6 Comment functions exported from TyVarEnv.
87838ba Move new variable creation into TyVarEnv.
a3f60ab Encapsulate functions that work with tyvars
43d83a6 Prevent possibility of forgetting to re-widen
2f3dd34 Encapsulate context when narrowing type env't
d78ff6d Make immediate64 things mode cross (ocaml#97)
aa25ab9 Fix version number (ocaml#94)
d01ffa0 Fix .depend file (ocaml#93)
942f2ab Bootstrap (ocaml#92)
05f7e38 Check Menhir version (ocaml#91)
1569b58 Move the CI jobs from 4.12 to 4.14. (ocaml#90)

git-subtree-dir: ocaml
git-subtree-split: c703f5f7772dd4252405b086be11c15a3c67f2ac
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet