At tailcall #111

c-cube · 2014-10-29T23:28:50Z

Add an attribute [@tailcall] such that f args [@tailcall] warns (with the new warning 50) if the call is not tail-recursive. See the ocaml hacking item.

The patch is twofold:

A Lapply expression (in lambda code) is annotated with a record, rather than a location,
with a boolean within this record that specifies whether the call is expected to be in tail position. This flag
is set if the corresponding Typedtree.expr has a "tailcall" attribute.
In Simplif, if the warning is enabled, tail calls are annotated, and if a call is expected to be in tail position
but isnt't, the warning is emitted.

Also added two tests in testsuite/tests/warnings/.

yallop · 2014-10-29T23:35:15Z

/cc @euanh

euanh · 2014-10-30T09:48:58Z

/cc @simonjbeaumont

simonjbeaumont · 2014-10-30T09:58:43Z

Cool! I've also had a stab at the same problem. I also found that doing this in the compiler would mean threading through the annotations into the Lambda layer (passed the Transl stage).

It seemed like that was a clear separation from the Typedtree to the Lambda so instead I wrote a cmt parser:

https://github.com/simonjbeaumont/ocaml-tailrec

It can be compiled with make and has two more targets: make test_pass and make test_fail. It is a bit of a proof of concept at the moment. Sadly the extension point ppx strategy didn't suffice since it only gives you a hook at the untyped AST stage.

Cool to see this in the compiler though. I'd hacked up a similar approach but not quite made it.

c-cube · 2014-10-30T10:03:19Z

A ppx strategy would be useful to implement [@tailrec] in terms of [@tailcall] (as suggested there).

It's interesting to notice that this PR uses the function Simplif.emit_tail_infos, that used to be called only when generating .cmt files, to check whether annotated call sites are indeed in tail position.

chambart · 2014-10-30T10:14:43Z

bytecomp/simplif.ml

@@ -516,7 +516,9 @@ let rec emit_tail_infos is_tail lambda =
  match lambda with
  | Lvar _ -> ()
  | Lconst _ -> ()
-  | Lapply (func, l, loc) ->
+  | Lapply (func, l, ({apply_loc=loc} as info)) ->
+      if info.apply_should_be_tailcall && not is_tail


You should test here if the warning is enabled. emit_tail_info, is allso called when -annot is used.

chambart · 2014-10-30T10:15:45Z

You should enable this warning by default since it won't warn if you didn't add annotations explicitely.

chambart · 2014-10-30T10:18:40Z

bytecomp/simplif.ml

-  | Lapply (func, l, loc) ->
+  | Lapply (func, l, ({apply_loc=loc} as info)) ->
+      if info.apply_should_be_tailcall && not is_tail
+        then Location.prerr_warning loc Warnings.Expect_tailcall;
      list_emit_tail_infos false l;
      Stypes.record (Stypes.An_call (loc, call_kind l))


You should test if -annot is set before emitting informations

simonjbeaumont · 2014-10-30T11:34:58Z

First off, hats off to @c-cube! I'm pleased I wasn't the only one who thought it might be useful.

I think @thomassa makes a good point. I found that the natural way of annotating the final line of with [@tailcall] you end up capturing a tail-call that is not the application of the function you want to assert is tail-recursive.

That was partly why I abandoned that approach and went instead for annotating a function definition as [@tailrec] (cf. annotating the call-site). Then I checked that all applications of this function within the body were tail-calls which felt a little better.

Is it not also a little suspect to thread through the annotation from the TypedTree to Lambda for just the Lapply. It kind of feels like annotations should exist in the Lambda stage or not (not just for one of the variants).

lpw25 · 2014-10-30T12:12:14Z

Why not just add support for f [@tailcall] x y? Being able to accurately annotate function calls seems useful for quite a few things, and it looks quite simple to add this case. Note that, for consistency, the attribute should be attached to f rather than f x y in the AST since that is what f [@foo] means in other contexts.

c-cube · 2014-10-30T12:40:22Z

@lpw25 I guess a Lapply (f, args, _) should also be tailcall if f has been translated from a [@tailcall]'ed expression, so that f [@tailcall] a b c works. I think this version is more readable too, especially w.r.t. parenthesing.

Otoh I don't think ditching [@tailcall] in favor of [@tailrec] is enough. Both can be useful. Most of the time [@tailrec] is handy, but it can be expressed (with a ppx?) in terms of [@tailcall]. For some other uses -- I have in mind continuation-passing style -- more flexibility is needed.

This CPS example makes me wonder which syntax would be nice for expressing the constraint that in

let f x y z cont = ...

the continuation cont should always be called in tail-position, precisely not to eat stack space. This can also be expressed in terms of [@tailcall] but I don't know how to express the higher-level abstraction.

lpw25 · 2014-10-30T12:53:30Z

@lpw25 I guess a Lapply (f, args, _) should also be tailcall if f has been translated from a [@tailcall]'ed expression, so that f [@tailcall] a b c works. I think this version is more readable too, especially w.r.t. parenthesing.

Yes, and you need to add support for f [@foo] x y to the parser. This probably just means changing | simple_expr simple_labeled_expr_list to | simple_expr attribute simple_labeled_expr_list in expr, but I haven't checked it properly for conflicts.

gasche · 2014-10-30T13:26:57Z

I think we should make a choice between either fact (n-1) [@tailcall] or (fact [@tailcall]) (n-1), rather than support both. I'm personally in favor of (fact [@tailcall]) (n-1), which I find more natural and robust, and would support enabling the id [@attribute] syntax to be used as applicative-left-hand-side (but that should be a separate patch).

yallop · 2014-10-30T13:37:35Z

Ideally I think we'd have a prefix attribute: [^tailcall] fact (n - 1), since tailcall behaves rather like a built-in operation. Perhaps fact [@tailcall] (n-1) is the next best thing.

c-cube · 2014-10-30T18:22:23Z

It looks like supporting fact [@tailcall] (n-1) does introduce conflicts, sadly. For now could we settle on the uglier (fact [@tailcall]) (n-1), which at least is robust, and deal with parser issues separatly?

pw374 · 2014-10-30T18:25:27Z

To me it'd make much more sense if [@tailcall] had to be prefix. o_O

c-cube · 2014-10-30T18:45:34Z

@pw374 that would make sense, I agree. A solution could be to allow [%tailcall fact (n-1)] and then rewrite this, internally, into (fact (n-1) [@tailcall])... At least it looks like "tailcall" is a primitive operation.

lpw25 · 2014-10-30T19:46:38Z

It looks like supporting fact @tailcall does introduce conflicts, sadly.

For the record, I believe this conflict is artificial and could be avoided by replacing the rule:

| expr attribute

in the definition of expr by an attributes on the end of every rule in expr. However, this is quite a bit of work, so may not be worth the effort. Also the use of various precedences in the expression grammar makes it difficult to judge the correctness of this transformation.

c-cube · 2014-10-30T19:56:06Z

Just added a preprocessing phase to compilation so that [%tailcall foo] becomes foo [@tailcall]. This phase will also be useful for [@tailrec] or other annotations that can rely on [@tailcall].

I had issues with the build system, so there are almost surely lots of mistakes.

gasche · 2014-10-31T08:32:34Z

I'm not happy with [%tailcall ]. Extensions and attributes have a clear semantics: extensions is for what is not valid OCaml code, attributes are to add cross-cutting concerns to valid OCaml code. tailcall is an attribute, not an extension.

Please:

stop trying to abuse the extension mechanism to your particular aesthetic taste. The very point of extensions is to have a more robust structure than Camlp4 at the cost of a heavier syntax, and repeatedly turning this into a steaming pile of hacks will not help in the long-term
pick one syntax for the tailcall annotation that is relatively consensual and use it. Not two syntaxes that rewrite to each other, or to avoid making a choice.

Syntax design is hard. The solution is not to allow everything.

(We can discuss the conflict issue later. Maybe careful massaging of the grammar would let us have the nice things. We can try that later if the chosen syntax for the attribute leds itself to this extension without introducing two separate syntaxes.)

c-cube · 2014-10-31T10:36:24Z

@gasche: ok, fine. I'll get back to f x y z [@tailcall] then, it's less invasive to the compiler and parser, and is quite intuitive. The preprocessing phase could still be useful for encoding [@tailrec] and the likes (annotate some function argument with [@continuation]? Would be quite useful in my code actually!) in terms of [@tailcall], but that's your choice.

About f [@tailcall] x y z, my opinion is that it reads a bit weird. What should be the meaning of f x [@tailcall] y z? Would it be (f x) [@tailcall] y z? It does look like a syntactic hack to me. People can write parenthesis when required.

Drup · 2014-10-31T11:25:30Z

I think it reads much less weird if you format it f[@tailcall] x y z. The attributes is attached to the previous expression, and that's pretty much it. I think it's quite simple. We all know whitespace insensitiveness can be weird at times, it's not an especially surprising example.

thomassa · 2014-10-31T12:39:13Z

@c-cube wrote:

What should be the meaning of f x [@tailcall] y z? Would it be (f x) [@tailcall] y z?

If it were valid then that's how I would read it, certainly: as saying that there is a tail-call of the function that results from applying f to x. I think it amounts to the same thing though: if we think of f a1 a2 a3 a4 as (((f a1) a2) a3) a4 then it starts looking as if the last call is not to f (which just acts on a1 to return a function which is then applied to a2) but rather to the anonymous function returned from (((f a1) a2) a3)... but it would be silly to have to write f a1 a2 a3 [@tailcall] a4.

@yallop: I agree.

@c-cube wrote:

This CPS example makes me wonder which syntax would be nice for expressing the constraint that in
let f x y z cont = ...
the continuation cont should always be called in tail-position

I agree that it would be nice to have some way to express that sort of constraint.
What exactly should the constraint allow though? Should function f be allowed to pass cont to some other function g instead of calling cont within f? If so, should g have to be in tail-position when called from f? When g receives cont, should the constraint apply there as well? I start to speculate vaguely about the notion of a special type or variety of function: one that can be called only as a tail-call (and passed as an argument only in a tail-call? Tricky...) (And perhaps a way to obtain a tail-only function from any function? Something with the feel of let tc_cont = tailonly cont in ...)

gasche · 2014-10-31T16:21:50Z

The CPS example is interesting because it does not seem to differ much from the proposed semantics of [@tailrec], despite being attached to a function argument rather than the function itself. It suggests that we may want to allow [@tailcall] to be attached to binding sites (rather than just call-sites) with the following semantics: all use-sites for this variable must be tail-calls.

This immediately suggests having an annotation [@nontail] (do we want to use [@tail] rather than [@tailcall]?) to explicitly mark some call-sites as "not tail-call but this is intentional and don't warn here despite the variable is bound with [@tail(call)]". This happens in practice with partially-CPS code on non-recursive data structures.

Note that there is still a difference between this proposed semantics of "[@tail] on binders" and the semantics of [@tailrec], as [@tailrec] would be applied to a definition-binding (let, with both a definition-body and a use-body) rater than abstraction-bindings (fun, only a use-body). The natural extension for [@tailcall] in definition-bindings is to warn about non-tail calls in the definition body and in the use body, and [@tailrec] behaves differently (it scopes exactly on the mutual recursion block but does not warn about non-tail uses outside it). I considered suggesting that this should always be the semantics of [@tailcall] on definition bindings, but keeping the warning on both definiton and use-sites is also a useful feature, consider code such as the following (not unusual):

let rec (f [@tailrec]) env param =
  let (f' [@tail]) = f env param in
  function
  | ... -> ... (f' x)
  | ... -> ... (f' y) ...
  | ... -> ... (f env' param z)

The call to f inside f' is in tail-position, but that doesn't help much if f' itself isn't also called in tail-position, which the [@tail(call)] annotation on its binding site concisely ensures -- with a semantics different from [@tailrec].

The preprocessing phase could still be useful for encoding [@tailrec] and the likes.

On the side of the implementation, it is indeed a reasonable idea to "implement" the [@tailrec] semantics by weaving additional [@tailcall] annotations through the code (on all call-sites in the mutual recursion block that are not already marked [@nontail]). (But it should be done as a post-preprocessing phase rather than at the same time as -ppx to avoid polluting the -dsource output). This weaving can be done using Ast_mapper, that is the same machinery used by ppx extensions. I'm not opposed to that, but this is strictly an implementation device (and maybe not the only one¹) rather than a user-exposed additional syntax.

¹: one problem with implementing [@tailrec] or [@tail(call)]-on-binders with Ast_mapper is that you would have to maintain an environment to avoid variable capture (a local let could hide the recursive function name and its own use should not be marked [@tailcall]), duplicating some work that (1) is done by the type-checker later on (2) could be done directly in simplif.ml to avoid having an additional traversal pass (done more efficiently as you already have the shadowing-avoiding unique names generated by the type-checker).

gasche · 2014-10-31T16:34:40Z

@yallop: indeed, prefix attributes are a reasonable idea -- and it can generalize to other use-case. That said, I'm not convinced it is necessary in this case, and I'm afraid the extension/attribute syntax is quite complex already (@, @@, @@@...), so I'm not sure we want to add yet another dimension of complexity. In any case, if people agree on (f [@tailcall]), it means we can always switch to ([^tailcall] f) -- they mean the same AST.

alainfrisch · 2014-10-31T17:12:56Z

In the same vein as:

begin[@foo] f end

it had been discussed to support:

([@foo] f)

but this was rejected during the discussion. I'd still prefer this form over using yet another different symbol.

thomassa · 2014-10-31T17:36:53Z

@alainfrisch: Oh, I hadn't realised there was anywhere begin ... end could not be written as ( ... ).

@gasche: your idea of attaching the attribute to the binding-site is interesting, but I'm not clear on whether or not it would allow something like this:

let g t =
  ...
  t args

let f (t [@tail]) more_args =
  ...
  h (g t)

The significant point there is that although when t is called (in g) it is in tail-position, f's call to g is not a tail-call.
I think ideally we'd like it to forbid that, but allow f to end with g t since this has no stack-space cost.

damiendoligez · 2014-12-08T19:53:19Z

Looks good to me.

c-cube · 2015-03-16T12:14:44Z

ping @gasche for merging? ;)

gasche · 2015-04-12T18:29:04Z

Merged in trunk. Thanks for the patch and the compromises!

I remember at the OCaml coding sprint when we discussed the code, I was unconvinced by your choice of extending the loc parameter into a "record of non-important stuff", and discussed turning the whole payload of the Lapply constructor into a record instead. Now that #133 has been merged, I even more strongly think that this would have been the right choice. That said, the code works as is, and I don't have the time to rework it with the other representation. Anyone with the same impression should feel free to propose a patch to use a record for all parameters of the Lapply case.

Drup · 2015-04-14T23:53:23Z

@gasche I don't see this in trunk.

Drup · 2015-04-15T00:02:11Z

Never mind. Apparently the git mirror is lagging behind.

bobzhang · 2015-07-09T21:25:37Z

I have a dumb question, what is the use case for tailcall, is not it obvious by reading the code to see if it's tailcall. this seems to compilicate lambda intermediate representation for not too much benefit?

pw374 · 2015-07-09T21:51:47Z

Sometimes one may write a tail call, then their code evolves and the call isn't a tail one any more. Typically one write a tail-recursive function, and then adds something to handle some exception, which wrecks the tail position.

On 09 Jul 2015, at 23:25, Hongbo Zhang notifications@github.com wrote:

I have a dumb question, what is the use case for tailcall, is not it obvious by reading the code to see if it's tailcall. this seems to compilicate lambda intermediate representation for not too much benefit?

—
Reply to this email directly or view it on GitHub.

thomassa · 2015-07-10T08:35:33Z

@bobzhang In practice, such mistakes are common enough that many people feel the feature is worthwhile.

bobzhang · 2015-07-10T13:15:28Z

I agree it can be useful in some cases. but the implementation is not very modular, shall we design what kind of information should be passed down to lambda IL, if tomorrow we have another useful feature needed, shall we bloat lambda IL more?

btw, for this feature, I think [@tailrec] is more meaningful, I just annotate functions starting with let rec, instead of annotating every call site.

gasche · 2015-07-10T13:22:45Z

I don't see a problem with the implementation: this information is stored in a record of information about the application node. If we want to add a new feature of this kind, we can add a new field to the record, and most of the code (which uses the record by pattern-matching or projecting only a subset of its fields) will not need to be modified. (I would rather use a slightly different representation with a record for all informations instead of just the auxiliary information, as explained in this message, but the maintenance aspects discussed here are equivalent with both representations).

pw374 · 2015-07-10T13:31:00Z

On 10 Jul 2015, at 15:15, Hongbo Zhang notifications@github.com wrote:
btw, for this feature, I think [@tailrec] is more meaningful, I just annotate functions starting with let rec, instead of annotating every call site.

A tailcall is not necessarily a tailrec call.
And a recursive tail call is not necessarily in a let rec.
The most general is "tailcall".
I could find tailrec useful if tailcall didn't exist, but that scenario wouldn't make much sense.

bobzhang · 2015-08-14T03:36:58Z

@gasche see my comments in #181

Fix GC stats

c703f5f777 Incorporate upstream comments into type-variable refactor (ocaml#121) 362ba2349f Constrain curry modes to increase along applications (ocaml#108) b1f0cf9f91 Simplify the extension handling (ocaml#114) 4fd53a1f6f Remove pat_mode from typedtree (ocaml#105) cf6fcbc129 Handle attributes on lambdas with locally abstract types (ocaml#120) 5fa80fe23f Don't track attributes inside attributes for warning 53 (ocaml#115) 8a69777a3c Handle unclosed `[: ... :]` patterns (via `Generic_array` machinery) (ocaml#117) b0737f46c4 Add promote-one Makefile target (ocaml#118) c6ad684608 Refactoring and fixes around module lookup (ocaml#107) b0a649516b Add documentation for global constructor arguments (ocaml#69) dd79aeca91 Print `nlocal` in the `-d(raw)lambda` output (ocaml#112) 8035026661 Fix `nlocal` in the generated Lambda for list comprehensions (ocaml#113) afbcdf0642 Immutable arrays (ocaml#47) bfe1490dfb fix several issues when removing exp_mode (ocaml#110) 8f46060dc5 Better error message for under-applied functions (ocaml#74) 27331d848d Consistently use Lmutvar or Lvar in comprehensions (ocaml#111) 01e965b549 Skip failing test for now 0131357265 Fix test case to use comprehensions_experimental 22a73684b7 Temporarily disable list comprehensions tests due to locals bug e08377d2d1 Make `comprehensions` into `comprehensions_experimental` for now (ocaml#109) 947cf892b5 List and array comprehensions (ocaml#46) bd9e051 remove exp_mode from typedtree (ocaml#100) a9268d2 Fix misplaced attribute warning when using external parser (and some cleanup) (ocaml#101) 2b33f24 Refactor toplevel local escape check (ocaml#104) ed2aec6 Comment functions exported from TyVarEnv. 87838ba Move new variable creation into TyVarEnv. a3f60ab Encapsulate functions that work with tyvars 43d83a6 Prevent possibility of forgetting to re-widen 2f3dd34 Encapsulate context when narrowing type env't d78ff6d Make immediate64 things mode cross (ocaml#97) aa25ab9 Fix version number (ocaml#94) d01ffa0 Fix .depend file (ocaml#93) 942f2ab Bootstrap (ocaml#92) 05f7e38 Check Menhir version (ocaml#91) 1569b58 Move the CI jobs from 4.12 to 4.14. (ocaml#90) git-subtree-dir: ocaml git-subtree-split: c703f5f7772dd4252405b086be11c15a3c67f2ac

add all OCaml workshops

chambart reviewed Oct 30, 2014
View reviewed changes

c-cube force-pushed the at_tailcall branch 2 times, most recently from 9186116 to 594d444 Compare October 30, 2014 20:36

c-cube force-pushed the at_tailcall branch from 594d444 to 7e60249 Compare October 31, 2014 10:37

gasche mentioned this pull request Dec 16, 2014

Automatically insert source location #126

Open

c-cube force-pushed the at_tailcall branch from 82f4f69 to 2a38d8d Compare January 13, 2015 21:06

c-cube added 4 commits January 13, 2015 22:44

implementation of @tailcall, new warning 50 (Expect_tailcall), + tests

bc801a5

only emit warning 50 (Expect_tailcall) if its active

e340791

in Simplif.emit_tail_infos, only emit annotations if required

75b76d5

favor (f [@tailcall]) x y rather than (f x y) [@tailcall]

e49f054

c-cube force-pushed the at_tailcall branch from 2a38d8d to e49f054 Compare January 13, 2015 21:44

damiendoligez mentioned this pull request Mar 16, 2015

Attach documentation comments to Parsetree #149

Closed

gasche closed this Apr 12, 2015

nojb mentioned this pull request Apr 18, 2015

Turn Lapply case of Lambda.lambda into a record #169

Merged

trefis mentioned this pull request Apr 22, 2015

Add an inline attribute on functions #173

Closed

stedolan added a commit to stedolan/ocaml that referenced this pull request Mar 14, 2017

Merge pull request ocaml#111 from ocamllabs/gc-stats

4860a5e

Fix GC stats

vicuna mentioned this pull request Mar 14, 2019

OCaml SIGSEGV in invert_pointer_at() (OCaml Garbage Collector / Compaction) #7431

Closed

chambart pushed a commit to chambart/ocaml-1 that referenced this pull request Aug 4, 2021

Use Closure for stage 0 in Flambda 2 mode (ocaml#111)

9ab3ba1

stedolan pushed a commit to stedolan/ocaml that referenced this pull request Oct 5, 2021

Use Closure for stage 0 in Flambda 2 mode (ocaml#111)

3375a8c

EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024

Homepage: update stats copy (ocaml#111)

e4088b7

EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024

Merge pull request ocaml#111 from patricoferris/add-all-workshops

bbaef0f

add all OCaml workshops

At tailcall #111

At tailcall #111

Conversation

c-cube commented Oct 29, 2014

yallop commented Oct 29, 2014

euanh commented Oct 30, 2014

simonjbeaumont commented Oct 30, 2014

c-cube commented Oct 30, 2014

chambart Oct 30, 2014

Choose a reason for hiding this comment

chambart commented Oct 30, 2014

chambart Oct 30, 2014

Choose a reason for hiding this comment

simonjbeaumont commented Oct 30, 2014

lpw25 commented Oct 30, 2014

c-cube commented Oct 30, 2014

lpw25 commented Oct 30, 2014

gasche commented Oct 30, 2014

yallop commented Oct 30, 2014

c-cube commented Oct 30, 2014

pw374 commented Oct 30, 2014

c-cube commented Oct 30, 2014

lpw25 commented Oct 30, 2014

c-cube commented Oct 30, 2014

gasche commented Oct 31, 2014

c-cube commented Oct 31, 2014

Drup commented Oct 31, 2014

thomassa commented Oct 31, 2014

gasche commented Oct 31, 2014

gasche commented Oct 31, 2014

alainfrisch commented Oct 31, 2014

thomassa commented Oct 31, 2014

damiendoligez commented Dec 8, 2014

c-cube commented Mar 16, 2015

gasche commented Apr 12, 2015

Drup commented Apr 14, 2015

Drup commented Apr 15, 2015

bobzhang commented Jul 9, 2015

pw374 commented Jul 9, 2015

thomassa commented Jul 10, 2015

bobzhang commented Jul 10, 2015

gasche commented Jul 10, 2015

pw374 commented Jul 10, 2015

bobzhang commented Aug 14, 2015