Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More precise warning 51 about tail calls #10426

Closed
wants to merge 3 commits into from

Conversation

xavierleroy
Copy link
Contributor

As reported in #10424, calls annotated with [@ocaml.tailcall] can emit no warning, yet be turned into non-tail calls by the ocamlopt back-end, if too many arguments are provided and some of them need to be passed on stack.

This PR takes into account the number of registers available for argument passing, warning about a potential non-tail call if the number of arguments exceeds the number of registers minus 1. (Minus one for the implicit environment argument.)

The warning is conservative but not exact. Here are two false alarms:

  • Tail calls to self are always correctly implemented by the back-end (as a jump to the function entry point). It's only tail calls to other functions or to unknown functions that cause problems.
  • If the function called does not need the "self" environment parameter one more register is available for passing normal parameters.

The text of the warning is rather bad and could use an @Octachron touch.

In passing, this PR also increases the number of registers used for parameter passing on zSystems/s390x (from 5 to 8) and on POWER (from 8 to 16).

This makes tailcalls more usable on this platform.
This makes tailcalls even more usable on this platform.
We take into account the number of registers available for argument passing.
(If some arguments need to be passed on stack, the back-end cannot generate
a tail call.)

The warning is conservative but not exact.  Here are two false alarms:

- Tail calls to self are always correctly implemented by the back-end
  (as a jump to the function entry point).  It's only tail calls to
  other functions or to unknown functions that cause problems.

- If the function called does not need the "self" environment parameter,
  one more register is available for passing normal parameters.

Closes: ocaml#10424
@gasche
Copy link
Member

gasche commented May 24, 2021

This PR is solving one issue, but it introduces another one: now a piece of code will warn or not depending on the target backend. In particular, if users develop on a many-registers machines and have enabled this warning as an error (sounds reasonable), then their code breaks once they try to compile it on a less-registers machine. I think they will complain about this.

We could do some of the following:

  • Clarify in the documentation that this warning is system-dependent. This will not prevent any user pain (who reads documentation anyway?), but at least they will feel less justified in complaining to us. Ahem.
  • Use as a limit for this warning the worst of all architectures, which may be not-so-bad especially with your recent "in passing" changes to architectures that no know except you knows about (and then you will ask for prompt review of that code).
  • Change the compilation strategy to limit maximal arity to prevent spilling (at the cost of an extra allocation), as we discussed. (This is probably best kept for a separate PR.)

@gasche
Copy link
Member

gasche commented May 24, 2021

Remark: the fact that the code breaks is probably something that users will actually appreciate: if they use [@tailcall] explicitly they want guarantees. The problem is that it breaks later, once testing on a less-common architecture, possibly at their customer's / user's end.

@lthls
Copy link
Contributor

lthls commented May 24, 2021

There are a few corner cases with tupled functions which can cause the check to pass but still lead to non-tail calls.
If we want to offer guarantees to users of [@tailcall], we should probably do something about it.

A first simple one:

let[@inlined never] f (a, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _) = a

let g () = (f[@tailcall]) ((), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), ())

A slightly more tricky one (the opaque_identity is only there to ensure that f is not called with an obvious tuple):

let rec f (a, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _) =
  let tuple = (a, a, a, a, a, a, a, a, a, a, a, a, a, a, a, a, a, a, a, a) in
  (f[@tailcall]) (Sys.opaque_identity tuple)

Here we get different behaviours depending on the flambda setting. With flambda on, this is ultimately compiled to a self tail call, but without flambda this compiled to a tail call to the caml_tuplify20 function. This function then calls back f as a non-tail call, so the stack ends up growing at each call.

@craigfe
Copy link
Contributor

craigfe commented May 24, 2021

Here we get different behaviours depending on the flambda setting.

There is also this related issue about an existing flambda-sensitive behaviour of [@tailcall] that might be worth documenting as part of the same improvement. That case concerns a false-positive rather than a false-negative, so is perhaps not so problematic.

@xavierleroy
Copy link
Contributor Author

xavierleroy commented May 24, 2021

I hear #10424 as "power users want the OCaml compiler to tell what's really going on". That the warning depends on the target architecture reflects what's really going on in the back-end. I do regret that it cannot reflect exactly what's going on in the back-end and still has false positives because of that. But imposing the lowest common denominator between all platforms feels too far from what's really going on.

The issue with tupled functions is a concern. Not sure what to do about it yet.

I'm having cold feet w.r.t. the "limit function arities a priori" approach, because it will produce very poor code for self tail calls and non tail calls.

@alainfrisch
Copy link
Contributor

Probably a very naive question, but cannot tail calls be supported even when arguments need to be passed on the stack?

@xavierleroy
Copy link
Contributor Author

Probably a very naive question, but cannot tail calls be supported even when arguments need to be passed on the stack?

I tried a long time ago. Twice, actually. Both times I ran into a wall.

The main problem: currently, for the purpose of scanning the stack for GC roots, arguments passed on stack are described as part of the caller's frame. But in case of a non-tail call to a function that tail calls, this is incorrect: the arguments passed on stack should be described as part of the callee's frame (except before the call, when the arguments are being assembled on the stack, when they are still part of the caller's frame). I could not figure out how to do that.

The other problem is that the code that reshapes the stack frame at tailcall time becomes much more intricate, adding complexity to each back end.

@alainfrisch
Copy link
Contributor

Thanks, I see. I guess that always describing arguments as part of the callee's frame (after the call) would work in theory, but is a pretty large change. Or are you aware of drawbacks with describing arguments in the callee's frame?

@xavierleroy
Copy link
Contributor Author

I guess that always describing arguments as part of the callee's frame (after the call) would work in theory, but is a pretty large change.

"Always" is not possible: when the arguments are being prepared, before the actual call, they must belong to the caller's frame, as the callee's frame doesn't exist. The frame descriptors must reflect this hand-off of stack slots from the caller to the callee. Also, the return address is no longer at the top of the frame, but somewhere in the middle, with the arguments passed on stack above it.

Or are you aware of drawbacks with describing arguments in the callee's frame?

More complex stack frame descriptors (to the point that I'm not sure what should go in them) + significant changes in all code emitters to implement "Pascal" treatment of arguments passed on stack (the callee is responsible for popping the arguments on return, not the caller, like C does and OCaml currently does too).

@xavierleroy
Copy link
Contributor Author

I'm not happy with this PR, so I'm closing it. See #10578 for the part of this PR that I intend to commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants