Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Sys.frame_pointers #9842

Closed
wants to merge 2 commits into from
Closed

Add Sys.frame_pointers #9842

wants to merge 2 commits into from

Conversation

lpw25
Copy link
Contributor

@lpw25 lpw25 commented Aug 14, 2020

I recently had a user ask how to know whether the current OCaml program was compiled with frame pointers. I was slightly surprised that it wasn't available in Sys, so this PR makes that information available in Sys.

Copy link
Contributor

@mshinwell mshinwell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good and I think would be a useful addition.

stdlib/sys.mlp Outdated
@@ -35,6 +35,7 @@ external unix : unit -> bool = "%ostype_unix"
external win32 : unit -> bool = "%ostype_win32"
external cygwin : unit -> bool = "%ostype_cygwin"
external get_backend_type : unit -> backend_type = "%backend_type"
external frame_pointers : unit -> bool = "caml_sys_frame_pointers"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make this noalloc.

Copy link
Member

@dra27 dra27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addition of Sys.frame_pointers I agree is fine, but we don't need a primitive to expose a static constant... patch for generating it in the same way as Config.with_frame_pointers enclosed as review comments.

stdlib/sys.mlp Outdated
@@ -45,6 +46,7 @@ let int_size = int_size ()
let unix = unix ()
let win32 = win32 ()
let cygwin = cygwin ()
let frame_pointers = frame_pointers ()
Copy link
Member

@dra27 dra27 Aug 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let frame_pointers = frame_pointers ()
let frame_pointers = %%WITH_FRAME_POINTERS%% && backend_type = Native

and delete the external above (cf. utils/config.mlp)

runtime/sys.c Outdated
Comment on lines 643 to 650
CAMLprim value caml_sys_frame_pointers(value unit)
{
#if defined(NATIVE_CODE) && defined(WITH_FRAME_POINTERS)
return Val_true;
#else
return Val_false;
#endif
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need for this primitive. Edit stdlib/Makefile instead:

--- a/stdlib/Makefile
+++ b/stdlib/Makefile
@@ -180,8 +180,11 @@ stdlib.cma: $(OBJS)
 stdlib.cmxa: $(OBJS:.cmo=.cmx)
        $(CAMLOPT) -a -o $@ $^

-sys.ml: $(ROOTDIR)/VERSION sys.mlp
-       sed -e "s|%%VERSION%%|`sed -e 1q $< | tr -d '\r'`|" sys.mlp > $@
+VERSION = $(shell sed -e 1q $(ROOTDIR)/VERSION | tr -d '\r')
+sys.ml: sys.mlp $(ROOTDIR)/VERSION
+       sed $(call SUBST_STRING,VERSION) \
+           $(call SUBST,WITH_FRAME_POINTERS) \
+           $< > $@

 .PHONY: clean
 clean::

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Is adding complexity to the Makefile better than adding code in C and OCaml? I'm unsure that it is. What about the dune build?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how utils/config.ml is generated - it adds no complexity at all... the functions are already there. Dune is irrelevant to the consideration (FWIW, it would require a switch for stdlib/sys.ml to be generated in the same way config.ml, which happens to be by calling make).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is adding complexity to the Makefile better than adding code in C and OCaml?

We are big fans of doing things statically rather than dynamically so yes, definitely. Especially when all the machinery is already available.

@dra27
Copy link
Member

dra27 commented Aug 14, 2020

As an aside, is linking compiler-libs and using Config not an acceptable solution to the original problem?

@mshinwell
Copy link
Contributor

I don't think forcing programs to link compiler-libs to read this property is a good idea. This sort of property might be useful for programs to introspect themselves, and you wouldn't want to have to link all of those programs against compiler-libs.

@lpw25
Copy link
Contributor Author

lpw25 commented Aug 14, 2020

@dra27 I'm not keen on doing the frame_pointers value completely differently from all the other Sys values. Why would your suggested change not apply to those?

@dra27
Copy link
Member

dra27 commented Aug 14, 2020

frame_pointers is I believe completely different from all the others - it's actually constant (everything else is from the environment)

@lpw25
Copy link
Contributor Author

lpw25 commented Aug 14, 2020

I don't understand what you mean by the environment. This seems exactly the same as my frame pointers function:

CAMLprim value caml_sys_const_big_endian(value unit)
{
#ifdef ARCH_BIG_ENDIAN
  return Val_true;
#else
  return Val_false;
#endif
}

@lpw25
Copy link
Contributor Author

lpw25 commented Aug 14, 2020

Or perhaps this one is a better example since the other one is only used in bytecode:

CAMLprim value caml_sys_get_config(value unit)
{
  CAMLparam0 ();   /* unit is unused */
  CAMLlocal2 (result, ostype);

  ostype = caml_copy_string(OCAML_OS_TYPE);
  result = caml_alloc_small (3, 0);
  Field(result, 0) = ostype;
  Field(result, 1) = Val_long (8 * sizeof(value));
#ifdef ARCH_BIG_ENDIAN
  Field(result, 2) = Val_true;
#else
  Field(result, 2) = Val_false;
#endif
  CAMLreturn (result);
}

@dra27
Copy link
Member

dra27 commented Aug 14, 2020

os_type, big-endian etc. are properties of the runtime, so you can't know them until you're linked with a runtime (for native code of course, most if not all these things are actually constant) - frame pointers is a property of both the runtime and the code you just built - i.e. it's statically known when you build the module rather than only when you run it.

Sorry about this - I wasn't expecting putting true or false directly in an OCaml file versus writing a C function to return a statically determined 1 or 0 as an OCaml value and adding more OCaml to run that function at startup to be quite such a drama!

@lpw25
Copy link
Contributor Author

lpw25 commented Aug 14, 2020

I wasn't expecting putting true or false directly in an OCaml file

To be fair you are proposing doing meta-programming in a Makefile -- I'll take C over that any day.

so you can't know them until you're linked with a runtime

Ok, so you're saying that the reason that the rest of Sys is not implemented that way is that in bytecode the values are not actually constant. Just to check, we really allow bytecode to run on an ocamlrun that has a different endianness, word size, etc. than when compiling?

@dra27
Copy link
Member

dra27 commented Aug 14, 2020

To be fair you are proposing doing meta-programming in a Makefile -- I'll take C over that any day.

Sure, but as I said the file is already "meta-programmed" for ocaml_version

Ok, so you're saying that the reason that the rest of Sys is not implemented that way is that in bytecode the values are not actually constant. Just to check, we really allow bytecode to run on an ocamlrun that has a different endianness, word size, etc. than when compiling?

Endianness I don't know (that was all put in for #5774) - word_size and int_size definitely - you can run bytecode on both 32bit and 64 bit runtimes.

@lpw25
Copy link
Contributor Author

lpw25 commented Aug 14, 2020

Ok, let's do it your way then. I might add a comment somewhere too to try and make it clearer why the current code is how it is.

@xavierleroy
Copy link
Contributor

Just to check, we really allow bytecode to run on an ocamlrun that has a different endianness, word size, etc. than when compiling?

And different OS family, yes. Bytecode is that portable. That's why the corresponding constants in Sys must be determined at program start-up time. On the other hand, features of the runtime system that are not relevant to bytecode, like the Sys.frame_pointers proposed here, can be determined statically.

@xavierleroy
Copy link
Contributor

There's a variant of Wadler's law that says that the most innocuous-looking OCaml pull requests are the most scrutinized. Let me continue this tradition.

  • Is the "frame pointers" variant of ocamlopt still relevant? Otherwise I'd love to get rid of it and the additional warts it creates in the amd64 code generator. For the record, It was introduced to support "perf" profiling at the time when "perf" would not use anything but frame pointers to analyze call stacks. I've heard that for several years already "perf" has been able to use DWARF call-frame information to analyze call stacks.

  • I need to be convinced that an OCaml program needs to behave differently whether frame pointers are maintained or not. Introspection is going very far here... Any concrete example?

  • The alternative approach of using the Config module from compilerlibs was brushed away too quickly.

@lpw25
Copy link
Contributor Author

lpw25 commented Aug 17, 2020

Is the "frame pointers" variant of ocamlopt still relevant?

Unfortunately the performance of "perf" using DWARF call-frame information is terrible. Since reading the DWARF live is too slow it basically memcpys the entire stack into the profile and decodes it later. This leads to extremely large profiles and needing to use much lower sampling rates. As upsetting as it is to dynamically calculate statically known data, compiling with frame pointers remains the best way to profile OCaml programs. One day I hope that perf will be able to take backtraces based on a static description of the stack -- for example linux has the ORC project to solve basically the same problem -- but it doesn't look like happening any time soon.

I need to be convinced that an OCaml program needs to behave differently whether frame pointers are maintained or not.

The specific use case where this came up was for a process to call perf on itself. In particular, it is useful to be able to send an RPC to a service in production and have it return data from perfing itself. Here you want to tell perf to use frame-pointers if they are available and fall back to DWARF otherwise.

The alternative approach of using the Config module from compilerlibs was brushed away too quickly.

We would like to enable this self-perfing functionality on the majority our executables. IIUC using Config would force us to link compiler-libs into all these executables, which I'd rather not do.

@mshinwell
Copy link
Contributor

perf also supports recording via the processor's LBR history on Linux x86-64, but the feedback from users is that this is problematic too. One particular issue here is that until you start recording the LBR information, you don't have any history, so you can't work out the call stack if you start profiling "too late".

@avsm
Copy link
Member

avsm commented Aug 17, 2020

@lpw25 wrote:

One day I hope that perf will be able to take backtraces based on a static description of the stack -- for example linux has the ORC project to solve basically the same problem -- but it doesn't look like happening any time soon.

My understanding is that ORC is now pretty mainstream in Linux (since v4.14 or so) for kernel unwinding, but that perf does not yet have support for it for userspace binaries. If that's really the last blocker to get rid of pesky frame pointers, this seems like it might be worth pushing over the finish line in perf.

The fp variant is so arch/os specific (Linux/amd64 only) that it feels weird to expose it generically in the Sys module. In your example, why can't the build system set a configuration value in the binary being built to signify that it has been built with frame pointers (similarly to how dune-build-info works)?

@lpw25
Copy link
Contributor Author

lpw25 commented Aug 17, 2020

I've pushed @dra27's version of the patch and added a Changes entry.

If that's really the last blocker to get rid of pesky frame pointers, this seems like it might be worth pushing over the finish line in perf.

My understanding is that using ORC in perf would increase the amount of trust being placed in the ORC unwinder from a security perspective, so I don't think all the remaining hurdles are technical. Even if perf supported ORC we would still need to implement code to output ORC for OCaml programs, either during compilation or afterwards based on the DWARF or frame tables. This is all a significant amount of work, so I'm not holding my breath for it.

In your example, why can't the build system set a configuration value in the binary being built to signify that it has been built with frame pointers (similarly to how dune-build-info works)?

That would be possible, but it is much more work for a result that is less reusable and harder to maintain. To be honest, if I'm going to make something that will only work at Jane Steet anyway, I'll probably just add this patch to the lengthening list of patches on our version of the compiler, since its maintenance is trivial.

@mshinwell mshinwell marked this pull request as ready for review August 17, 2020 12:56
Copy link
Member

@dra27 dra27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems inappropriate for me to mark as "approved" an implementation I have proposed!

I asked about whether Config was a solution to the original problem in case it had been forgotten completely - it sounds as though it wasn't and I agree it doesn't sound like a general solution (cf. linkall, etc.)

There is another possibility: we already install config.cmx and config.cmi - it would be a very small and coherent alternative to install config.o (and, for completeness, config.cmo) which would mean you could specifically link config.cmx, rather than the whole of ocamlcommon.cmxa. Config is - and always should be - a self-contained module.

@lpw25
Copy link
Contributor Author

lpw25 commented Aug 17, 2020

which would mean you could specifically link config.cmx, rather than the whole of ocamlcommon.cmxa

This might be fine, although I think it would need to come with some kind of agreement that changes to its interface would be breaking changes. Maybe not full on stdlib level support, but some notion that this was a public interface and that it won't be changed in bugfix releases or rewritten from scratch on a whim.

@dra27
Copy link
Member

dra27 commented Aug 17, 2020

I don't think that guarantee would be problematic. I had a check through its history... since 4.00, there have only been 3 breaking changes to Config:

  • 4.06.0: bytecomp_c_compiler and native_c_compiler merged into c_compiler (necessary, and broke everything - it affected the output of ocamlc -config as well)
  • 4.08.0: removal of standard_runtime (no longer used) and load_path (which arguably never belonged in Config)
  • 4.09.0: removal of profiling and cc_profile, as gprof was removed

I don't think we'd have suffered by having to mark profiling and cc_profile as deprecated with hard-coded false and "" values - the others would have been acceptable changes since anything using them would have needed considerable updating (as indeed the 4.06 changes did)

@zlandau
Copy link

zlandau commented Aug 17, 2020

My understanding is that ORC is now pretty mainstream in Linux (since v4.14 or so) for kernel unwinding, but that perf does not yet have support for it for userspace binaries. If that's really the last blocker to get rid of pesky frame pointers, this seems like it might be worth pushing over the finish line in perf.

Note that BPF also doesn't support ORC yet either. I suspect it's for the same reason, but ideally both perf and BPF would support it before frame pointer support is removed.

@lpw25
Copy link
Contributor Author

lpw25 commented Aug 20, 2020

@xavierleroy I think I've addressed the concerns you raised in my earlier comment. Are you satisfied that frame_pointers can reasonably be exposed in Sys? Or would you prefer that we do something more like David's suggestion and instead expose the Config module outside of compiler-libs?

@xavierleroy
Copy link
Contributor

Are you satisfied that frame_pointers can reasonably be exposed in Sys?

I'm neutral. It's not a big problem to add this to Sys, but the need still feels very specific to one use at Jane Street.

@Octachron
Copy link
Member

I have the impression that making config a small library would make more sense: few programs needs that level of introspection on how they were compiled. Stabilizing the config.ml API by making it an append-only information sounds lightweight enough.

@lpw25
Copy link
Contributor Author

lpw25 commented Sep 10, 2020

I'm not entirely sure what making config into a small library will entail. Any chance one of the proponents of this approach could make a PR? I'd quite like to have a solution to this.

@damiendoligez
Copy link
Member

What about adding Config to the standard library? It's not a lot of code, and then the backward-compatibility requirements will be obvious.

@xavierleroy
Copy link
Contributor

What about adding Config to the standard library?

I worry about the bootstrap story. So, we will have two possibly different Config, one in boot/ and one in stdlib/. Which one is used for which purposes?

@damiendoligez
Copy link
Member

There can be only one version of config.ml. It is generated at configure time and doesn't change afterwards. It may get compiled once to build the compiler and once to build the stdlib but the contents have to match.

@christoph-cullmann
Copy link
Contributor

Is the "frame pointers" variant of ocamlopt still relevant?

Unfortunately the performance of "perf" using DWARF call-frame information is terrible. Since reading the DWARF live is too slow it basically memcpys the entire stack into the profile and decodes it later. This leads to extremely large profiles and needing to use much lower sampling rates. As upsetting as it is to dynamically calculate statically known data, compiling with frame pointers remains the best way to profile OCaml programs. One day I hope that perf will be able to take backtraces based on a static description of the stack -- for example linux has the ORC project to solve basically the same problem -- but it doesn't look like happening any time soon.

Just to confirm this statement: Yes, as stated above, the FP variant is the only way to be able to profile long running stuff with perf in a reasonable way (without massive overhead).

We at AbsInt use this feature extensively for large analyses that runs for days, for the OCaml and the C++ parts, for both only the FP variant is viable. All other provided ways to reconstruct the call-stack either are too slow (DWARF) or don't work reliable for deep stacks (lbr).

I think my only real contribution to this repository was one patch to fix some FP related issue that hindered our use of it ;=)

@nojb
Copy link
Contributor

nojb commented Oct 25, 2020

FWIW I also think that adding frame_pointers to Sys sounds iffy. What about doing what @damiendoligez suggests, namely adding Config to the stdlib? That is, if @xavierleroy is satisfied that this would not complicate bootstrapping.

@xavierleroy
Copy link
Contributor

That is, if @xavierleroy is satisfied that this would not complicate bootstrapping.

I still don't understand the bootstrapping implications. There will be one copy of the Config unit in boot/ , used to build ocamlc, and one copy of Config in stdlib/ , used to build ocamlopt, and both copies will differ when entries are added or removed to Config, or change meaning. What will happen then?

@lthls
Copy link
Contributor

lthls commented Oct 26, 2020

I'm not entirely sure what making config into a small library will entail. Any chance one of the proponents of this approach could make a PR? I'd quite like to have a solution to this.

I've opened #9992 which shows one way it could be done.

@gasche
Copy link
Member

gasche commented May 12, 2021

What is the status on this PR? My impression is that there is no consensus on whether we want it.

@lpw25
Copy link
Contributor Author

lpw25 commented May 12, 2021

I still want a solution to this, and remain of the opinion that Sys is the most logical place to find the information. I would also be fine with other solutions, e.g. making Config available in the standard library, if people can resolve the issues with doing that.

@dra27
Copy link
Member

dra27 commented May 12, 2021

At the risk of appearing to go back on my original comment about having caml_sys_frame_pointers, can this be solved quite simply by having Config.frame_pointers as already in this PR and having a lambda primitive %frame_pointers (cf. %big_endian, etc.) and so forth? The point being you use a C primitive (e.g. caml_sys_get_config) to extract things from the runtime and primitives in lambda to extract things from the compiler (which is what Config.frame_pointers is). A side-effect of having it as a lambda primitive is that there'd then be no need to have it in Sys - you can just declare the external in whatever program wants to use it?

@dra27
Copy link
Member

dra27 commented May 12, 2021

(TL;DR instead of "Add Sys.frame_pointers" do "Add %frame_pointers")

@damiendoligez
Copy link
Member

(TL;DR instead of "Add Sys.frame_pointers" do "Add %frame_pointers")

This looks like a good idea and the most lightweight of all the proposals so far. It also goes well with the idea that this is an obscure feature that very few people will need.

@dra27 dra27 mentioned this pull request May 19, 2021
@dra27
Copy link
Member

dra27 commented May 20, 2021

#10419's merged, so I think this can be closed.

@dra27 dra27 closed this May 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet