"unknown location" in backtraces in ocaml script #9701

Closed
serpent7776 opened this issue Jun 23, 2020 · 15 comments · Fixed by #10803
@serpent7776
Contributor

I'm trying to get backtraces in simple test code:

let f () = failwith "test"
let proc () = f ()
let () = proc ()

Running env OCAMLRUNPARAM=b ocaml test.ml:

ocaml --version
The OCaml toplevel, version 4.10.0
Exception: Failure "test".
Raised at file "stdlib.ml", line 29, characters 22-33
Called from unknown location
Called from file "toplevel/toploop.ml", line 212, characters 17-27

Notice the "unknown location".

An older version, however, reported the location in the source file:

ocaml --version
The OCaml toplevel, version 4.05.0
Exception: Failure "test".
Raised at file "pervasives.ml", line 32, characters 22-33
Called from file "./test.ml", line 3, characters 9-16
Called from file "toplevel/toploop.ml", line 180, characters 17-56

discuss thread for reference: https://discuss.ocaml.org/t/backtraces-in-script-run-by-ocaml/5970

@nojb
Contributor

nojb commented Jun 23, 2020

The culprit is 3efba04 @trefis

@nojb nojb added the bug label Jun 23, 2020
@gasche
Member

gasche commented Jun 23, 2020

I will add a regression testcase in the testsuite so that we track ocaml-script behavior with respect to backtraces.

gasche added a commit that referenced this issue Jun 23, 2020
gasche added a commit that referenced this issue Jun 23, 2020
This reverts commit f0fae66.

This new test appears to break something on the CI, possibly when
flambda is used:

>  ... testing 'toplevel_script_backtrace.ml' with 1 (toplevel) =>
> failed (Running toplevel_script_backtrace.ml in bytecode toplevel
> (expected exit status: 2): command
> /home/barsac/ci/builds/workspace/extra-checks/runtime/ocamlrun
> /home/barsac/ci/builds/workspace/extra-checks/ocaml -noinit -no-version -noprompt -nostdlib -I
> /home/barsac/ci/builds/workspace/extra-checks/stdlib -I
> /home/barsac/ci/builds/workspace/extra-checks/toplevel
> toplevel_script_backtrace.ml failed with exit code 1)
gasche added a commit to gasche/ocaml that referenced this issue Jun 23, 2020
(no change entry required)
@gasche
Member

gasche commented Sep 2, 2020

Pinging prospective bug-fixers: this is an annoying regression that was introduced somewhat recently.

@trefis
Contributor

trefis commented Sep 2, 2020

> The culprit is 3efba04 @trefis

If that's the case then I believe the following patch should be enough to fix the issue:

diff --git a/toplevel/toploop.ml b/toplevel/toploop.ml
index 5e5fc436d..8de0347d6 100644
--- a/toplevel/toploop.ml
+++ b/toplevel/toploop.ml
@@ -639,6 +639,7 @@ let run_script ppf name args =
       Location.report_exception ppf exn; exit 2
   end;
   Sys.interactive := false;
+  Clflags.debug := true;
   run_hooks After_setup;
   let explicit_name =
     (* Prevent use_silently from searching in the path. *)

However, it doesn't appear to work for me, and neither does reverting the commit you point to.

So perhaps I'm doing something wrong when trying to reproduce, or there is another issue. I do think the patch I wrote above is needed though. I'll let someone else try it.

@nojb
Contributor

nojb commented Sep 5, 2020

I double checked, this is what I obtained:

In short: @trefis' suggested fix indeed fixes the problem when applied to the commit that introduced the issue (3efba04), but is no longer enough to fix the issue when applied to trunk.

@aryx

aryx commented Apr 28, 2021

Is this fixed somewhere?
I recently switched from 4.09.1 to 4.11.2 and now get only

Uncaught exception:
  
  (Failure "no language specified; use -lang")

Called from unknown location
Called from unknown location
Called from unknown location
Called from unknown location
Called from unknown location
Called from unknown location
Called from unknown location

where I used to get nice locations.

@aryx

aryx commented Apr 28, 2021

This is in compiled code, not even in a script. I tried OCAMLRUNPARAM=b and Printexc.record_backtrace true in my main.ml, but I still get only "unknown location" entries.
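
For reference, here is a minimal sketch of enabling backtrace recording from inside a program with the standard Printexc module, which is the second option mentioned above. The function f and the printed prefix are placeholders for illustration, not code from the project being discussed. Whether each slot resolves to a source location still depends on debug information being available at run time.

```ocaml
(* Enable backtrace recording from inside the program; this is the
   programmatic counterpart of running with OCAMLRUNPARAM=b. *)
let () = Printexc.record_backtrace true

(* Placeholder function that raises, mirroring the test case in this issue. *)
let f () = failwith "test"

let () =
  try f ()
  with e ->
    (* Read the backtrace right away, before any other exception is raised. *)
    let bt = Printexc.get_backtrace () in
    prerr_endline ("Exception: " ^ Printexc.to_string e);
    prerr_string bt
```

Printexc.backtrace_status () can be used to confirm that recording is actually switched on.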

@aryx

aryx commented Apr 28, 2021

Here is the same backtrace with 4.09.1

Uncaught exception:
  
  (Failure "no language specified; use -lang")

Raised at file "stdlib.ml", line 29, characters 22-33
Called from file "src/cli/Main.ml", line 1479, characters 25-45
Called from file "src/pfff/commons/Common.ml", line 97, characters 14-18
Re-raised at file "src/pfff/commons/Common.ml", line 102, characters 10-11
Called from file "src/pfff/commons/Common.ml", line 1375, characters 12-16
Re-raised at file "src/pfff/commons/Common.ml", line 1373, characters 8-301
Called from file "src/pfff/commons/Common.ml", line 1343, characters 45-49
Called from file "src/pfff/commons/Common.ml", line 97, characters 14-18
Re-raised at file "src/pfff/commons/Common.ml", line 102, characters 10-11
Called from file "src/pfff/commons/Common.ml", line 1340, characters 6-10
Re-raised at file "src/pfff/commons/Common.ml", line 1314, characters 43-54
Called from file "src/cli/Main.ml", line 1500, characters 2-145

@gasche
Member

gasche commented Apr 28, 2021

@aryx if the ocaml toplevel is not involved in your case, then it is a different issue. As far as I know, backtraces in general do work okay these days. Could you provide a reproduction case? What precise steps do I need to follow to reproduce your issue on my machine?

@aryx

aryx commented Jul 15, 2021

OK, I finally figured out what was going on. I was compiling my project with (modes byte) in the dune file, which produces a Main.bc and a Main.exe, but the Main.exe is actually a bytecode program with ocamlrun included in it, and in that case I get all those "Called from unknown location" lines.
Switching back to a real Main.exe by using (modes byte exe) solved the issue.

@aryx

aryx commented Jul 15, 2021

Also, I just discovered that I had been shipping the bytecode version of my programs all this time. I got fooled by the presence of both a Main.bc and a Main.exe; I thought the .exe was the native version.

@renatoalencar
Contributor

I've been tracking this down since yesterday. Although it is true that debug was not enabled when running script files, what is actually happening is that the bytecode for the script is released before the backtrace is recorded. As a result, "unknown location" is shown instead of the actual location of the backtrace slot.

https://github.com/ocaml/ocaml/blob/trunk/toplevel/byte/topeval.ml#L87-L99

  match
    may_trace := true;
    Fun.protect
      ~finally:(fun () -> may_trace := false;
                          if can_free then Meta.release_bytecode bytecode) (* Debugging info gets released together with bc *)
      closure
  with
  | retval -> Result retval
  | exception x ->
    record_backtrace (); (* Being called after debugging information is released *)
    toplevel_value_bindings := initial_bindings; (* PR#6211 *)
    Symtable.restore_state initial_symtable;
    Exception x

Releasing bytecode removes debug information from the caml_debug_info table for that particular code segment, and then record_backtrace calls Printexc.get_backtrace -> Printexc.convert_raw_backtrace -> caml_convert_raw_backtrace -> caml_convert_debuginfo -> caml_debuginfo_location -> event_for_location -> find_debug_info, which doesn't find debug information for the script bytecode that was released.

https://github.com/ocaml/ocaml/blob/trunk/runtime/meta.c#L148

CAMLprim value caml_static_release_bytecode(value bc)
{
  code_t prog;
  struct code_fragment *cf;

  prog = Bytecode_val(bc)->prog;
  caml_remove_debug_info(prog);
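
The ordering problem described above can be illustrated with a small self-contained model. This is only a sketch: the table debug_info and the functions describe, release, register, run_release_first and run_record_first are hypothetical stand-ins invented for this example, not the toplevel or runtime API.

```ocaml
(* Toy model of the ordering problem; every name here is hypothetical. *)

(* Stand-in for the runtime's debug-info table, keyed by code fragment. *)
let debug_info : (string, string) Hashtbl.t = Hashtbl.create 4

let register fragment =
  Hashtbl.replace debug_info fragment "file \"./test.ml\", line 3"

(* Stand-in for resolving a backtrace slot to a source location. *)
let describe fragment =
  match Hashtbl.find_opt debug_info fragment with
  | Some loc -> "Called from " ^ loc
  | None -> "Called from unknown location"

(* Stand-in for releasing the bytecode: the debug info goes with it. *)
let release fragment = Hashtbl.remove debug_info fragment

(* Problematic order: [finally] releases the fragment before the
   exception handler resolves the location. *)
let run_release_first fragment closure =
  match Fun.protect ~finally:(fun () -> release fragment) closure with
  | () -> "ok"
  | exception _ -> describe fragment

(* Corrected order: resolve the location first, then release. *)
let run_record_first fragment closure =
  match closure () with
  | () -> release fragment; "ok"
  | exception _ ->
    let line = describe fragment in
    release fragment;
    line

let () =
  register "script";
  (* prints: Called from unknown location *)
  print_endline (run_release_first "script" (fun () -> failwith "test"));
  register "script";
  (* prints: Called from file "./test.ml", line 3 *)
  print_endline (run_record_first "script" (fun () -> failwith "test"))
```

In both variants the fragment is released exactly once; only the position of the release relative to the location lookup differs.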

With the following patch the source location is reported again, although the stack trace still shows the function name as <unknown>.

diff --git a/toplevel/byte/topeval.ml b/toplevel/byte/topeval.ml
index 2132b4c60..796eaa9e1 100644
--- a/toplevel/byte/topeval.ml
+++ b/toplevel/byte/topeval.ml
@@ -87,13 +87,16 @@ let load_lambda ppf lam =
   match
     may_trace := true;
     Fun.protect
-      ~finally:(fun () -> may_trace := false;
-                          if can_free then Meta.release_bytecode bytecode)
+      ~finally:(fun () -> may_trace := false)
       closure
   with
-  | retval -> Result retval
+  | retval ->
+    if can_free then Meta.release_bytecode bytecode;
+    Result retval
   | exception x ->
     record_backtrace ();
+    if can_free then Meta.release_bytecode bytecode;
+
     toplevel_value_bindings := initial_bindings; (* PR#6211 *)
     Symtable.restore_state initial_symtable;
     Exception x
diff --git a/toplevel/toploop.ml b/toplevel/toploop.ml
index 62a5b0023..87f97c7cd 100644
--- a/toplevel/toploop.ml
+++ b/toplevel/toploop.ml
@@ -109,6 +109,7 @@ let load_file = load_file false
 (* Execute a script.  If [name] is "", read the script from stdin. *)
 
 let run_script ppf name args =
+  Clflags.debug := true;
   override_sys_argv args;
   let filename = filename_of_input name in
   Compmisc.init_path ~dir:(Filename.dirname filename) ();

Result for the test case reproduced above

Exception: Failure "test".
Raised at Stdlib.failwith in file "stdlib.ml", line 29, characters 17-33
Called from <unknown> in file "./example2.ml", line 3, characters 9-16
Called from Stdlib__Fun.protect in file "fun.ml", line 33, characters 8-15

I've pushed that here as a proof of concept renatoalencar@a096eed.

@gasche
Member

gasche commented Dec 1, 2021

@renatoalencar your analysis seems reasonable to me; it looks like the bug you describe is fairly recent, introduced by #9855 which was merged in 4.12. Would you like to propose a Pull Request with your fix?

Three remarks:

  • I'm not sure why the function is reported as <unknown> (cc @stedolan?)
  • There is a similar-looking code pattern in toplevel/native/tophooks.ml (in trunk), with Sys.remove_dll instead of release_meta, that may also be worth changing.
  • I think that rewriting this code without using Fun.protect may in fact be more systematic and readable.

@renatoalencar
Contributor

Definitely. As for Sys.remove_dll: OCAMLRUNPARAM=b doesn't produce a stack trace for ocamlnat, and ocamlnat doesn't have a -g flag. How do I get a backtrace with ocamlnat?

@gasche
Member

gasche commented Dec 1, 2021

I would not try to test ocamlnat for this (it's in flux and not installed to users anyway), but I would still change the code to ensure that the dll is deleted only after the backtrace is recorded. (Not sure if it makes a difference, but it seems safer to avoid the "dangerous" pattern consistently in the codebase.)
