Need to be able to reliably get symbol addrs #520

Open
jswrenn opened this issue Apr 7, 2023 · 16 comments
@jswrenn
Member

jswrenn commented Apr 7, 2023

The documentation for Frame::symbol_address warns:

This will attempt to rewind the instruction pointer returned by ip to the start of the function, returning that value. In some cases, however, backends will just return ip from this function.

Consequently, the following code 'works' on x86_64-unknown-linux-gnu, but not on aarch64-apple-darwin:

use backtrace;
use std::{hint::black_box, ptr, ffi::c_void};

fn main() {
    black_box(function());
}

#[inline(never)]
fn function() {
    let function = function as *const c_void;
    println!("searching for symbol_address={:?}", function);

    backtrace::trace(|frame| {
        println!("unwound to {:?}", frame);
        if ptr::eq(frame.symbol_address(), function) {
            println!("found it!"); // not reached on aarch64-apple-darwin :(
            return false;
        }
        true
    });
}

Is this expected behavior on this platform? If so, is there any way to work around this discrepancy?

In the scoped-trace crate, I use symbol address equality to capture backtraces with limited upper and lower unwinding bounds. I'm hoping to get this crate working on aarch64-apple-darwin.

@bjorn3
Member

bjorn3 commented Apr 7, 2023

On macOS the function to get the address of the enclosing function of an ip address (_Unwind_FindEnclosingFunction) is unreliable due to compact unwind info collapsing multiple functions with identical unwind info together:

// The macOS linker emits a "compact" unwind table that only includes an
If the executable is not stripped, you can try parsing the executable itself using e.g. the object crate and finding the last symbol before the ip address.
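
The "last symbol before the ip address" lookup can be sketched with the standard library alone. This is only an illustration, not backtrace-rs or object crate code: `SymbolTable` and `enclosing` are hypothetical names, the entries are assumed to have been read from the executable already, and the addresses are assumed to be adjusted for the image's load address so they are comparable to a runtime ip.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory symbol table: function start address -> name.
/// In practice the entries would come from parsing the executable.
struct SymbolTable {
    by_addr: BTreeMap<usize, String>,
}

impl SymbolTable {
    /// Builds the table from (start address, name) pairs in any order.
    fn new(symbols: &[(usize, &str)]) -> Self {
        let by_addr = symbols
            .iter()
            .map(|&(addr, name)| (addr, name.to_string()))
            .collect();
        SymbolTable { by_addr }
    }

    /// Returns the start address and name of the last symbol at or before `ip`,
    /// or `None` if `ip` lies below every known symbol.
    fn enclosing(&self, ip: usize) -> Option<(usize, &str)> {
        self.by_addr
            .range(..=ip)
            .next_back()
            .map(|(&addr, name)| (addr, name.as_str()))
    }
}
```

Because this sketch ignores symbol sizes, an ip past the end of the last function is still attributed to that function; a real implementation would probably want to check the symbol's extent as well.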

@jswrenn
Member Author

jswrenn commented Apr 13, 2023

If the executable is not stripped, you can try parsing the executable itself using e.g. the object crate and finding the last symbol before the ip address.

That's not too bad! Would backtrace-rs accept a PR implementing this?

@philipc
Contributor

philipc commented Apr 14, 2023

The symbolization already does that. I wonder why Symbol::addr doesn't return the address for symtab entries.

@jswrenn
Member Author

jswrenn commented Apr 14, 2023

If I had to guess, it's because it looks like backtrace-rs currently uses information from DWARF xor symtab entries — not both. In a situation where DWARF debuginfo was completely unavailable, Frame::symbol_address might behave as expected.

@philipc
Contributor

philipc commented Apr 15, 2023

backtrace-rs falls back to symtab entries if it can't find a DWARF entry. But both of those are only used in the symbolizer. Frame::symbol_address only uses the unwinder. It doesn't and shouldn't use DWARF or symbol table entries. You need to resolve the frame if you want to use those.

@workingjubilee
Contributor

It seems like everything is working as intended, then? Shall we close this?

@jswrenn
Member Author

jswrenn commented Jun 22, 2023

@workingjubilee Maaaybe? The comment here:

// The macOS linker emits a "compact" unwind table that only includes an
// entry for a function if that function either has an LSDA or its
// encoding differs from that of the previous entry. Consequently, on
// macOS, `_Unwind_FindEnclosingFunction` is unreliable (it can return a
// pointer to some totally unrelated function). Instead, we just always
// return the ip.
//
// https://github.com/rust-lang/rust/issues/74771#issuecomment-664056788
//
// Note the `skip_inner_frames.rs` test is skipped on macOS due to this
// clause, and if this is fixed that test in theory can be run on macOS!

...uses the phrase "if this is fixed" — which suggests that something unwelcome (albeit not unknown) is happening here.

Could we document this shortcoming? Or even make it explicit in the API by making ip an Option? Could we even eliminate this shortcoming? E.g.:

  • can compact unwinding be disabled?
  • can the DWARF unwinding tables be used instead?

I would almost rather backtrace-rs used the unreliable output of _Unwind_FindEnclosingFunction; then at least symbol_address would produce sometimes-useful results on macOS, rather than always-useless (i.e., no more useful than ip) results.

@bjorn3
Member

bjorn3 commented Jun 22, 2023

Compact unwinding can be disabled when linking a binary or library, but when compact unwinding was enabled when linking (as is done for all system libraries and by default for user code), there are no DWARF unwinding tables remaining.

I almost would rather if backtrace-rs used the unreliable output _Unwind_FindEnclosingFunction — then at least ip would produce sometimes useful results on macOS, rather than always-useless (i.e., not more useful than sp) results.

I did expect the current output to be useful for looking up in the symbol table which should always give the correct result if the symbol table exists at all. The result of _Unwind_FindEnclosingFunction may result in the wrong function without any option to get the correct result using the symbol table.

@jswrenn
Member Author

jswrenn commented Jun 22, 2023

(Whoops, edited my last comment because I got my function names mixed up.)

I did expect the current output to be useful for looking up in the symbol table which should always give the correct result if the symbol table exists at all.

Am I right to think that you could instead use ip in this case?

Alternatively, could backtrace-rs do that look-up into the symbol table?

@bjorn3
Member

bjorn3 commented Jun 22, 2023

Am I right to think that you could instead use ip in this case?

Right, you could.

Alternatively, could backtrace-rs do that look-up into the symbol table?

I think that would make sense.

@workingjubilee
Contributor

The result of _Unwind_FindEnclosingFunction may result in the wrong function without any option to get the correct result using the symbol table.

Yeah, I think that kills the idea of using that on macOS dead. Any guess that might be wrong seems like it kinda breaks with what symbol_address says it does: it says it rewinds to the start of the function (implicit: correctly) or stays equal to ip, allowing you to detect which happens. It's better to simply return a value equal to ip if we're not going to produce a guaranteed-useful answer.

Regarding doing the table lookup implicitly, I don't think it should be completely off the (heh) table, but I'm slightly concerned about, and would like to hear an elaboration on, @philipc's perspective, namely:

It doesn't and shouldn't use DWARF or symbol table entries. You need to resolve the frame if you want to use those.

I can guess why this was said, but it's likely there's a nuance that hasn't been stated explicitly and that might be missing from the conversation so far.

@philipc
Contributor

philipc commented Jun 23, 2023

I don't see any technical reason why the unwinder couldn't use the symbol table, but from a design perspective, this is something that the resolver is intended to do and already has code for, so I don't think it should be duplicated in the unwinder. I haven't seen a reason why the resolver can't be used in this case, but I haven't looked into the motivating use case (scoped-trace) at all.

@philipc
Contributor

philipc commented Jun 23, 2023

While I think the resolver should be used for this purpose, I don't think it works correctly currently. Symbol::addr is documented to return the starting address of the function, and appears to do this for dbghelp, but it returns the unrelocated IP minus one for DWARF, and None for symbol tables.

@workingjubilee
Contributor

workingjubilee commented Jun 23, 2023

Returning None seems okay, at least, in the sense that it's useless but not wrong. But the DWARF response seems simply incorrect.

@workingjubilee changed the title from "Platform-dependent behavior of Frame::symbol_address." to "Need to be able to reliably get symbol addrs" on Jun 23, 2023
@workingjubilee
Contributor

workingjubilee commented Jun 23, 2023

This issue is no longer about Frame::symbol_address, which should probably remain untouched. Rather, it is about having a function that answers the desired use-case at all, is correct across platforms, and tries its alternatives until it succeeds or fails.
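
One possible shape for such a function, sketched with the standard library only: it takes an ordered list of lookup strategies and returns the first answer any of them produces. The names `Strategy` and `function_start` are hypothetical and not part of backtrace-rs; each strategy stands in for a source such as the unwinder, DWARF, or the symbol table.

```rust
/// A lookup strategy: maps an instruction pointer to the start address of
/// its enclosing function, or `None` if this source cannot answer.
type Strategy<'a> = &'a dyn Fn(usize) -> Option<usize>;

/// Tries each strategy in order and returns the first answer.
/// Returns `None` only if every strategy fails.
fn function_start(ip: usize, strategies: &[Strategy<'_>]) -> Option<usize> {
    strategies.iter().find_map(|lookup| lookup(ip))
}
```

A caller would pass the strategies in order of preference, for example `function_start(ip, &[&from_unwinder, &from_symtab])`, where both closures are placeholders for real lookups. Returning `Option` keeps "no reliable answer" distinct from a guess.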

@jswrenn
Member Author

jswrenn commented Jun 23, 2023

Yes, that sounds great. Again, for context: In the scoped-trace crate, I use symbol address equality to capture backtraces with limited upper and lower unwinding bounds. So I'd like to be able to call this function without doing full symbol resolution, or in situations where only symbol tables are available and not full DWARF debuginfo.
