Expose Code Section Offset #190

Open
mitsuhiko opened this issue Nov 24, 2020 · 3 comments
Labels
enhancement New feature or request

Comments

@mitsuhiko

mitsuhiko commented Nov 24, 2020

Motivation

Currently there is no way to get the file offset of the code section within the WASM file. This is necessary to make DWARF work, however, because offsets in DWARF data are relative to the code section, whereas backtraces from browsers provide absolute offsets within the WASM file. To convert between the two, the offset of the code section is required.

Currently the only way to get this information appears to be to parse the file a second time with wasmparser:

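// Walk the payloads until the code section header is seen.
let mut code_offset = 0;
for payload in wasmparser::Parser::new(0).parse_all(wasm) {
    if let Ok(wasmparser::Payload::CodeSectionStart { range, .. }) = payload {
        // `range.start` is the offset of the code section within the file.
        code_offset = range.start as u64;
        break;
    }
}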
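
Once the code section offset is known, the conversion described in the motivation is a single subtraction. The following is only a sketch: the function names `to_dwarf_offset` and `to_file_offset` are hypothetical and exist in neither walrus nor wasmparser, and the offsets in the usage note are made-up values.

```rust
/// Converts an absolute file offset (as reported in a browser backtrace)
/// into an offset relative to the code section (as used by DWARF).
/// Returns `None` if the address lies before the code section.
fn to_dwarf_offset(absolute: u64, code_offset: u64) -> Option<u64> {
    absolute.checked_sub(code_offset)
}

/// The inverse: converts a code-section-relative offset back into an
/// absolute file offset. Returns `None` on arithmetic overflow.
fn to_file_offset(relative: u64, code_offset: u64) -> Option<u64> {
    relative.checked_add(code_offset)
}
```

For instance, with a code section starting at `0x40`, a backtrace address of `0x1a3` would map to `to_dwarf_offset(0x1a3, 0x40) == Some(0x163)`.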

Proposed Solution

Keep track of the CodeSectionStart's range start in the module.
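
To illustrate what "keeping track" could look like, here is a rough, dependency-free sketch that records the offset during a single pass over the section headers. It is not walrus code: `ModuleInfo`, `scan_module` and `read_leb_u32` are hypothetical names, and it assumes the offset of interest is where the code section's contents begin (after the section id and size).

```rust
/// Hypothetical module summary; this is not an existing walrus type.
#[derive(Debug, Default, PartialEq)]
struct ModuleInfo {
    /// File offset of the code section's contents, if the module has one.
    code_section_start: Option<u64>,
}

/// Section id of the code section in the wasm binary format.
const CODE_SECTION_ID: u8 = 10;

/// Reads an unsigned LEB128 `u32` at `pos`.
/// Returns the value and the number of bytes consumed.
fn read_leb_u32(bytes: &[u8], pos: usize) -> Option<(u32, usize)> {
    let mut result: u32 = 0;
    for i in 0..5 {
        let byte = *bytes.get(pos + i)?;
        result |= u32::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((result, i + 1));
        }
    }
    None
}

/// Walks the section headers once and remembers where the code section starts.
fn scan_module(wasm: &[u8]) -> Option<ModuleInfo> {
    // A module starts with the 4-byte magic "\0asm" and a 4-byte version.
    if wasm.len() < 8 || !wasm.starts_with(b"\0asm") {
        return None;
    }
    let mut info = ModuleInfo::default();
    let mut pos = 8;
    while pos < wasm.len() {
        let id = wasm[pos];
        let (size, len_bytes) = read_leb_u32(wasm, pos + 1)?;
        let contents_start = pos + 1 + len_bytes;
        if id == CODE_SECTION_ID {
            info.code_section_start = Some(contents_start as u64);
        }
        // Skip over the section contents to the next section header.
        pos = contents_start.checked_add(size as usize)?;
    }
    Some(info)
}
```

The field is an `Option` because a module may have no code section at all.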

Alternatives

Parse again as we're doing currently.

Additional Context

Some notes on why we need this can be found here: getsentry/symbolic#290

mitsuhiko added the enhancement label Nov 24, 2020

@alexcrichton
Collaborator

This crate in general is intended for wasm transformations, but would y'all's use case fall more so into simply parsing? For something like that, building on wasmparser is probably a better route than building on walrus.

@mitsuhiko
Author

I first started using wasmparser directly, but I ended up reimplementing a ton of walrus in the process. I do agree, though, that the way walrus works makes it unclear whether the code offsets carry any meaning. For instance, to solve #67 the code offset does not play a role.

I think it's fine to close this issue. That said, walrus is currently one of the most convenient crates for working with WASM because it provides an object representing the module. The only similar crate in the ecosystem that I'm aware of is parity-wasm, which throws away even more information, making it hard to use for working with debug data.

@alexcrichton
Collaborator

Ah yeah unfortunately there aren't a ton of crates for representing "here's an API to the wasm module you just parsed". That sort of use though tends to be pretty context-specific and it's difficult to make "one API to rule them all" really. The wasmparser crate is intended to at least be the shared support for parsing the wasm file, but as you've found it's very low level. Walrus is more geared towards wasm-to-wasm transformations (e.g. wasm-bindgen) and isn't intended so much for inspection of a wasm file. For example we don't have the original code offset for all functions because functions can also be created on the fly.

That's not to say though that we couldn't add all this information to walrus. If it's the case that walrus is the best fit in the ecosystem for now (even if it's not the best fit for your use case theoretically), it seems fine to at least provide the information!

I'd personally be totally fine with a PR to add "source information" to things like functions which would be optional (for created functions), but present on all decoded functions.
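
As a rough illustration of that idea, optional source information could be modeled with an `Option` that is `None` for functions created on the fly. This is only a sketch: `SourceInfo`, `Function` and all of their members are hypothetical and do not reflect walrus's actual types.

```rust
/// Hypothetical record of where a decoded function came from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SourceInfo {
    /// Absolute offset of the function body within the original wasm file.
    file_offset: u64,
}

/// Hypothetical stand-in for a function in the module representation.
#[derive(Debug)]
struct Function {
    /// `Some` for decoded functions, `None` for functions created on the fly.
    source: Option<SourceInfo>,
}

impl Function {
    /// A function decoded from an existing wasm file.
    fn decoded(file_offset: u64) -> Self {
        Function { source: Some(SourceInfo { file_offset }) }
    }

    /// A function synthesized during a transformation; it has no origin.
    fn created() -> Self {
        Function { source: None }
    }

    /// Offset relative to the code section, if the function has an origin
    /// and that origin lies at or after the start of the code section.
    fn code_relative_offset(&self, code_section_start: u64) -> Option<u64> {
        self.source?.file_offset.checked_sub(code_section_start)
    }
}
```

A consumer interested in debug data would then have to handle the `None` case explicitly, which matches the point that created functions have no original offset.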
