Expose Code Section Offset #190

mitsuhiko · 2020-11-24T14:32:10Z

Motivation

Currently there is no way to learn the offset of the code section within the WASM file. This is necessary to make DWARF work, because offsets in DWARF files are relative to the code section, whereas backtraces from browsers provide the absolute offset within the WASM file. Converting between the two requires the offset of the code section.

Currently the only way to get this information appears to be parsing the file a second time with wasmparser:

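// Stays 0 if the module has no code section.
let mut code_offset = 0;
for payload in wasmparser::Parser::new(0).parse_all(wasm) {
    // Only the start of the code section matters here; skip everything else.
    if let Ok(wasmparser::Payload::CodeSectionStart { range, .. }) = payload {
        code_offset = range.start as u64;
        break;
    }
}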
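
Once `code_offset` is known, the conversion described under Motivation is a single subtraction. The following is a minimal sketch of that arithmetic only; the helper names `to_dwarf_offset` and `to_file_offset` are made up for illustration and are not part of walrus or wasmparser.

```rust
/// Converts an absolute offset within the WASM file (as reported in a
/// browser backtrace) into an offset relative to the code section (as
/// used by DWARF). Returns `None` if the offset lies before the code
/// section, in which case it cannot refer to code.
/// Hypothetical helper, not an existing API.
fn to_dwarf_offset(absolute_offset: u64, code_offset: u64) -> Option<u64> {
    absolute_offset.checked_sub(code_offset)
}

/// The inverse direction: turns a code-section-relative offset back into
/// an absolute file offset. Returns `None` on overflow.
/// Hypothetical helper, not an existing API.
fn to_file_offset(dwarf_offset: u64, code_offset: u64) -> Option<u64> {
    dwarf_offset.checked_add(code_offset)
}
```

Using checked arithmetic here means a frame that points outside the code section yields `None` rather than a wrapped-around address.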

Proposed Solution

Keep track of the CodeSectionStart's range start in the module.

Alternatives

Parse again as we're doing currently.

Additional Context

Some notes on why we need this can be found here: getsentry/symbolic#290

alexcrichton · 2020-11-25T18:22:00Z

This crate in general is intended for wasm transformations, but would y'all's use case fall more so into simply parsing? For something like that, building on wasmparser is probably a better route than building on walrus.

mitsuhiko · 2020-11-29T10:50:46Z

I first started using wasmparser directly, but I ended up reimplementing a ton of walrus in the process. I do agree, though, that the way walrus works makes it unclear whether the code offset carries any meaning. For instance, to solve #67 the code offset does not play a role.

I think it's fine to close this issue. That said, walrus is currently one of the most convenient crates for working with WASM because it provides an object representing the module. The only other similar crate in the ecosystem that I'm aware of is parity-wasm, which throws away even more information, making it hard to use for working with debug data.

alexcrichton · 2020-11-30T15:37:06Z

Ah yeah unfortunately there aren't a ton of crates for representing "here's an API to the wasm module you just parsed". That sort of use though tends to be pretty context-specific and it's difficult to make "one API to rule them all" really. The wasmparser crate is intended to at least be the shared support for parsing the wasm file, but as you've found it's very low level. Walrus is more geared towards wasm-to-wasm transformations (e.g. wasm-bindgen) and isn't intended so much for inspection of a wasm file. For example we don't have the original code offset for all functions because functions can also be created on the fly.

That's not to say though that we couldn't add all this information to walrus. If it's the case that walrus is the best fit in the ecosystem for now (even if it's not the best fit for your use case theoretically), it seems fine to at least provide the information!

I'd personally be totally fine with a PR to add "source information" to things like functions which would be optional (for created functions), but present on all decoded functions.
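
To make the shape of that suggestion concrete, here is a rough sketch of optional per-function source information. Everything in it is an assumption for illustration: `SourceInfo`, `FunctionEntry`, and their methods are hypothetical names and do not reflect walrus's actual types.

```rust
/// Hypothetical: where a decoded function body was located in the input file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SourceInfo {
    /// Absolute offset of the function body within the original WASM file.
    file_offset: u64,
}

/// Hypothetical stand-in for a function in a module representation.
#[derive(Debug)]
struct FunctionEntry {
    name: String,
    /// `Some` for functions decoded from a file, `None` for functions
    /// that were created on the fly and have no original location.
    source: Option<SourceInfo>,
}

impl FunctionEntry {
    /// A function that was read from an existing WASM file.
    fn decoded(name: &str, file_offset: u64) -> Self {
        FunctionEntry {
            name: name.to_string(),
            source: Some(SourceInfo { file_offset }),
        }
    }

    /// A function synthesized during a transformation.
    fn created(name: &str) -> Self {
        FunctionEntry {
            name: name.to_string(),
            source: None,
        }
    }

    /// Offset of the function relative to the code section, if the
    /// function has source information and starts at or after
    /// `code_offset`.
    fn code_relative_offset(&self, code_offset: u64) -> Option<u64> {
        self.source
            .and_then(|s| s.file_offset.checked_sub(code_offset))
    }
}
```

Modeling the source information as an `Option` keeps created functions valid while still letting consumers such as debug-info tooling query decoded ones.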

mitsuhiko added the enhancement label on Nov 24, 2020
