Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix GC for references and hold parameter references over async await #3743

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

pando-fredrik
Copy link

The PR ( #3562 ) removed the wrapper generation which was also responsible for registering to the JS FinalizationRegistry to garbage collect to run the clean up code when a GC occurred.

This PR reenables the registration to allow reference types to properly garbage collect, however there was also an issue where objects was garbage collected (wasm-bindgen 0.2.87 and below) where an WASM object if passed as a parameter to an asynchronous method was garbage collected while it was awaited, 0.2.88/0.2.89 does not exhibit this problem as the garbage collection was broken (and thus leaks memory instead), this PR introduces a small runtime wrapper for asynchronous calls that makes sure the variables stay in scope during the function call.

The code generation is hard to follow so I'm not confident this is the preferred solution and are open to changes and pointers to improve the patch.

@pando-fredrik
Copy link
Author

Unfortunately I'm not even able to start the test suite that fails as it crashes on my computers (even on the main branch). Pointers would be welcome.

@daxpedda daxpedda added the needs review Needs a review by a maintainer. label Dec 14, 2023
@pando-fredrik
Copy link
Author

Update, I eventually found I missed to set WASM_BINDGEN_SPLIT_LINKED_MODULES before starting the tests.

And I also noticed that mut refs break with this patch (which the tests correctly identified, although with a rather vague error).

The reason is identified (but not yet solved), the issue seems to be that in some circumstances (I believe when mut refs are used an accessor method is injected to return the actual underlying object which breaks with this patch in its current shape or form)

I still hope someone can chime in on a way better solution to this (perhaps the PR should be split too as it tries to solve two problems, the finalization registration alone seems to work fine; but leaves our original problem, which was the sole reason
I started looking into this and noticed the broken GC- as we were using wasm-bindgen 0.2.87 and after upgrade to 0.2.89 the GC problem was gone, which we found was because objects weren't simply wasnt GCed)

Comment on lines +1313 to +1340
if asyncness && !args.is_empty() && cx.config.weak_refs {
let transformed_args = self.transform_this_to_that_args(args);
let roots_objects = self.find_root_objects(&transformed_args);

let unique_roots: Vec<String> = roots_objects
.into_iter()
.collect::<HashSet<String>>()
.into_iter()
.collect();
prelude.push_str("let that = this;\n");
let wrapper = format!(
r#"
(function () {{
return new Promise((resolve, reject) => {{
let uniqueRoots = [{}];
if(uniqueRoots.length) {{
that.__paramRefs = that.__paramRefs || {{}};
that.__paramRefs[that.__wbg_ptr] = unique_roots;
}}
wasm.{}({}).then(resolve).catch(reject).finally(() => {{
delete that.__paramRefs[that.__wbg_ptr];
}});
}})}})()"#,
unique_roots.join(","),
name,
transformed_args.join(", ")
);
Ok(wrapper)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't the right place to put this, because it's too low-level: all of the arguments have already been converted to numbers so that they can be passed to Wasm, and the return type of wasm.{} is a number as well. This means that:

  • Storing the arguments doesn't accomplish anything since it's just a list of numbers.
  • Calling .then(), etc. won't work since it isn't a promise yet, just a number.

The proper way to do this would be to add a new Descriptor for futures, since they're no longer treated the same as regular JsValues and should be a separate type. Then the JS that creates promises from raw numbers would be the same as the JS that creates JsValues from raw numbers, except with this tacked on the end.

A different way to accomplish this would be to add reference counting inside Rust. Then, when JavaScript's handle to the Rust struct gets GC'd, it doesn't actually free the struct yet if Rust is still using it; it would just decrement the reference count from 2 to 1. Then when Rust is done with it, it'd go from 1 to 0 and the struct would actually be freed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you're probably right (and especially in the mut ref case), I fail to see what you're talking about regarding arguments being numbers (when compiled with external refs)

For example

The unmodified generated glue code looks like this; (params is the object that gets garbage collected)

  request(params) {
      if (this.__wbg_ptr == 0) throw new Error('Attempt to use a moved value');
      _assertNum(this.__wbg_ptr);
      _assertClass(params, RequestParamObject);
      if (params.__wbg_ptr === 0) {
          throw new Error('Attempt to use a moved value');
      }
      const ret = wasm.context_request(this.__wbg_ptr, params.__wbg_ptr);
      return ret;
  } 

While the patched version generates this- So "params" in this case is a javascript object (although a Rust class in this case) (and so is this since the PR referred to in the PRs original description), notice the retained object is the root object not the __wbg_ptr etc that are indeed numeric, yes you're correct the return value is not a promise when at least the function signature is mut ref (it generates an additional getObject() accessor that does not appear to occur in other cases)

  request(params) {
     _assertClass(params, RequestParamObject);
     let that = this;
     const ret =
         (function() {
             return new Promise((resolve, reject) => {
                 that.__paramRefs = that._paramRefs || {};
                 that.__paramRefs[that.__wbg_ptr] = [that, params];
                 wasm.context_request(that.__wbg_ptr, params.__wbg_ptr).then(resolve).catch(reject).finally(() => {
                     delete that.__paramRefs[that.__wbg_ptr];
                 });
             })
         })();
     return ret;
  }

As for the suggestion to use reference counting it could work as this is the glue code (but I'm bit concerned about dependency cycles, unless the counting is done here as well),

However perhaps I was unclear exactly when and why this is an issue with the current wasm-bindgen.

Given the classes above RequestParamObject and Context (with the async method request)

This will work as expected with the current (non patched implementation, except it doesn't garbage collect the rust heap for RequestParamObject when it goes out of scope with 0.2.88/0.2.89)

async function performRequest(context, id) {
   const req = new RequestParamObject(id);
   await context.request(req);  
}
...
await performRequest(someContext, someId);
...

However if the code is written like this (which it unfortunately is in some third party applications that use our library)

async function performRequest(context, id) {
   const req = new RequestParamObject(id);
   return context.request(req)  
} 
await performRequest(someContext, someId); <-- req went out of scope when performRequest returned and the JS GC _could_ (if the browser decides to GC) 

Just to clarify; it's the "req" object that will be registered to the FinalizationRegistry and is the object that will be garbage collected early (which in the current implementation results in the Drop of the RequestParamObject is invoked and triggers the WasmRefCells reference counting as the "performRequest" function has not yet returned and this has an outstanding non mutable access to the object, while the Drop requests a mutable reference which triggers the "rust does not allow aliasing" exception to be throw.

But are you confident this can be resolved using a Descriptor for futures?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fail to see what you're talking about regarding arguments being numbers (when compiled with external refs)

Ah, sorry, I missed that you strip the .__wbg_ptr inside find_root_objects. My bad. I'm... not particularly a fan of using string manipulation like that, it feels a bit brittle to me if e.g. the pointer gets moved into a variable for whatever reason, but I'll ignore that for now.

yes you're correct the return value is not a promise when at least the function signature is mut ref (it generates an additional getObject() accessor that does not appear to occur in other cases)

I didn't realise you were compiling with reference types enabled; that's probably why you didn't notice this. When it's disabled, getObject() is always used, and so the return value is never already a promise.

As for the suggestion to use reference counting it could work as this is the glue code (but I'm bit concerned about dependency cycles, unless the counting is done here as well),

I don't think it should be possible to create dependency cycles. What I was proposing is to change the way exported Rust structs are stored in Rust from a Box<WasmRefCell<T>> to an Rc<WasmRefCell<T>>. Since that Rc is completely private, I don't think there's any way someone could get access to it and include it inside T to create a cycle.

It would be possible for it to contain a JsValue of its own JS wrapper, stopping it from being GC'd; but you can already do that today, that's nothing new.

But are you confident this can be resolved using a Descriptor for futures?

Yeah; to be honest though I think reference-counting is a better solution, and should be significantly simpler to implement.

I wasn't very clear with what I meant by 'add a Descriptor' though, so let me elaborate on that a bit.

Right now, the list of Instructions generated for an async function looks something like this:

[
    ...
    Instruction::CallExport(...),
    Instruction::ExternrefLoadOwned(...),
]

Right now the anti-GC code is being generated inside CallExport, when we want it to be called after ExternrefLoadOwned. So, we need to add some extra instructions onto the end when the function is async:

[
    ...
    Instruction::CallExport(...),
    Instruction::ExternrefLoadOwned(...),
    // Get a handle to the original arguments to the function (before they get turned into numbers); this avoids having to do any string manipulation.
    // It should be possible to only fetch the arguments which are actually Rust structs here by checking if they're of type `AdapterType::Struct`.
    Instruction::ArgGet(0),
    Instruction::ArgGet(1),
    ...
    // A new instruction which adds on a `.finally` that references all the args we just got via. `ArgGet`.
    Instruction::SaveArgs(...),
]

But then, we need some way to communicate to the function that generates instructions (Context::register_export_adapter) that the function it's processing is async, so that it can add those instructions. This is why I was suggesting to add a Descriptor; as a way of communicating that, which fits within the existing infrastructure and doesn't require e.g. threading an asyncness parameter down to it.

Again though I think it's probably a better idea to do reference counting.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your detailed explaination, I'll look into the Descriptor approach (and I'll explain why below) I understand and would also prefer making the wasmrefcell reference counted however I believe we're trying to solve two different problems here.

So let me reiterate once again.

  1. WasmObjects are not garbage collected if compiled with external refs (This could easily be split out to a separate PR and the situation would be as it was in 0.2.87)
  2. WasmObjects are garbage collected early if certain (valid and common) javascript patterns are used, this could be solved via reference counting as suggested)
  3. What I think I did not convey is that the "garbage collected early" is affects javascript objects as well (it's the fact that the JS GC garbage collects the object as it goes out of scope from it's point of view this issue surface to start with, it just so much more obvious when rust objects are involved as it triggers the mut/non mut aliasing check), It's also why the mechanics of the current PR is implemented as is, and why I believe the Descriptor approach is the only feasible soultion (as it would be analogue with the current solution just implemented in a better way)

So while I agree that ref counting on the wasm side would be a clean and relatively simple solution I don't see how it would eliminate the issue (unless you make your API so it may only accept wasm objects as input parameters to asynch methods if compiling with external-refs of course, but that would be rather intrusive I think?)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. What I think I did not convey is that the "garbage collected early" is affects javascript objects as well (it's the fact that the JS GC garbage collects the object as it goes out of scope from it's point of view this issue surface to start with, it just so much more obvious when rust objects are involved as it triggers the mut/non mut aliasing check), It's also why the mechanics of the current PR is implemented as is, and why I believe the Descriptor approach is the only feasible soultion (as it would be analogue with the current solution just implemented in a better way)

Huh, that's surprising. I assume by 'javascript objects' you mean JsValues and other imported JS types?

I wouldn't have thought that was possible, since all the JS values Rust has access to are kept in a global array (heap), which should stop them getting GC'd. I'd appreciate it if you could describe how to reproduce this so I can figure out what's going on, but if that's true then I agree that the Descriptor approach makes more sense.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all happy holidays, I apoligize for the late reply.

I'll retract my statement above and can only conclude that I must have confused myself when I was troubleshooting the issue.

I now made a standalone project (although not uploaded it) but you are absolutely correct, I'm not able to reproduce the problem with JsValues (external refs enabled), I don't see the JsValue being stored anywhere in the glue code though but it seems to generate a closure wrapper that does retain the reference.

So anyway either I confused myself and believed I saw something that never happened, or it's some special use case that I don't know how to get wasm-bindgen to generate in my test project- I'll disregard this for now and follow your other suggestion instead to make WasmRefCell reference counted instead. Thanks for your patience and detailed explaination, suggestions and ideas.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting for author Waiting for author to respond
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants