Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Usage of Oilpan / V8 C++ Garbage Collector #40786

Open
mcollina opened this issue Nov 11, 2021 · 27 comments
Open

Usage of Oilpan / V8 C++ Garbage Collector #40786

mcollina opened this issue Nov 11, 2021 · 27 comments
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. memory Issues and PRs related to the memory management or memory footprint.

Comments

@mcollina
Copy link
Member

I have recently read the article about Olipan: https://v8.dev/blog/oilpan-library.

It might be worthwhile investigating if it's something we could use inside Node.js C++ internals to simplify some of our memory management and wrapping logic.

@mcollina
Copy link
Member Author

cc @nodejs/v8 @addaleax @jasnell

@Mesteery Mesteery added the c++ Issues and PRs that require attention from people who are familiar with C++. label Nov 11, 2021
@joyeecheung
Copy link
Member

joyeecheung commented Nov 11, 2021

I think it's worth giving a try. IIUC, we didn't use EmbedderHeapTracer because that only works with BaseObjects that called MakeWeak(), Oilpan is more powerful so it should work with other BaseObjects (and other types of native objects that aren't using BaseObject) - I am not sure if this would necessarily make our memory management simpler though, since Oilpan itself seems more complex than what we use right now to track our native objects.

@joyeecheung joyeecheung added the memory Issues and PRs related to the memory management or memory footprint. label Nov 11, 2021
@DerekNonGeneric DerekNonGeneric changed the title Usage of Olipan / V8 C++ Garbage Collector Usage of Oilpan / V8 C++ Garbage Collector Nov 11, 2021
@mlippautz
Copy link

If you have some bindings use cases that are problematic to get right, or if you just want a GCed C++ object model, then Oilpan may indeed be a great fit.

I'd say the only caveat right now with Oilpan shipped through V8 is that it doesn't (yet) come with containers. std containers are supported but require using non-incremental tracing for the embedder (--no-incremental-marking-wrappers as of today but the name may change).

@joyeecheung You are right, EmbedderHeapTracer doesn't actually manage C++ memory which is why you generally need MakeWeak() as well. Oilpan is a full-fledged C++ GC, so yes, it is more complicated. However, none of that should be a concern to the embedder. The important piece there is that both types of references, V8->embedder and embedder->V8, are supported by default with V8 so there's no manual management involved there.

@SalvatorePreviti
Copy link

SalvatorePreviti commented Dec 5, 2021

I tried to used Oilpan (cppgc) in a node addon (node 16 and node 17 and also nightly build), but it seems to not be fully linked/exposed. While I can compile a node addon using it, I get runtime linking errors when I try to use it.

Error: dlopen(/xxxxxxxxxx/singularity.node, 0x0001): symbol not found in flat namespace '__ZN5cppgc8internal21RegisteredGCInfoIndexC1EPFvPvEb'
at Object.Module._extensions..node (node:internal/modules/cjs/loader:1179:18)

Is there a plan to support it?

@mcollina
Copy link
Member Author

mcollina commented Dec 5, 2021

Is there a plan to support it?

Not at this point. I'm not sure what it would entail for our ABI stability.

@SalvatorePreviti
Copy link

Makes sense. It seems Google is stabilising the cppgc API. Please, keep us updated, I am really interested in the possibility to use it on node add-ons :)
Thank you.

@mcollina
Copy link
Member Author

mcollina commented Dec 6, 2021

@SalvatorePreviti if you would like to open a PR, we can discuss it there.

@SalvatorePreviti
Copy link

SalvatorePreviti commented Dec 6, 2021

A small update, tried to build locally the latest version of node on master and write a unit test in test/addons - it seems that with that oilpan is properly linked and exposed and can be used.
However, cppgc::DefaultPlatform::InitializeProcess is not called yet by anybody, and this should be called once per process to be able to create an heap, and the platform should be exposed. Thinking to add it to ExecuteBootstrapper

For reference, this work in progress branch https://github.com/SalvatorePreviti/node/tree/expose-oilpan

@addaleax
Copy link
Member

addaleax commented Dec 6, 2021

However, cppgc::DefaultPlatform::InitializeProcess is not called yet by anybody, and this should be called once per process to be able to create an heap, and the platform should be exposed. Thinking to add it to ExecuteBootstrapper

This should probably go into InitializeOncePerProcess() if it is comparable to V8::Initialize() in when it should be called 👍

@SalvatorePreviti
Copy link

SalvatorePreviti commented Dec 6, 2021

Interestingly, there is deps/v8/include/v8_cppgc.h currently not exported by node.

This seems to be targeted to be used for a common GC heap between JS and C++.
https://chromium.googlesource.com/v8/v8.git/+/HEAD/include/v8-cppgc.h

AttachCppHeap in v8-isolate.h is marked as experimental
https://chromium.googlesource.com/v8/v8.git/+/HEAD/include/v8-isolate.h#940

From a first look it seems to work as a replacement to cppgc::DefaultPlatform::InitializeProcess

At the moment I am still not able to make it work, I get a Signal: 11

@SalvatorePreviti
Copy link

SalvatorePreviti commented Dec 6, 2021

It seems that this should wait a bit, it seems cppgc and v8_cppgc.h is currently under active development, last commit for v8_cppgc 5 days ago and for cppgc 5 hours ago.
However, from my current understanding of the APIs, and if they do not change, I believe we should initialize a shared CppHeap from v8_cppgc.h at process startup and set it every time to an Isolate when it is relevant to do so (or always)

@SalvatorePreviti
Copy link

@mlippautz maybe can give some information or make a bit of light on this and the stability of cppgc and v8_cppgc.h?

@mlippautz
Copy link

cppgc has shipped in v9.4 for Blink where it's used as the production garbage collector for C++.

We are actively working on cppgc and its APIs but everything that you can find on the public API surface in e.g.include/cppgc/* or in include/v8-cppgc.h is considered stable and follows V8's general API stability. In other words, as for all other APIs, they are not set in stone but will follow the regular deprecation cycles and we will try hard to allow a smooth migration when removing things.

@owinebar
Copy link

owinebar commented Jan 22, 2022

@mlippautz - I've been going through the doc and source for the last few days and wonder if part of the issue is that the V8 embedding documentation is out of date. It says there are basically 2 types allowed to reference "JavaScript objects", Local and Persistent. I don't think that is true now.

First, the term "JavaScript objects" is a bit misleading. The subject should really be "objects managed by the V8 GC", of which some are owned by V8 ("proper JS objects") and others are owned by emdedders and subclassed from the GarbageCollected template. The former are only on the stack wrapped by Local handles, where the latter must never be allocated on the stack and only as raw pointers.

For now, the definitive reference for v8 embedders that want to either use GarbageCollected objects or export that capability would seem to be Blink's embedding of cppgc as provided by V8.

@SalvatorePreviti As for the initialization segfault, there is a test program "cppgc_hello_world" in v8. When built as part of building v8 targets, this program segfaults. When built with the "cppgc_is_standalone" option (which forbids building any targets other than cppgc or related tests), that test program prints the expected message. If you look at Blink's process_heap.cc, it just calls gin::InitializeCppgcFromV8Platform() which just guards against calling the initializer more than once, and calls cppgc::InitializeProcess on the first attempt.

@owinebar
Copy link

I opened a documentation issue on Monorail for the misalignment of the docs (including code comments) with the code.
The segfault was reported a while ago as Issue 12427, but it wasn't noted that the test succeeds when cppgc is built as a standalone library.

@owinebar
Copy link

owinebar commented Jan 28, 2022

@mcollina The active release of node is pretty far behind V8 with respect to cppgc. The include/cppgc/README.md file in 16.13.12 is 2 years old, from when oilpan was first introduced in V8. 17.4.0 has updated that to a version from a few months ago, but some of the simplifications in the current V8 interface for putting V8 objects into managed C++ objects have not been pulled in yet. It looks like @targos has been bringing the newer cppgc code into the current node.
@mlippautz has been kindly explaining how embedders can introduce "managed C++" to JS edges and vice versa at cppgc documentation issue on monorail . That use case appears to have been significantly streamlined. Now JS objects can be introduced into managed C++ objects using a Traced referencev8::object without any intervening persistent or global handle that would be treated as a root by the GC
I imagine the big question for node developers is how you want to define BaseObject. Do you continue with the reference counting scheme on a global handle, or make it a managed C++ object? I would think it's only controversial to the extent it could be a barrier to supporting other JS runtimes with the same ABI. The standalone version of cppgc might ameliorate that concern, I don't know. Otherwise it seems like a huge win to me.

@mcollina
Copy link
Member Author

@mcollina The active release of node is pretty far behind V8 with respect to cppgc. The include/cppgc/README.md file in 16.13.12 is 2 years old, from when oilpan was first introduced in V8. 17.4.0 has updated that to a version from a few months ago, but some of the simplifications in the current V8 interface for putting V8 objects into managed C++ objects have not been pulled in yet. It looks like @targos has been bringing the newer cppgc code into the current node.

@mlippautz has been kindly explaining how embedders can introduce "managed C++" to JS edges and vice versa at cppgc documentation issue on monorail . That use case appears to have been significantly streamlined. Now JS objects can be introduced into managed C++ objects using a Traced referencev8::object without any intervening persistent or global handle that would be treated as a root by the GC

This looks like a massive win!

I imagine the big question for node developers is how you want to define BaseObject. Do you continue with the reference counting scheme on a global handle, or make it a managed C++ object? I would think it's only controversial to the extent it could be a barrier to supporting other JS runtimes with the same ABI. The standalone version of cppgc might ameliorate that concern, I don't know. Otherwise it seems like a huge win to me.

We are current not supporting other JS runtimes anymore.
A PoC would be really useful.

@addaleax @jasnell wdyt?

@addaleax
Copy link
Member

@mcollina I think it’s fine to try this out for sure.

@owinebar
Copy link

owinebar commented Jan 28, 2022 via email

@owinebar
Copy link

owinebar commented Nov 5, 2022

Has there been any work on this?

dharesign added a commit to dharesign/node that referenced this issue Dec 1, 2022
Initialize cppgc, create a CppHeap, and attach it to the Isolate.

This allows C++ addons to start using cppgc to manage objects.

Refs: nodejs#40786
dharesign added a commit to dharesign/node that referenced this issue Dec 1, 2022
Initialize cppgc, create a CppHeap, and attach it to the Isolate.

This allows C++ addons to start using cppgc to manage objects.

Refs: nodejs#40786
@dharesign
Copy link
Contributor

We are making use of cppgc in node add-ons. As a result we have modified tools/install.py to include all the cppgc related headers. We are initializing cppgc in our add-on, rather than modifying Node.js to initialize it, but we can change that. I created a PR to get the ball rolling: #45704

dharesign added a commit to dharesign/node that referenced this issue Dec 9, 2022
Initialize cppgc, create a CppHeap, and attach it to the Isolate.

This allows C++ addons to start using cppgc to manage objects.

Refs: nodejs#40786
dharesign added a commit to dharesign/node that referenced this issue Dec 12, 2022
Initialize cppgc, create a CppHeap, and attach it to the Isolate.

This allows C++ addons to start using cppgc to manage objects.

Refs: nodejs#40786
dharesign added a commit to dharesign/node that referenced this issue Dec 12, 2022
Initialize cppgc, create a CppHeap, and attach it to the Isolate.

This allows C++ addons to start using cppgc to manage objects.

Refs: nodejs#40786
joyeecheung pushed a commit to joyeecheung/node that referenced this issue Jul 5, 2023
Initialize cppgc, create a CppHeap, and attach it to the Isolate.

This allows C++ addons to start using cppgc to manage objects.

Refs: nodejs#40786
joyeecheung added a commit to dharesign/node that referenced this issue Jul 5, 2023
Initialize cppgc, create a CppHeap, and attach it to the Isolate.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
joyeecheung added a commit to dharesign/node that referenced this issue Jul 5, 2023
Initialize cppgc, create a CppHeap, and attach it to the Isolate.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
joyeecheung added a commit to dharesign/node that referenced this issue Jul 5, 2023
Initialize cppgc, create a CppHeap, and attach it to the Isolate.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
joyeecheung added a commit to dharesign/node that referenced this issue Jul 21, 2023
Initialize cppgc, create a CppHeap, and attach it to the Isolate.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
@joyeecheung
Copy link
Member

joyeecheung commented Jul 21, 2023

I drafted a design doc on how to integrate Oilpan into Node.js: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit

I also did a POC by migrating CompiledFnEntry while I was testing #45704, although CompiledFnEntry is probably going away anyway because of #48510 but it was good exercise and helped me a bit when figuring out the design. The next target, as described in the doc, would be AliasedBuffers.

joyeecheung added a commit to dharesign/node that referenced this issue Jul 21, 2023
Initialize cppgc, create a CppHeap, and attach it to the Isolate.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
joyeecheung added a commit to dharesign/node that referenced this issue Jul 21, 2023
This patch:

- Initializes cppgc in InitializeOncePerProcess() when
  kNoInitializeCppgc is not set
- Create a CppHeap and attach it to the Isolate when
  there isn't one already during IsolateData initialization.
  The CppHeap is detached and terminated when IsolateData
  is freed.
- Publishes the cppgc headers in the tarball.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
@mcollina
Copy link
Member Author

I reviewed this and the plan is sound to fix the problem with ShadowRealm.
I think it might also give us a bit more garbage collection performance.

joyeecheung added a commit to joyeecheung/node that referenced this issue Aug 2, 2023
This patch:

- Initializes cppgc in InitializeOncePerProcess() when
  kNoInitializeCppgc is not set
- Create a CppHeap and attach it to the Isolate when
  there isn't one already during IsolateData initialization.
  The CppHeap is detached and terminated when IsolateData
  is freed.
- Publishes the cppgc headers in the tarball.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
nodejs-github-bot pushed a commit that referenced this issue Aug 11, 2023
This patch:

- Initializes cppgc in InitializeOncePerProcess() when
  kNoInitializeCppgc is not set
- Create a CppHeap and attach it to the Isolate when
  there isn't one already during IsolateData initialization.
  The CppHeap is detached and terminated when IsolateData
  is freed.
- Publishes the cppgc headers in the tarball.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: #40786
PR-URL: #45704
Refs: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
martenrichter pushed a commit to martenrichter/node that referenced this issue Aug 13, 2023
This patch:

- Initializes cppgc in InitializeOncePerProcess() when
  kNoInitializeCppgc is not set
- Create a CppHeap and attach it to the Isolate when
  there isn't one already during IsolateData initialization.
  The CppHeap is detached and terminated when IsolateData
  is freed.
- Publishes the cppgc headers in the tarball.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
PR-URL: nodejs#45704
Refs: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
Ceres6 pushed a commit to Ceres6/node that referenced this issue Aug 14, 2023
This patch:

- Initializes cppgc in InitializeOncePerProcess() when
  kNoInitializeCppgc is not set
- Create a CppHeap and attach it to the Isolate when
  there isn't one already during IsolateData initialization.
  The CppHeap is detached and terminated when IsolateData
  is freed.
- Publishes the cppgc headers in the tarball.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
PR-URL: nodejs#45704
Refs: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
Ceres6 pushed a commit to Ceres6/node that referenced this issue Aug 14, 2023
This patch:

- Initializes cppgc in InitializeOncePerProcess() when
  kNoInitializeCppgc is not set
- Create a CppHeap and attach it to the Isolate when
  there isn't one already during IsolateData initialization.
  The CppHeap is detached and terminated when IsolateData
  is freed.
- Publishes the cppgc headers in the tarball.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
PR-URL: nodejs#45704
Refs: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
joyeecheung added a commit to joyeecheung/node that referenced this issue Aug 15, 2023
This patch:

- Initializes cppgc in InitializeOncePerProcess() when
  kNoInitializeCppgc is not set
- Create a CppHeap and attach it to the Isolate when
  there isn't one already during IsolateData initialization.
  The CppHeap is detached and terminated when IsolateData
  is freed.
- Publishes the cppgc headers in the tarball.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
PR-URL: nodejs#45704
Refs: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
joyeecheung added a commit to joyeecheung/node that referenced this issue Aug 15, 2023
This patch:

- Initializes cppgc in InitializeOncePerProcess() when
  kNoInitializeCppgc is not set
- Create a CppHeap and attach it to the Isolate when
  there isn't one already during IsolateData initialization.
  The CppHeap is detached and terminated when IsolateData
  is freed.
- Publishes the cppgc headers in the tarball.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
PR-URL: nodejs#45704
Refs: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
joyeecheung added a commit to joyeecheung/node that referenced this issue Aug 17, 2023
This patch:

- Initializes cppgc in InitializeOncePerProcess() when
  kNoInitializeCppgc is not set
- Create a CppHeap and attach it to the Isolate when
  there isn't one already during IsolateData initialization.
  The CppHeap is detached and terminated when IsolateData
  is freed.
- Publishes the cppgc headers in the tarball.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: nodejs#40786
PR-URL: nodejs#45704
Refs: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
UlisesGascon pushed a commit that referenced this issue Aug 18, 2023
Original commit message:

    [cppgc] expose wrapper descriptor on CppHeap

    This makes it possible for embedders to:

    1. Avoid creating wrapper objects that happen to have a layout that
      leads V8 to consider the object cppgc-managed while it's not.
      Refs: #43521
    2. Create cppgc-managed wrapper objects when they do not own the
       CppHeap. Refs: #45704

    Bug: v8:13960
    Change-Id: If31f4d56c5ead59dc0d56f937494d23d631f7438
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/4598833
    Reviewed-by: Michael Lippautz <mlippautz@chromium.org>
    Commit-Queue: Michael Lippautz <mlippautz@chromium.org>
    Cr-Commit-Position: refs/heads/main@{#88490}

Refs: v8/v8@9327503
PR-URL: #48660
Backport-PR-URL: #49187
Reviewed-By: Chengzhong Wu <legendecas@gmail.com>
Reviewed-By: Jiawen Geng <technicalcute@gmail.com>
Refs: #40786
Refs: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit
UlisesGascon pushed a commit that referenced this issue Aug 18, 2023
Previously we hard-code a wrapper id to be used in BaseObjects
to avoid accidentally triggering cppgc on these non-cppgc-managed
objects, but hard-coding can be be hacky and result in mismatch
when we start to create CppHeap ourselves. This patch makes it
more robust by deducing non-cppgc id from the effective cppgc id,
if there is one.

PR-URL: #48660
Backport-PR-URL: #49187
Refs: v8/v8@9327503
Reviewed-By: Chengzhong Wu <legendecas@gmail.com>
Reviewed-By: Jiawen Geng <technicalcute@gmail.com>
Refs: #40786
Refs: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit
UlisesGascon pushed a commit that referenced this issue Aug 18, 2023
This patch:

- Initializes cppgc in InitializeOncePerProcess() when
  kNoInitializeCppgc is not set
- Create a CppHeap and attach it to the Isolate when
  there isn't one already during IsolateData initialization.
  The CppHeap is detached and terminated when IsolateData
  is freed.
- Publishes the cppgc headers in the tarball.

This allows C++ addons to start using cppgc to manage objects.

A helper node::SetCppgcReference() is also added to help addons
enable cppgc tracing in a user-defined object.

Co-authored-by: Joyee Cheung <joyeec9h3@gmail.com>
Refs: #40786
PR-URL: #45704
Backport-PR-URL: #49187
Refs: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
PR-URL: #48660
Refs: v8/v8@9327503
@joyeecheung
Copy link
Member

With CompiledFnEntry already gone, I went on to experiment migration on a few internal classes. I have a WIP branch to figure out a better strategy of migration and track what internal utilities need to be updated/reinvented. I will update more detailed findings in the doc, but just to update a finding of mine - I think it's better to migrate in this order (which is actually the opposite of my original plan):

  • First go with "detached BaseObjects" - ones that get released when JS part gets released i.e. the ones that call MakeWeak(), especially the ones that don't have additional references from C++ land
  • Then the long-lived ones, which usually have additional management/references abstracted but those abstractions tend to be centralized so it's easier to migrate than the last one e.g. BindingData, AliasedBuffer
  • Then the ones that have strong persistent handles (since the migration away from ref-counting probably needs to be rolled out gradually and we need to figure out the interop of the two)

There are still several helpers I need to abstract out, especially for integrations of heap snapshots/startup snapshots. I may send a PR soon-ish when I think they are stable/generic enough.

@joyeecheung
Copy link
Member

joyeecheung commented Dec 3, 2023

Some interesting update: I noticed that moving from weak BaseObject to Oilpan-management (without cleanup queue book-keeping) is about 2.5x faster:

misc/object-wrap.js method="ExampleCppgcObject" n=1000000: 8,113,612.185256452
misc/object-wrap.js method="ExampleBaseObject" n=1000000: 3,218,022.6465318813

The BaseObject overhead can actually become bottlenecks. I opened #51017 to show a POC of migrating crypto::Hash to Oilpan with a noticeable speedup with some helpers that I've used to test migration on various objects. (I think the overhead partly comes from global handle initialization and partly comes from cleanup queue book-keeping. Haven't looked deeper into it yet as those tend to be inlined)

@owinebar
Copy link

I drafted a design doc on how to integrate Oilpan into Node.js: https://docs.google.com/document/d/1ny2Qz_EsUnXGKJRGxoA-FXIE2xpLgaMAN6jD7eAkqFQ/edit

Thanks for this - it's useful even for non-node embedders.

@joyeecheung
Copy link
Member

joyeecheung commented Mar 31, 2024

Some updates:

  • In wip: crypto: use cppgc to manage Hash #51017 we see a 27% speedup in crypto.createHash() performance after migration, and ~10% speedup in the whole hashing performance in some cases. But it's currently blocked by the second point.
  • Upon discussion in the design doc it looks like we'll need a new API in V8 to track externally managed memory held by cppgc-based objects in the heap snapshots, otherwise those information would be lost. This affects most of our native objects as they usually maintains a bit of externally managed memory. Still figuring out how it should look like.
  • Some objects hold some sort of v8::Global<v8::Data> alive and I think they would benefit the most from the migration because it has been very tricky to get their lifetime management correct via v8::Global. I have a hacky implementation wip: migrate ContextifyScript to cppgc #52295 that involves a pretty breaking V8 API change for migrating ContextifyScript - the memory tests are still passing and there is a bit of performance boost (numbers are after I changed the benchmark fixture to compile a tiny script). This also will allow us to track v8::Data in the heap snapshot. Working on an upstream patch.
                                                                                                                                    confidence improvement accuracy (*)   (**)  (***)
vm/compile-script-in-isolate-cache.js n=1000 filename='test/fixtures/snapshot/typescript.js' type='with-dynamic-import-callback'                   -0.91 %       ±1.84% ±2.45% ±3.20%
vm/compile-script-in-isolate-cache.js n=1000 filename='test/fixtures/snapshot/typescript.js' type='without-dynamic-import-callback'                -2.79 %       ±4.24% ±5.70% ±7.55%
vm/compile-script-in-isolate-cache.js n=1000 filename='test/fixtures/syntax/good_syntax.js' type='with-dynamic-import-callback'                     3.39 %       ±4.28% ±5.70% ±7.42%
vm/compile-script-in-isolate-cache.js n=1000 filename='test/fixtures/syntax/good_syntax.js' type='without-dynamic-import-callback'         ***      6.93 %       ±3.08% ±4.11% ±5.35%
  • It's not all sunshine and rainbows though. It seems some APIs can take a perf hit after the migration - e.g. V8 serdes classes. I suspect this is very related to the way we notify V8 about externally-managed array buffers and we are probably under-reporting and this is amplified once we use V8 to trace objects that hold them alive. Of course stats are based on microbenchmarks so it's hard to say whether they are really valid but it does seem after migration V8 spends more time on GC triggered by ArrayBuffer allocation, and even the maximum memory used increased. Still investigating why (it also seems if I add some ad-hoc Isolate::AdjustAmountOfExternalAllocatedMemory() calls, there is a performance boost again, so I am pretty sure it's our pre-existing bug)
See regression numbers
                                                                   confidence improvement accuracy (*)   (**)  (***)
v8/deserialize-array.js n=10000 len=1024 type='array'                     ***    -40.29 %       ±1.62% ±2.15% ±2.80%
v8/deserialize-array.js n=10000 len=1024 type='bigint-typed-array'        ***    -22.46 %       ±1.11% ±1.48% ±1.93%
v8/deserialize-array.js n=10000 len=1024 type='typed-array'               ***     -6.79 %       ±3.07% ±4.11% ±5.41%
v8/deserialize-array.js n=10000 len=16 type='array'                       ***     -8.25 %       ±1.63% ±2.17% ±2.83%
v8/deserialize-array.js n=10000 len=16 type='bigint-typed-array'          ***     -9.01 %       ±2.56% ±3.41% ±4.46%
v8/deserialize-array.js n=10000 len=16 type='typed-array'                 ***     -6.63 %       ±2.40% ±3.22% ±4.23%
v8/deserialize-array.js n=10000 len=256 type='array'                      ***    -13.26 %       ±1.07% ±1.43% ±1.86%
v8/deserialize-array.js n=10000 len=256 type='bigint-typed-array'         ***    -12.58 %       ±1.80% ±2.40% ±3.13%
v8/deserialize-array.js n=10000 len=256 type='typed-array'                ***     -7.78 %       ±2.55% ±3.41% ±4.48%
v8/deserialize-values.js n=100000 type='object'                           ***    -25.36 %       ±0.82% ±1.09% ±1.42%
v8/deserialize-values.js n=100000 type='string'                           ***    -10.81 %       ±0.94% ±1.25% ±1.63%
v8/serialize-array.js n=10000 len=1024 type='array'                       ***     -6.28 %       ±1.14% ±1.52% ±1.98%
v8/serialize-array.js n=10000 len=1024 type='bigint-typed-array'          ***    -23.61 %       ±1.07% ±1.43% ±1.86%
v8/serialize-array.js n=10000 len=1024 type='typed-array'                 ***    -30.91 %       ±1.53% ±2.05% ±2.68%
v8/serialize-array.js n=10000 len=16 type='array'                         ***    -24.23 %       ±1.63% ±2.17% ±2.82%
v8/serialize-array.js n=10000 len=16 type='bigint-typed-array'            ***    -32.05 %       ±1.49% ±1.99% ±2.60%
v8/serialize-array.js n=10000 len=16 type='typed-array'                   ***    -25.83 %       ±1.33% ±1.78% ±2.31%
v8/serialize-array.js n=10000 len=256 type='array'                        ***    -14.19 %       ±1.44% ±1.92% ±2.51%
v8/serialize-array.js n=10000 len=256 type='bigint-typed-array'           ***    -29.26 %       ±1.51% ±2.01% ±2.63%
v8/serialize-array.js n=10000 len=256 type='typed-array'                  ***    -33.04 %       ±1.52% ±2.03% ±2.66%
v8/serialize-values.js n=100000 type='object'                             ***    -22.56 %       ±2.27% ±3.03% ±3.97%
v8/serialize-values.js n=100000 type='string'                             ***    -22.43 %       ±1.70% ±2.27% ±2.99%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. memory Issues and PRs related to the memory management or memory footprint.
Projects
None yet
Development

No branches or pull requests

8 participants