
N-API: memory leak with call to napi_create_function #28988

Closed
anthony-tuininga opened this issue Aug 6, 2019 · 38 comments
Labels
node-api Issues and PRs related to the Node-API.

Comments

@anthony-tuininga (Contributor)

anthony-tuininga commented Aug 6, 2019

  • Version: v12.8.0
  • Platform: Linux atc 5.1.20-300.fc30.x86_64 #1 SMP Fri Jul 26 15:03:11 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Subsystem:

This is again with respect to the node-oracledb driver (Oracle Database driver for Node.js). I have adjusted the code once again to eliminate any requirement for an Oracle database. I'll attach the test case shortly after this issue is created.

For Oracle Objects I have created a class dynamically at run-time based on the name. This can be seen in the test case as a call to conn.getDbObjectClass(). Internally, the call to conn.getDbObjectClass() calls into a C module which then makes a call to conn._getDbObjectClassJS(). This function then builds the class and stores the result in a cache on the connection called _dbObjectClasses.

When the connection is closed by a call to conn.close(), I have to iterate over the entries in the cache and deliberately break the prototype chain; otherwise, the classes that are built are never garbage collected.
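The cache-and-teardown pattern described above can be sketched in plain JavaScript. This is an illustrative model only — the class and method names are hypothetical stand-ins, not node-oracledb's actual internals:

```javascript
// Sketch of a per-connection cache of dynamically built classes whose
// prototype chains are deliberately broken on close so the classes
// can be garbage collected. Names are illustrative.
class Connection {
  constructor() {
    this._dbObjectClasses = new Map();
  }
  _getDbObjectClass(name) {
    let cls = this._dbObjectClasses.get(name);
    if (!cls) {
      // In the real driver the class is built in C via N-API calls.
      cls = class DbObject {};
      this._dbObjectClasses.set(name, cls);
    }
    return cls;
  }
  close() {
    // Break each prototype chain so the cached classes no longer pin
    // their parents and can be collected.
    for (const cls of this._dbObjectClasses.values()) {
      Object.setPrototypeOf(cls.prototype, null);
      Object.setPrototypeOf(cls, null);
    }
    this._dbObjectClasses.clear();
  }
}
```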

But even if I do that and the classes are garbage collected, there is still a memory leak. The pattern shows periods of relatively stable memory usage followed by jumps in memory usage. The periods of stable memory usage grow longer as the number of iterations increases, but the jumps in memory also get bigger. It would seem that a list of some kind is being populated with intermittent, increasingly large allocations. Looking at the heap memory in the Chrome development tools indicates that the memory is all found in noscript_shared_function_infos -- but I'm not sure what that means! Interestingly, though, if I remove the call to napi_define_properties() in njsDbObjectType_populate(), the memory leak goes away.

I'm not sure whether this is related to the fix introduced in PR #27805, but version 12.0 is the first version that has this issue. Prior to 12.0 the code suffers from the memory leak that that PR corrected.

I hope this is sufficient to discover the source of this issue. Let me know if you need anything else!

@anthony-tuininga (Contributor, Author)

anthony-tuininga commented Aug 6, 2019

The test case can be seen in the zip file. Included is a sample stats.txt that demonstrates the memory leak.

Run the tests with

NODE_PATH=$PWD/lib node --expose-gc demo.js

@addaleax added the node-api label Aug 6, 2019
@anthony-tuininga (Contributor, Author)

I just tried this with Node.js 12.8 (just released) and the issue is still there, unfortunately.

@anthony-tuininga changed the title from "N-API: memory leak with descriptors" to "N-API: memory leak with call to napi_create_function" Aug 9, 2019
@anthony-tuininga (Contributor, Author)

Interestingly, a call to napi_define_properties() is not required. A simple call to napi_create_function() is all that is required, even if the resulting function is never tied to any other object.

@anthony-tuininga (Contributor, Author)

anthony-tuininga commented Aug 9, 2019

Here is a much simplified package that contains only a single method called leak_memory(), and all it does is call napi_create_function(). It leaks memory quite rapidly!

issue_28988_simplified.zip

Run it with

node --expose-gc demo.js

You can wait until it finishes or stop it after a few hundred thousand iterations and then examine the file stats.txt, or you can look at the output from the top command.

@legendecas (Member)

> Here is a much simplified package that contains only a single method called leak_memory(), and all it does is call napi_create_function(). It leaks memory quite rapidly!

The major objects retained in the heap in this case were TickObjects, which get collected once they have run.

After changing the while loop in the function run to a setInterval loop, the seemingly major leaking behavior was gone.

There is still a minor increase in heap size in noscript_shared_function_infos, though.

@anthony-tuininga (Contributor, Author)

Which version did you use? I am using Node.js 12.8. I modified demo.js to look like this:

const issue_28988 = require('./build/Release/issue_28988.node');
const fs = require('fs');

const statsFileName = "stats.txt";
const maxIters = 2915000;

let numIters = 0;

function performIter() {
  for (let i = 0; i < 1000; i++) {
    numIters++;
    issue_28988.leak_memory();
  }
  console.log("Processed", numIters, "iterations...");
  global.gc();
  const stats = process.memoryUsage();
  const text1 = `${numIters},${stats.rss},${stats.heapTotal},`;
  const text2 = `${stats.heapUsed},${stats.external}\n`;
  fs.appendFileSync(statsFileName, text1 + text2);
}


async function run() {
  fs.writeFileSync(statsFileName, "Num Iters,RSS,Heap Total,Heap Used,Ext\n");
  setInterval(performIter, 0);
}

run()

and I didn't see a whole lot of difference between the two. I see a leak of about 12 bytes/iteration, but there are periods of stability followed by a jump. The periods of stability lengthen, but the jumps also get bigger. 12 bytes doesn't seem like a lot... until you realize that if you build a class with a number of getters/setters and specialized functions, that adds up quickly. If you adjust the call to the leak_memory function to do the following instead:

    for (let j = 0; j < 20; j++) {
      issue_28988.leak_memory();
    }

then the leak jumps to about 250 bytes/iteration. This quite closely matches the behaviour I was seeing with the full test case, which probably calls napi_create_function() internally about 20 times.
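As a sanity check on the arithmetic above, the per-iteration leak rate can be estimated from any two rows of stats.txt. This helper is hypothetical (not part of the test case); the field names mirror the columns logged by performIter():

```javascript
// Estimate bytes leaked per iteration from two stats.txt rows
// (numIters and RSS columns). GC noise makes single pairs unreliable;
// the ~12 bytes/iteration figure comes from the slope over many rows.
function leakRatePerIter(rowA, rowB) {
  return (rowB.rss - rowA.rss) / (rowB.numIters - rowA.numIters);
}

// With 20 napi_create_function() calls per iteration, a ~12 byte/call
// leak shows up as roughly 20 * 12 = 240 bytes per iteration, in line
// with the ~250 bytes/iteration observed.
```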

@mhdawson (Member)

mhdawson commented Aug 14, 2019

I seem to remember @gabrielschulhof mentioning that V8 never collects functions, but I'll have to confirm whether that is the case and whether we have it documented somewhere.

EDIT: functions created in native code that is.

@mhdawson (Member)

mhdawson commented Aug 14, 2019

I think @gabrielschulhof is still away for another week, and I've not been able to find documentation or earlier issues where that was discussed. @hashseed, can you confirm one way or the other whether V8 collects functions created in native code?

@addaleax (Member)

V8 has something called “template instantiations cache” whose entries refer back to the Function instances indexed by their FunctionTemplates – I could imagine the underlying issue here is that those entries are not cleared when the FunctionTemplate is garbage collected. But this may require a V8 person to confirm and fix.

@hashseed (Member)

hashseed commented Aug 15, 2019

@verwaest knows the details here.

Yeah I think we keep instantiated functions around for faster instantiations.

@mhdawson (Member)

@verwaest can you confirm?

@verwaest (Contributor)

FunctionTemplate::New results in cached templates, whereas Function::New creates an uncached template.

Cached here doesn't mean it's a weak pointer: in the browser, for correctness, they /need/ to be unique instances that have identity. Since they are mutable objects like any other, this is observable, and the fact that they aren't instantiated eagerly is simply a memory and performance optimization. So the cache isn't there (just) for performance; it's actually needed for correctness in the browser.

This cached way of using templates is possibly not what you want. Doesn't Function::New fit your use case?

@anthony-tuininga (Contributor, Author)

> This cached way of using templates is possibly not what you want. Doesn't Function::New fit your use case?

I am not using the C++ API directly, but indirectly through N-API. Is there a way to create a function with Function::New() using N-API?

@addaleax (Member)

napi_create_function() uses Function::New, so that might not be the issue then.

@mhdawson (Member)

This seems to be the code for the Function::New that we call, in js_native_api.cc:

...
 v8::MaybeLocal<v8::Function> maybe_function =
      v8::Function::New(context,
                        v8impl::FunctionCallbackWrapper::Invoke,
                        cbdata);
...

api.cc (in v8 source)

MaybeLocal<Function> Function::New(Local<Context> context,
                                   FunctionCallback callback, Local<Value> data,
                                   int length, ConstructorBehavior behavior,
                                   SideEffectType side_effect_type) {
  i::Isolate* isolate = Utils::OpenHandle(*context)->GetIsolate();
  LOG_API(isolate, Function, New);
  ENTER_V8_NO_SCRIPT_NO_EXCEPTION(isolate);
  auto templ =
      FunctionTemplateNew(isolate, callback, data, Local<Signature>(), length,
                          true, Local<Private>(), side_effect_type);
  if (behavior == ConstructorBehavior::kThrow) templ->RemovePrototype();
  return templ->GetFunction(context);
}

which looks like it is using function templates.

@mhdawson (Member)

In

FunctionTemplateNew(isolate, callback, data, Local<Signature>(), length,
                          true, Local<Private>(), side_effect_type);

we get the default values for everything after data.

@mhdawson (Member)

mhdawson commented Aug 26, 2019

@verwaest is it possible that we'd need to pass a different combination of parameters to get uncached templates, or does FunctionTemplateNew create the "uncached" templates by default if you don't include the optional parameters?

If it's not that, then my other guess might be something related to the management of the callback data:

js_native_api.cc

v8::Local<v8::Value> cbdata =
      v8impl::CreateFunctionCallbackData(env, cb, callback_data);

@mhdawson (Member)

The callback bundles look to be bound and then deleted in equal amounts, with the deletes running about 1000 behind in the runs I did, likely because they would be collected in the next GC.

@mhdawson (Member)

Watching the heap, it does seem to be increasing, and if I limit it with --max-old-space-size it does run out of memory...

@mhdawson (Member)

If I effectively make napi_create_function() a no-op (after the first invocation, so the demo still runs), then we don't see the memory associated with the heap increasing.

@mhdawson (Member)

So the difference in terms of memory increase in the heap seems to come down to these lines. If I comment them out, the heap stays consistent. If I uncomment them, I see the heap (as reported by --trace-gc) increase, and with --max-old-space-size=20 the demo fails with an out-of-memory error.

  v8::Local<v8::Value> cbdata;
  v8::Local<v8::Context> context = env->context();
  v8::MaybeLocal<v8::Function> maybe_function =
      v8::Function::New(context,
                        Invoke,
                        cbdata);

With Invoke being

static void Invoke(const v8::FunctionCallbackInfo<v8::Value>& info) {
}

@mhdawson (Member)

Full hacked version of napi_create_function in js_native_api_v8.cc in case anybody else wants to try to recreate:

static int count = 0;

static void Invoke(const v8::FunctionCallbackInfo<v8::Value>& info) {
}

napi_status napi_create_function(napi_env env,
                                 const char* utf8name,
                                 size_t length,
                                 napi_callback cb,
                                 void* callback_data,
                                 napi_value* result) {
  if (count < 1) {
    count++;
    NAPI_PREAMBLE(env);
    CHECK_ARG(env, result);
    CHECK_ARG(env, cb);

    v8::Isolate* isolate = env->isolate;
    v8::Local<v8::Function> return_value;
    v8::EscapableHandleScope scope(isolate);
    v8::Local<v8::Value> cbdata =
        v8impl::CreateFunctionCallbackData(env, cb, callback_data);

    RETURN_STATUS_IF_FALSE(env, !cbdata.IsEmpty(), napi_generic_failure);

    v8::Local<v8::Context> context = env->context();
    v8::MaybeLocal<v8::Function> maybe_function =
        v8::Function::New(context,
                          v8impl::FunctionCallbackWrapper::Invoke,
                          cbdata);
    CHECK_MAYBE_EMPTY(env, maybe_function, napi_generic_failure);

    return_value = scope.Escape(maybe_function.ToLocalChecked());

    if (utf8name != nullptr) {
      v8::Local<v8::String> name_string;
      CHECK_NEW_FROM_UTF8_LEN(env, name_string, utf8name, length);
      return_value->SetName(name_string);
    }

    *result = v8impl::JsValueFromV8LocalValue(return_value);

    return GET_RETURN_STATUS(env);
  } else {
    v8::Local<v8::Value> cbdata;
    v8::Local<v8::Context> context = env->context();
    v8::MaybeLocal<v8::Function> maybe_function =
        v8::Function::New(context,
                          Invoke,
                          cbdata);
    *result = NULL;
    return napi_ok;
  }
}

@mhdawson (Member)

My experiments were using node master (probably from a few weeks ago) with these v8 version numbers:

#define V8_MAJOR_VERSION 7
#define V8_MINOR_VERSION 6
#define V8_BUILD_NUMBER 303
#define V8_PATCH_LEVEL 28

@mhdawson (Member)

@verwaest looks like simply calling v8::Function::New is resulting in objects in the heap, and we actually fail with 'JavaScript heap out of memory' if we limit to a lower heap size. I believe I've taken all of the N-API code out of the path.

@mhdawson (Member)

mhdawson commented Sep 4, 2019

@verwaest, @hashseed can you help confirm if what I've seen matches expected behaviour or not?

@gabrielschulhof (Contributor)

@mhdawson I ran the following code as both a Node.js addon and as a V8 cctest:

// ---8<------------------------------------------------------------------------
struct WeakRefCounters {
  inline void inc_created() {
    created++;
    maybe_print();
  }
  inline void inc_collected() {
    collected++;
    maybe_print();
  }
  inline void maybe_print() {
    if (!(created % 1000)) {
      fprintf(stderr, "%lu - %lu = %lu\n", created, collected,
        created - collected);
    }
  }
  size_t created;
  size_t collected;
};

class Ref {
 public:
  Ref(v8::Isolate* isolate,
      v8::Local<v8::Function> fn,
      WeakRefCounters* counters):
    pers(isolate, fn), counters(counters) {
    pers.SetWeak(this, WeakCallback, v8::WeakCallbackType::kParameter);
  }
  ~Ref() {
    pers.Reset();
    counters->inc_collected();
  }
  static void WeakCallback(const v8::WeakCallbackInfo<Ref>& info) {
    delete info.GetParameter();
  }
  v8::Persistent<v8::Function> pers;
  WeakRefCounters* counters;
};

static void DummyFunction(const v8::FunctionCallbackInfo<v8::Value>& info) {}

// Entry point. Create a new function and track its life cycle.
static v8::Local<v8::Function>
NewFunction(v8::Isolate* isolate, WeakRefCounters* counters) {
  v8::Local<v8::Function> fn =
    v8::Function::New(isolate->GetCurrentContext(), DummyFunction)
      .ToLocalChecked();

  counters->inc_created();
  new Ref(isolate, fn, counters);

  return fn;
}
// ---8<------------------------------------------------------------------------

I wrote the following V8 test:

#include "src/execution/isolate.h"
#include "src/heap/factory.h"
#include "src/objects/name-inl.h"
#include "src/utils/ostreams.h"
#include "src/objects/objects.h"
#include "test/cctest/cctest.h"

namespace v8 {
namespace internal {

TEST(FunctionLeak) {
  WeakRefCounters counters = { 0, 0 };
  CcTest::InitializeVM();
  Isolate* isolate = CcTest::i_isolate();
  HandleScope scope(isolate);
  v8::Isolate* v8_isolate = reinterpret_cast<v8::Isolate*>(isolate);

  for (size_t i = 0;; i++) {
    HandleScope inner_scope(isolate);
    NewFunction(v8_isolate, &counters);
    if (!(i % 10000)) {
      CcTest::CollectGarbage(i::NEW_SPACE);
      CcTest::CollectAllGarbage();
    }
  }
}

}  // namespace internal
}  // namespace v8

I also wrote a corresponding Node.js native addon test:

#include <stdio.h>
#include <node.h>

static void FunctionFactory(const v8::FunctionCallbackInfo<v8::Value>& info) {
  info
    .GetReturnValue()
    .Set(NewFunction(
      info.GetIsolate(),
      static_cast<WeakRefCounters*>(info.Data().As<v8::External>()->Value())));
}

static void DeleteCount(void* data) {
 delete static_cast<WeakRefCounters*>(data);
}

NODE_MODULE_INIT() {
  v8::Isolate* isolate = context->GetIsolate();  
  WeakRefCounters* refcounts = new WeakRefCounters({0, 0});
  node::AddEnvironmentCleanupHook(isolate, DeleteCount, refcounts);
  exports->Set(context,
               v8::String::NewFromUtf8(isolate,
                                  "functionFactory",
                                  v8::NewStringType::kNormal).ToLocalChecked(),
               v8::Function::New(context,
                                 FunctionFactory,
                                 v8::External::New(isolate, refcounts))
                   .ToLocalChecked())
      .FromJust();
}
and drove it with the following script:

const leak = require('bindings')('leak');

function makeNewFunction() {
  leak.functionFactory();
}

while(true) {
  makeNewFunction();
}

I ran the V8 test with

./cctest --max-heap-size=8 test-function-leak/FunctionLeak

and the Node.js addon with

node --max-heap-size=8 ./index.js

The V8 test did not seem to leak, whereas the Node.js addon ran out of heap space very quickly.

Now, it's true that I ran the V8 test on its master, rather than on whatever version of V8 we have in Node.js master, so the leak may have been fixed. Alternatively, it may be that the settings with which we compile and launch V8 are causing it to leak. I'll try and figure out how to make sure that the version of V8 I'm building separately is exactly the same as the version of V8 in the Node.js master source tree.
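The created/collected bookkeeping used in the C++ tests above can be approximated from JavaScript with FinalizationRegistry (available since Node.js 14.6). This is only a rough analog of the v8::Persistent::SetWeak tracking, offered as an assumption-laden sketch rather than an equivalent test:

```javascript
// Count created vs. finalized functions; if the gap between the two
// grows without bound under memory pressure, the functions are not
// being collected. Finalization timing is GC-dependent.
let created = 0;
let collected = 0;
const registry = new FinalizationRegistry(() => { collected++; });

function makeTrackedFunction() {
  const fn = () => {};  // stand-in for a function produced by the addon
  created++;
  registry.register(fn, null);
  return fn;
}
```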

@gabrielschulhof (Contributor)

@mhdawson looks like this is a problem that has been fixed. Looking at deps/v8/ChangeLog, I found that the version in master is 7.7.299. When I checked out that version of V8 in its tree and re-ran the cctest, it also ran out of heap space. So it looks like this will be fixed when we upgrade past the fix in V8.


@mhdawson (Member)

@gabrielf which version of V8 will the change be in? We are already at 7.7 in 12.x, but I'm not sure 7.8 will be backported (@targos can you comment?), and because we upped the minimum OSX devtools level to accommodate later V8 versions in 13.x, my guess is that 7.9 and later most likely will not be backported.

We may need to float the fix.

@mhdawson (Member)

Talked to @gabrielschulhof, sounds like the fix is in 7.8 and @targos and @addaleax are going to try to land that in 12.x #29694 (comment). If that does not pan out we may need to look at floating the relevant fix.

@mhdawson (Member)

@targos is there any chance that 7.8 will go into 12.x as a SemVer minor, or do we need to look to see if we can float the change that @gabrielschulhof mentioned fixes this?

@targos (Member)

targos commented Oct 25, 2019

> is there any chance that 7.8 will go into 12.x as a SemVer minor

Maybe, if someone can help do the ABI-compat patch.

@targos (Member)

targos commented Oct 25, 2019

I opened #30109 to make it more visible

@gabrielschulhof (Contributor)

Current state:

Node.js branch | V8 version | good/bad
v10.x-staging  | 6.8.275    | bad (leak present)
v12.x-staging  | 7.8.279    | good
v14.x-staging  | 8.1.307    | good

Since v10.x is going into maintenance it's unlikely that a new major version of V8 will be backported to it. So, this memory leak will go away completely only when we drop support for v10.x.

@mhdawson (Member)

@gabrielschulhof I wonder if we should add a test at some point that validates this has not regressed, or do we think it should be covered by the testing on the V8 side?

@gabrielschulhof (Contributor)

@mhdawson IMO this is internal to V8.

@mhdawson (Member)

mhdawson commented May 4, 2020

@gabrielschulhof thanks.

@mhdawson (Member)

It looks like this is complete to me, since 10.x is out of service. I'm going to close the issue. If you feel that was not the right thing to do, please re-open.


9 participants