
Memory Leak in V8 CpuProfiler #23070

Closed
Flarna opened this issue Sep 24, 2018 · 26 comments
Labels
confirmed-bug Issues with confirmed bugs. memory Issues and PRs related to the memory management or memory footprint. v8 engine Issues and PRs related to the V8 dependency.

Comments

Flarna (Member) commented Sep 24, 2018:

  • Version: 8.12.0, 10.11.0, Master
  • Platform: Windows, Linux
  • Subsystem: V8

It seems there is a memory leak in the V8 CPU profiler. I have seen that RSS steadily increases while the V8 heap stays constant. I did a quick heap profiling run with MSVC and got the following stack frames, which seem to point to the leak:

v8::internal::Runtime_StackGuard
v8::internal::StackGuard::HandleInterrupts
v8::internal::OptimizingCompileDispatcher::InstallOptimizedFunctions
v8::internal::Compiler::FinalizeCompilationJob
v8::internal::`anonymous namespace'::FinalizeOptimizedCompilationJob
v8::internal::`anonymous namespace'::RecordFunctionCompilation
v8::internal::CodeEventDispatcher::CodeCreateEvent
v8::internal::ProfilerListener::RecordDeoptInlinedFrames
v8::internal::ProfilerListener::RecordInliningInfo
v8::internal::JITLineInfoTable::SetPosition
v8::internal::ProfilerListener::NewCodeEntry

Stopping profiling doesn't release the memory.
I also did a quick test with Node.js 10.11.0 and the issue does not seem to be present there.

I will continue my investigation, but any hint about which V8 fix would need to be backported to resolve this is welcome.
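
For context (not from the original report), this is roughly how RSS can be watched against V8's own heap numbers from inside the process, using only standard Node.js APIs; a steadily growing RSS with a flat V8 heap is what points at native allocations such as the profiler's:

// Minimal sketch (illustrative only): log RSS vs. V8 heap every 10 seconds.
const v8 = require("v8");

const MB = 1024 * 1024;
setInterval(() => {
  const mem = process.memoryUsage();    // rss/heapTotal/heapUsed in bytes
  const heap = v8.getHeapStatistics();  // V8's own view of its heap
  console.log(
    `rss=${(mem.rss / MB).toFixed(1)}MB ` +
      `heapTotal=${(mem.heapTotal / MB).toFixed(1)}MB ` +
      `usedHeap=${(heap.used_heap_size / MB).toFixed(1)}MB`
  );
}, 10000).unref();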

@addaleax addaleax added v8 engine Issues and PRs related to the V8 dependency. memory Issues and PRs related to the memory management or memory footprint. v8.x labels Sep 24, 2018
addaleax (Member) commented:

@nodejs/v8

ofrobots (Contributor) commented:

I think this is most likely going to be fixed by #21558.

psmarshall (Contributor) commented:

Yep, if the issue is not present in 10.11.0 then #21558 should fix it - it essentially backports all the CPU profiler changes that happened between V8 in Node 8 & 10. We did a lot of work on memory consumption of the CPU profiler so overall consumption should be much lower as well as fixing this leak 👍

Flarna (Member, Author) commented Sep 25, 2018:

I can't reproduce the leak I see in 8.12.0 with 10.11.0. I still see some increase in memory consumption with 10.11.0, but far less than with 8.12.0. I have to take a closer look first to find the cause - it could be something completely unrelated.

I also tried Node 10.9.0 today and expected behavior like in 8.12.0, but 10.9.0 doesn't seem to leak either. The Node 10 backport PR has only 4 commits whereas the one for Node 8 has 17, so maybe this was already fixed in Node 10 earlier.

Flarna (Member, Author) commented Sep 25, 2018:

Stopping profiling doesn't release the memory.

It seems this is not correct. Stopping the profiler actually releases the memory, but only if all profiling sessions are stopped. Usually we start the next session before stopping the current one to avoid the overhead of shutting the profiler down and starting it up again.

psmarshall (Contributor) commented:

Yep those fixes were in V8 already so they didn't need to be backported. The CPU profiler will increase memory usage in general - we need to copy a bunch of data out of the heap so that it can be used off-thread to process the samples. We reduced the additional memory overhead caused by the CPU profiler by about 75% recently - that's what we are backporting to Node 8 and what you hopefully already see in Node 10.

Flarna (Member, Author) commented Sep 25, 2018:

OK, thanks. Should the extra memory consumption of the CpuProfiler saturate at some point? In my current sample I see an increase of a few MB/hour.

psmarshall (Contributor) commented:

I think the CPU profiler memory usage should be somewhat proportional to V8 heap size. We store metadata for each function/code object, so if the V8 heap is still growing then this metadata will grow too. Can you still see that increase with 10.11.0?

Flarna (Member, Author) commented Sep 25, 2018:

Yes, I still see an increase with 10.11.0, but significantly lower than with 8.12.0. I just started a test session with the heap profiler attached to get more insight. Unfortunately my test setup has several other dependencies, so it will take a while to get concrete data.

Flarna (Member, Author) commented Sep 25, 2018:

I let my app run for a few hours using 10.11.0 and memory consumption increased continuously. For some reason the rate increased after around 3.5 h from around 10 MB/hour to ~20 MB/hour. In sum there are ~150 MB inside ProfilerListener.
The reported VM heap size is 76 MB, with 38 MB used (stable).

Callstack:

v8::internal::Runtime_StackGuard
v8::internal::StackGuard::HandleInterrupts
v8::internal::OptimizingCompileDispatcher::InstallOptimizedFunctions
v8::internal::`anonymous namespace'::FinalizeOptimizedCompilationJob
v8::internal::OptimizedCompilationJob::RecordFunctionCompilation
v8::internal::`anonymous namespace'::LogFunctionCompilation
v8::internal::CodeEventDispatcher::CodeCreateEvent
v8::internal::ProfilerListener::CodeCreateEvent
v8::internal::ProfilerListener::RecordInliningInfo

I also tested using the CpuProfiler in a different way, which seems to avoid the issue.

Problematic sequence:

  • allocate CpuProfiler
  • StartProfiling("sample-1")
  • after 1 minute StartProfiling("sample-2") and StopProfiling("sample-1")
  • after 1 minute StartProfiling("sample-1") and StopProfiling("sample-2")
  • ...

Working sequence:

  • allocate CpuProfiler
  • StartProfiling("sample")
  • after 1 minute StopProfiling("sample") and StartProfiling("sample")
  • after 1 minute StopProfiling("sample") and StartProfiling("sample")
  • ...

The disadvantage of the working sequence is that StopProfiling has to sync with the sampling thread (blocking the V8 main thread!), which can take up to one sampling interval, and quite some state is shut down just to be created again on the next Start call.
Using the overlapping approach avoided this.

Is the overlapping approach correct/allowed/supported?
Any other variant we could use for "continuous" sampling?
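
For illustration only (not part of the original comment), the two sequences map roughly onto the v8-profiler-node8 API that the repro later in this thread uses; the function names here are mine, not from the report:

const profiler = require("v8-profiler-node8");

// Problematic sequence: overlap the sessions so sampling never pauses.
function rotateOverlapping(current, next) {
  profiler.startProfiling(next, false);             // start the next session first
  const profile = profiler.stopProfiling(current);  // then stop the current one
  profile.delete();                                 // release the finished profile
}

// Working sequence: stop (and delete) the current session, then restart it.
function rotateSequential(title) {
  const profile = profiler.stopProfiling(title);    // syncs with the sampling thread
  profile.delete();
  profiler.startProfiling(title, false);
}

As the later comments show, whether profiles are deleted before the next startProfiling() call turns out to matter for the leak.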

@Flarna Flarna changed the title Memory Leak in V8 CpuProfiler High Memory consumption in V8 CpuProfiler Sep 25, 2018
Flarna (Member, Author) commented Sep 25, 2018:

@nodejs/diagnostics fyi as CpuProfiler is intended to be a Tier 1 diagnostics tool

psmarshall (Contributor) commented:

Is the overlapping approach correct/allowed/supported?
This sounds OK to me. But it depends on where the leak is coming from - possibly there is some conceptual issue with the design that causes this, or maybe it is just a mistake.

I'm trying to reproduce the leak with a small repro. Are you able to cut down the amount of actual server code you are running and still reproduce the issue? Any help you can give with a smaller/shareable repro would be really useful.

One other thing - are you calling Delete() on the profiles returned by StopProfiling()? I definitely see a leak if that is not done. I guess you are doing that, otherwise the leak would happen with non-overlapping profiles, too.

Flarna (Member, Author) commented Oct 1, 2018:

I was out for the last day; I will try today to get a reproducer I can share. I already tried with a small sample but failed. It seems the amount of code in use matters.

Flarna (Member, Author) commented Oct 1, 2018:

@psmarshall I created a private repo with a stripped-down version of the sample I used and added you as a collaborator. Please let me know if this is a working setup for you. Thanks!

psmarshall (Contributor) commented:

Thanks, I'll take a look.

psmarshall (Contributor) commented:

I've logged an upstream bug on the V8 tracker: https://bugs.chromium.org/p/v8/issues/detail?id=8253.

The root cause is explained there. As a workaround, deleting all profiles before starting a new profile will fix the leak - i.e. in the repro code, call .delete() on the profile returned by stopProfiling() before calling startProfiling().
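
In terms of the module used in the repro (assuming the same v8-profiler-node8 API), the workaround looks roughly like this; a sketch, not the exact repro code:

// Workaround sketch: delete every finished profile before the next session
// starts, so the profiler can release the data it retained for that profile.
const profile = profiler.stopProfiling("sample");
profile.delete();
profiler.startProfiling("sample", false);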

Flarna (Member, Author) commented Oct 2, 2018:

@psmarshall Thanks a lot!

Could some collaborator please update the labels, as it turned out that this is not just affecting v8.x? I already updated the top message.

@Flarna Flarna changed the title High Memory consumption in V8 CpuProfiler Memory Leak in V8 CpuProfiler Oct 2, 2018
psmarshall (Contributor) commented:

Thanks for the repro 👍 , it really helped me track it down. Yep, this affects v10.x and master. I'll land the fix on V8 master and then hopefully backport it to v8.x and 10.x.

@mmarchini mmarchini added confirmed-bug Issues with confirmed bugs. and removed v8.x labels Oct 2, 2018
Flarna (Member, Author) commented Oct 4, 2018:

I am now using the stopProfiling() - .delete() - startProfiling() variant, which should avoid the issue reported above, but in one of my other samples I still see high memory consumption.

It seems quite a lot of allocations happen during startProfiling(). The amount allocated appears to depend on the V8 heap size - if GC runs before startProfiling() I see fewer allocations.
I'm also not 100% sure whether all this memory is really kept across further profiling sessions, but I see that RSS stays high even after GC drops the V8 heap from ~320 MB to ~50 MB. RSS goes down by the V8 heap diff, but it stays much higher compared to never profiling, or to starting profiling only after this V8 heap drop.

@psmarshall Based on your comments above that some data from the heap needs to be copied, I'm not sure if this is expected - in particular that RSS stays high.

This happens with 8.12.0 and 10.11.0, but the amount of memory allocated is by far higher for 8.12.0 (I assume because of fixes already in 10.x and still pending for 8.x).

Call Stack:

v8::internal::ProfilerListener::RecordInliningInfo(v8::internal::CodeEntry * entry, v8::internal::AbstractCode * abstract_code)
v8::internal::ProfilerListener::CodeCreateEvent(v8::internal::CodeEventListener::LogEventsAndTags tag, v8::internal::AbstractCode * abstract_code, v8::internal::SharedFunctionInfo * shared, v8::internal::Name * script_name, int line, int column)
v8::internal::CodeEventDispatcher::CodeCreateEvent(v8::internal::CodeEventListener::LogEventsAndTags tag, v8::internal::AbstractCode * code, v8::internal::SharedFunctionInfo * shared, v8::internal::Name * source, int line, int column)
v8::internal::ExistingCodeLogger::LogExistingFunction(v8::internal::Handle<v8::internal::SharedFunctionInfo> shared, v8::internal::Handle<v8::internal::AbstractCode> code, v8::internal::CodeEventListener::LogEventsAndTags tag)
v8::internal::ExistingCodeLogger::LogCompiledFunctions()
v8::internal::CpuProfiler::StartProcessorIfNotStarted()
v8::internal::CpuProfiler::StartProfiling(const char * title, bool record_samples, v8::CpuProfilingMode mode)
v8::internal::CpuProfiler::StartProfiling(v8::internal::String * title, bool record_samples, v8::CpuProfilingMode mode)

Edit: the "sticky" RSS seems to happen only on Linux; on Windows it goes down. So I assume this is caused by the different heap implementations on these two OSes.
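
A possible mitigation following from the observation above (a sketch only, not from the report; it relies on the same --expose_gc / global.gc() mechanism the repro below uses) is to force a GC right before starting a session, so fewer dead functions are on the heap when the profiler logs existing compiled code:

// Run with: node --expose_gc app.js
// Sketch: trigger a full GC before starting a profiling session so fewer
// (dead) functions/code objects are on the heap when the profiler scans it.
function startProfilingAfterGc(profiler, title) {
  if (global.gc) {
    global.gc();
  }
  profiler.startProfiling(title, false);
}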

psmarshall (Contributor) commented:

@Flarna Interesting. Are you able to put together a basic repro that shows the behaviour? Then I'll log a V8 bug and look into it. I'll be travelling for the next week, so probably after that.

Flarna (Member, Author) commented Oct 8, 2018:

Hmm, I was sure that I posted a reproducer last week, but it seems something went wrong. I'm currently on sick leave with no access to it, so it will take a bit.

Flarna (Member, Author) commented Oct 12, 2018:

I was able to create a small reproducer which shows this high RSS peak during CpuProfiler start and the sticky RSS on Linux.
Please note that even though this sample looks weird, it's based on my assumption that the cause is the high number of functions on the JS heap; my initial work was done on a (non-shareable) real-world application having >200000 functions on the heap.

Run the following with node --expose_gc <script>:

"use strict"

if (!global.gc) {
  console.log("global.gc is not exposed!");
  process.exit(-1);
}

// ===================================================
const profiler = require("v8-profiler-node8");

const cProfInterval = 1000 * 5;
const cProfileName = "My-Profile";

profiler.setSamplingInterval(5000);
profiler.startProfiling(cProfileName, false);

// Rotate the profiling session: stop the current one, delete the returned profile, then restart.
setInterval(() => {
  console.log("restart profiling");
  const p = profiler.stopProfiling(cProfileName);
  p.delete(cProfileName);
  profiler.startProfiling(cProfileName, false);
}, cProfInterval);

// ===================================================
// Create a large number of distinct functions to inflate the amount of
// code/function metadata the profiler has to copy when it starts.
let a = [];
for (let i = 0; i < 1000000; i++) {
  const f = new Function("a", "b", "return a + b");
  f(i, i - 1);
  a.push(f);
}
console.log("function creation done");

// Drop the functions and force a GC so the V8 heap shrinks; watch whether RSS follows.
setTimeout(() => {
  console.log("gc unused functions...");
  a = undefined;
  global.gc();
}, cProfInterval * 2.5);

setTimeout(() => process.exit(), cProfInterval * 5);

setInterval(() => {
  const bToMb = 1024 * 1024;
  const m = process.memoryUsage();
  console.log(`RSS: ${m.rss / bToMb}, heapTotal: ${m.heapTotal / bToMb}`);
}, 1000);

Results for process.memoryUsage().rss on my system:
Windows, Node 8.12.0: 140 MB => 1.1 GB => 41 MB
Windows, Node 10.12.0: 132 MB => 342 MB => 27 MB
Linux, Node 8.12.0: 148 MB => 1.1 GB => 1 GB
Linux, Node 10.12.0: 147 MB => 405 MB => 301 MB

It can be seen that the shrink of the V8 heap also results in an RSS shrink on both Windows and Linux, but the RSS peak is sticky on Linux. I assume this is because the V8 heap allocates/releases full pages, whereas the other allocations are done via new on the C heap, and for some reason Linux is not able to shrink this - maybe fragmentation. It stays high even if I put memory pressure on my machine until it has to swap.

To me it looks like the CpuProfiler allocates a lot in v8::internal::ProfilerListener::RecordInliningInfo() and releases most of it in the profiler thread. Maybe the transfer of the needed function info to the profiler thread could be improved.

psmarshall (Contributor) commented:

@Flarna Thanks, I will take a look.

psmarshall (Contributor) commented:

Upstream v8 issue: https://bugs.chromium.org/p/v8/issues/detail?id=8364

Flarna (Member, Author) commented Dec 13, 2018:

Some update here from the Chromium tracker:

Edit: Update with backport to NodeJs v8.

psmarshall (Contributor) commented:

Thanks for reminding me, here is the backport for Node 8: #25041
