src: add --heapsnapshot-near-heap-limit option
This patch adds a --heapsnapshot-near-heap-limit CLI option
that takes heap snapshots when the V8 heap is approaching
the heap size limit. It will try to write the snapshots
to disk before the program crashes due to OOM.

PR-URL: #33010
Refs: #27552
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
joyeecheung authored and MylesBorins committed Aug 31, 2021
1 parent 6544cfb commit 6d06ac2
Showing 20 changed files with 496 additions and 21 deletions.
47 changes: 47 additions & 0 deletions doc/api/cli.md
@@ -337,6 +337,52 @@ reference. Code may break under this flag.
`--require` runs prior to freezing intrinsics in order to allow polyfills to
be added.

### `--heapsnapshot-near-heap-limit=max_count`
<!-- YAML
added: REPLACEME
-->

> Stability: 1 - Experimental

Writes a V8 heap snapshot to disk when the V8 heap usage is approaching the
heap limit. `max_count` should be a non-negative integer; Node.js will write
no more than `max_count` snapshots to disk.

When generating snapshots, garbage collection may be triggered and bring
the heap usage down. Therefore, multiple snapshots may be written to disk
before the Node.js instance finally runs out of memory. These heap snapshots
can be compared to determine what objects are being allocated between
consecutive snapshots. It's not guaranteed that Node.js will write exactly
`max_count` snapshots to disk, but when `max_count` is greater than `0` it
will try its best to generate at least one and up to `max_count` snapshots
before the Node.js instance runs out of memory.

Generating V8 snapshots takes time and memory (both memory managed by the
V8 heap and native memory outside the V8 heap). The bigger the heap is,
the more resources it needs. Node.js will adjust the V8 heap to accommodate
the additional V8 heap memory overhead, and try its best to avoid using up
all the memory available to the process. When the process uses
more memory than the system deems appropriate, the process may be terminated
abruptly by the system, depending on the system configuration.

```console
$ node --max-old-space-size=100 --heapsnapshot-near-heap-limit=3 index.js
Wrote snapshot to Heap.20200430.100036.49580.0.001.heapsnapshot
Wrote snapshot to Heap.20200430.100037.49580.0.002.heapsnapshot
Wrote snapshot to Heap.20200430.100038.49580.0.003.heapsnapshot

<--- Last few GCs --->

[49580:0x110000000] 4826 ms: Mark-sweep 130.6 (147.8) -> 130.5 (147.8) MB, 27.4 / 0.0 ms (average mu = 0.126, current mu = 0.034) allocation failure scavenge might not succeed
[49580:0x110000000] 4845 ms: Mark-sweep 130.6 (147.8) -> 130.6 (147.8) MB, 18.8 / 0.0 ms (average mu = 0.088, current mu = 0.031) allocation failure scavenge might not succeed


<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
....
```

### `--heapsnapshot-signal=signal`
<!-- YAML
added: v12.0.0
@@ -1285,6 +1331,7 @@ Node.js options that are allowed are:
* `--force-context-aware`
* `--force-fips`
* `--frozen-intrinsics`
* `--heapsnapshot-near-heap-limit`
* `--heapsnapshot-signal`
* `--http-parser`
* `--icu-data-dir`
4 changes: 4 additions & 0 deletions doc/node.1
@@ -185,6 +185,10 @@ Same requirements as
.It Fl -frozen-intrinsics
Enable experimental frozen intrinsics support.
.
.It Fl -heapsnapshot-near-heap-limit Ns = Ns Ar max_count
Generate heap snapshot when the V8 heap usage is approaching the heap limit.
No more than the specified number of snapshots will be generated.
.
.It Fl -heapsnapshot-signal Ns = Ns Ar signal
Generate heap snapshot on specified signal.
.
1 change: 1 addition & 0 deletions src/debug_utils.h
@@ -41,6 +41,7 @@ void FWrite(FILE* file, const std::string& str);
// from a provider type to a debug category.
#define DEBUG_CATEGORY_NAMES(V) \
  NODE_ASYNC_PROVIDER_TYPES(V) \
  V(DIAGNOSTICS) \
  V(HUGEPAGES) \
  V(INSPECTOR_SERVER) \
  V(INSPECTOR_PROFILER) \
16 changes: 16 additions & 0 deletions src/env-inl.h
@@ -704,6 +704,22 @@ inline const std::string& Environment::exec_path() const {
  return exec_path_;
}

inline std::string Environment::GetCwd() {
  char cwd[PATH_MAX_BYTES];
  size_t size = PATH_MAX_BYTES;
  const int err = uv_cwd(cwd, &size);

  if (err == 0) {
    CHECK_GT(size, 0);
    return cwd;
  }

  // This can fail if the cwd is deleted. In that case, fall back to
  // exec_path.
  const std::string& exec_path = exec_path_;
  return exec_path.substr(0, exec_path.find_last_of(kPathSeparator));
}
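
The fallback branch of `GetCwd()` depends on how `std::string::substr` and `find_last_of` behave together. The following is a minimal standalone sketch of that string step only. `DirnameOf` and its `separator` parameter are hypothetical names used for illustration (the patch works on `exec_path_` and `kPathSeparator` directly), and no libuv call is made.

```cpp
#include <string>

// Hypothetical helper, not part of the patch: returns everything before the
// last separator, mirroring the exec_path fallback in Environment::GetCwd().
std::string DirnameOf(const std::string& path, char separator) {
  // find_last_of() returns std::string::npos when no separator is present;
  // substr(0, npos) then yields the whole string unchanged.
  return path.substr(0, path.find_last_of(separator));
}
```

Under these assumptions, `DirnameOf("/usr/local/bin/node", '/')` yields `/usr/local/bin`, while a path that contains no separator is returned unchanged, so the fallback would be a file name rather than a directory in that case.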

#if HAVE_INSPECTOR
inline void Environment::set_coverage_directory(const char* dir) {
  coverage_directory_ = std::string(dir);
146 changes: 146 additions & 0 deletions src/env.cc
@@ -3,6 +3,7 @@
#include "async_wrap.h"
#include "base_object-inl.h"
#include "debug_utils-inl.h"
#include "diagnosticfilename-inl.h"
#include "memory_tracker-inl.h"
#include "node_buffer.h"
#include "node_context_data.h"
@@ -22,6 +23,7 @@
#include <algorithm>
#include <atomic>
#include <cstdio>
#include <limits>
#include <memory>

namespace node {
@@ -465,6 +467,11 @@ Environment::~Environment() {
  // FreeEnvironment() should have set this.
  CHECK(is_stopping());

  if (options_->heap_snapshot_near_heap_limit > heap_limit_snapshot_taken_) {
    isolate_->RemoveNearHeapLimitCallback(Environment::NearHeapLimitCallback,
                                          0);
  }

  isolate()->GetHeapProfiler()->RemoveBuildEmbedderGraphCallback(
      BuildEmbedderGraph, this);

@@ -1097,6 +1104,25 @@ void Environment::VerifyNoStrongBaseObjects() {
  });
}

uint64_t GuessMemoryAvailableToTheProcess() {
  uint64_t free_in_system = uv_get_free_memory();
  size_t allowed = uv_get_constrained_memory();
  if (allowed == 0) {
    return free_in_system;
  }
  size_t rss;
  int err = uv_resident_set_memory(&rss);
  if (err) {
    return free_in_system;
  }
  if (allowed < rss) {
    // Something is probably wrong. Fall back to the free memory.
    return free_in_system;
  }
  // There may still be room for swap, but we will just leave it here.
  return allowed - rss;
}
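
The decision order in `GuessMemoryAvailableToTheProcess()` can be read as a pure function of three numbers. The sketch below restates it without libuv so it can be run in isolation. `EstimateAvailable` and its parameters are hypothetical: they stand in for the values returned by `uv_get_free_memory()`, `uv_get_constrained_memory()` and `uv_resident_set_memory()`, with `rss_ok` modelling whether the RSS query succeeded.

```cpp
#include <cstdint>

// Hypothetical pure restatement of the estimate; not part of the patch.
uint64_t EstimateAvailable(uint64_t free_in_system,
                           uint64_t constrained,
                           uint64_t rss,
                           bool rss_ok) {
  // The patch treats 0 as "no constraint known" and uses system free memory.
  if (constrained == 0) return free_in_system;
  // If the resident set size could not be read, fall back as well.
  if (!rss_ok) return free_in_system;
  // A limit below the current RSS is inconsistent; fall back again.
  if (constrained < rss) return free_in_system;
  // Otherwise the headroom is whatever is left under the constraint.
  return constrained - rss;
}
```

With a constraint of 512 units and an RSS of 200, the estimate is 312 regardless of how much memory is free system-wide. Every other branch returns the system-wide free memory.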

void Environment::BuildEmbedderGraph(Isolate* isolate,
                                     EmbedderGraph* graph,
                                     void* data) {
@@ -1109,6 +1135,126 @@ void Environment::BuildEmbedderGraph(Isolate* isolate,
  });
}

size_t Environment::NearHeapLimitCallback(void* data,
                                          size_t current_heap_limit,
                                          size_t initial_heap_limit) {
  Environment* env = static_cast<Environment*>(data);

  Debug(env,
        DebugCategory::DIAGNOSTICS,
        "Invoked NearHeapLimitCallback, processing=%d, "
        "current_limit=%" PRIu64 ", "
        "initial_limit=%" PRIu64 "\n",
        env->is_processing_heap_limit_callback_,
        static_cast<uint64_t>(current_heap_limit),
        static_cast<uint64_t>(initial_heap_limit));

  size_t max_young_gen_size = env->isolate_data()->max_young_gen_size;
  size_t young_gen_size = 0;
  size_t old_gen_size = 0;

  v8::HeapSpaceStatistics stats;
  size_t num_heap_spaces = env->isolate()->NumberOfHeapSpaces();
  for (size_t i = 0; i < num_heap_spaces; ++i) {
    env->isolate()->GetHeapSpaceStatistics(&stats, i);
    if (strcmp(stats.space_name(), "new_space") == 0 ||
        strcmp(stats.space_name(), "new_large_object_space") == 0) {
      young_gen_size += stats.space_used_size();
    } else {
      old_gen_size += stats.space_used_size();
    }
  }

  Debug(env,
        DebugCategory::DIAGNOSTICS,
        "max_young_gen_size=%" PRIu64 ", "
        "young_gen_size=%" PRIu64 ", "
        "old_gen_size=%" PRIu64 ", "
        "total_size=%" PRIu64 "\n",
        static_cast<uint64_t>(max_young_gen_size),
        static_cast<uint64_t>(young_gen_size),
        static_cast<uint64_t>(old_gen_size),
        static_cast<uint64_t>(young_gen_size + old_gen_size));

  uint64_t available = GuessMemoryAvailableToTheProcess();
  // TODO(joyeecheung): get a better estimate about the native memory
  // usage into the overhead, e.g. based on the count of objects.
  uint64_t estimated_overhead = max_young_gen_size;
  Debug(env,
        DebugCategory::DIAGNOSTICS,
        "Estimated available memory=%" PRIu64 ", "
        "estimated overhead=%" PRIu64 "\n",
        static_cast<uint64_t>(available),
        static_cast<uint64_t>(estimated_overhead));

  // This might be hit when the snapshot is being taken in another
  // NearHeapLimitCallback invocation.
  // When taking the snapshot, objects in the young generation may be
  // promoted to the old generation, resulting in increased heap usage,
  // but it should be no more than the young generation size.
  // Ideally, this should be as small as possible - the heap limit
  // can only be restored when the heap usage falls down below the
  // new limit, so in a heap with unbounded growth the isolate
  // may eventually crash with this new limit - effectively raising
  // the heap limit to the new one.
  if (env->is_processing_heap_limit_callback_) {
    size_t new_limit = initial_heap_limit + max_young_gen_size;
    Debug(env,
          DebugCategory::DIAGNOSTICS,
          "Not generating snapshots in nested callback. "
          "new_limit=%" PRIu64 "\n",
          static_cast<uint64_t>(new_limit));
    return new_limit;
  }

  // Estimate whether the snapshot is going to use up all the memory
  // available to the process. If so, just give up to prevent the system
  // from killing the process for a system OOM.
  if (estimated_overhead > available) {
    Debug(env,
          DebugCategory::DIAGNOSTICS,
          "Not generating snapshots because it's too risky.\n");
    env->isolate()->RemoveNearHeapLimitCallback(NearHeapLimitCallback,
                                                initial_heap_limit);
    return current_heap_limit;
  }

  // Take the snapshot synchronously.
  env->is_processing_heap_limit_callback_ = true;

  std::string dir = env->options()->diagnostic_dir;
  if (dir.empty()) {
    dir = env->GetCwd();
  }
  DiagnosticFilename name(env, "Heap", "heapsnapshot");
  std::string filename = dir + kPathSeparator + (*name);

  Debug(env, DebugCategory::DIAGNOSTICS, "Start generating %s...\n", *name);

  // Remove the callback first in case it's triggered when generating
  // the snapshot.
  env->isolate()->RemoveNearHeapLimitCallback(NearHeapLimitCallback,
                                              initial_heap_limit);

  heap::WriteSnapshot(env->isolate(), filename.c_str());
  env->heap_limit_snapshot_taken_ += 1;

  // Don't take more snapshots than the number specified by
  // --heapsnapshot-near-heap-limit.
  if (env->heap_limit_snapshot_taken_ <
      env->options_->heap_snapshot_near_heap_limit) {
    env->isolate()->AddNearHeapLimitCallback(NearHeapLimitCallback, env);
  }

  FPrintF(stderr, "Wrote snapshot to %s\n", filename.c_str());
  // Tell V8 to reset the heap limit once the heap usage falls down to
  // 95% of the initial limit.
  env->isolate()->AutomaticallyRestoreInitialHeapLimit(0.95);

  env->is_processing_heap_limit_callback_ = false;
  return initial_heap_limit;
}
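
The control flow of `NearHeapLimitCallback` has three outcomes: a nested invocation raises the limit, a risky snapshot is abandoned, and otherwise a snapshot is taken and the callback is re-armed only while the count is below the configured maximum. The sketch below models that flow without V8 so the branches can be exercised directly. `HeapLimitState`, `Action`, `Decision` and `OnNearHeapLimit` are all hypothetical names, the snapshot write is only a placeholder comment, and `callback_registered` is an assumed flag standing in for the `AddNearHeapLimitCallback` / `RemoveNearHeapLimitCallback` calls.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the Environment fields the callback consults.
struct HeapLimitState {
  bool processing = false;          // is_processing_heap_limit_callback_
  int64_t snapshots_taken = 0;      // heap_limit_snapshot_taken_
  int64_t max_snapshots = 0;        // --heapsnapshot-near-heap-limit value
  bool callback_registered = true;  // models Add/RemoveNearHeapLimitCallback
};

enum class Action { kRaiseLimitNested, kGiveUp, kSnapshotTaken };

struct Decision {
  Action action;
  size_t new_limit;  // value the callback would return to V8
};

Decision OnNearHeapLimit(HeapLimitState& state,
                         size_t current_limit,
                         size_t initial_limit,
                         size_t max_young_gen_size,
                         uint64_t available) {
  // Nested invocation while a snapshot is in progress: only raise the limit
  // by the young generation size, which bounds promotion during the snapshot.
  if (state.processing) {
    return {Action::kRaiseLimitNested, initial_limit + max_young_gen_size};
  }
  // The patch uses max_young_gen_size as the estimated overhead.
  if (max_young_gen_size > available) {
    state.callback_registered = false;
    return {Action::kGiveUp, current_limit};
  }
  state.processing = true;
  state.callback_registered = false;  // removed before the snapshot is written
  // (the real callback writes the heap snapshot here)
  state.snapshots_taken += 1;
  if (state.snapshots_taken < state.max_snapshots) {
    state.callback_registered = true;  // re-arm for the next near-limit event
  }
  state.processing = false;
  return {Action::kSnapshotTaken, initial_limit};
}
```

In this model, with `max_snapshots` set to 2, the first call leaves the callback registered and the second call leaves it unregistered, which matches the "no more than `max_count`" behaviour described in the documentation hunk.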

inline size_t Environment::SelfSize() const {
  size_t size = sizeof(*this);
  // Remove non pointer fields that will be tracked in MemoryInfo()
10 changes: 10 additions & 0 deletions src/env.h
@@ -537,6 +537,7 @@ class IsolateData : public MemoryRetainer {
#undef VP
  inline v8::Local<v8::String> async_wrap_provider(int index) const;

  size_t max_young_gen_size = 1;
  std::unordered_map<const char*, v8::Eternal<v8::String>> static_str_map;

  inline v8::Isolate* isolate() const;
@@ -857,6 +858,9 @@ class Environment : public MemoryRetainer {
  void VerifyNoStrongBaseObjects();
  // Should be called before InitializeInspector()
  void InitializeDiagnostics();

  std::string GetCwd();

#if HAVE_INSPECTOR
  // If the environment is created for a worker, pass parent_handle and
  // the ownership is transferred into the Environment.
@@ -1220,6 +1224,9 @@ class Environment : public MemoryRetainer {
  inline void RemoveCleanupHook(CleanupCallback cb, void* arg);
  void RunCleanup();

  static size_t NearHeapLimitCallback(void* data,
                                      size_t current_heap_limit,
                                      size_t initial_heap_limit);
  static void BuildEmbedderGraph(v8::Isolate* isolate,
                                 v8::EmbedderGraph* graph,
                                 void* data);
@@ -1340,6 +1347,9 @@ class Environment : public MemoryRetainer {
  std::vector<std::string> argv_;
  std::string exec_path_;

  bool is_processing_heap_limit_callback_ = false;
  int64_t heap_limit_snapshot_taken_ = 0;

  uint32_t module_id_counter_ = 0;
  uint32_t script_id_counter_ = 0;
  uint32_t function_id_counter_ = 0;
6 changes: 3 additions & 3 deletions src/heap_utils.cc
@@ -313,7 +313,9 @@ inline void TakeSnapshot(Isolate* isolate, v8::OutputStream* out) {
   snapshot->Serialize(out, HeapSnapshot::kJSON);
 }

-inline bool WriteSnapshot(Isolate* isolate, const char* filename) {
+}  // namespace
+
+bool WriteSnapshot(Isolate* isolate, const char* filename) {
   FILE* fp = fopen(filename, "w");
   if (fp == nullptr)
     return false;
@@ -323,8 +325,6 @@ inline bool WriteSnapshot(Isolate* isolate, const char* filename) {
   return true;
 }

-}  // namespace
-
 void DeleteHeapSnapshot(const HeapSnapshot* snapshot) {
   const_cast<HeapSnapshot*>(snapshot)->Delete();
 }
20 changes: 2 additions & 18 deletions src/inspector_profiler.cc
@@ -418,22 +418,6 @@ static void EndStartedProfilers(Environment* env) {
   }
 }

-std::string GetCwd(Environment* env) {
-  char cwd[PATH_MAX_BYTES];
-  size_t size = PATH_MAX_BYTES;
-  const int err = uv_cwd(cwd, &size);
-
-  if (err == 0) {
-    CHECK_GT(size, 0);
-    return cwd;
-  }
-
-  // This can fail if the cwd is deleted. In that case, fall back to
-  // exec_path.
-  const std::string& exec_path = env->exec_path();
-  return exec_path.substr(0, exec_path.find_last_of(kPathSeparator));
-}
-
 void StartProfilers(Environment* env) {
   AtExit(env, [](void* env) {
     EndStartedProfilers(static_cast<Environment*>(env));
@@ -451,7 +435,7 @@ void StartProfilers(Environment* env) {
   if (env->options()->cpu_prof) {
     const std::string& dir = env->options()->cpu_prof_dir;
     env->set_cpu_prof_interval(env->options()->cpu_prof_interval);
-    env->set_cpu_prof_dir(dir.empty() ? GetCwd(env) : dir);
+    env->set_cpu_prof_dir(dir.empty() ? env->GetCwd() : dir);
     if (env->options()->cpu_prof_name.empty()) {
       DiagnosticFilename filename(env, "CPU", "cpuprofile");
       env->set_cpu_prof_name(*filename);
@@ -466,7 +450,7 @@ void StartProfilers(Environment* env) {
   if (env->options()->heap_prof) {
     const std::string& dir = env->options()->heap_prof_dir;
     env->set_heap_prof_interval(env->options()->heap_prof_interval);
-    env->set_heap_prof_dir(dir.empty() ? GetCwd(env) : dir);
+    env->set_heap_prof_dir(dir.empty() ? env->GetCwd() : dir);
     if (env->options()->heap_prof_name.empty()) {
       DiagnosticFilename filename(env, "Heap", "heapprofile");
       env->set_heap_prof_name(*filename);
4 changes: 4 additions & 0 deletions src/node.cc
@@ -267,6 +267,10 @@ static void AtomicsWaitCallback(Isolate::AtomicsWaitEvent event,
void Environment::InitializeDiagnostics() {
  isolate_->GetHeapProfiler()->AddBuildEmbedderGraphCallback(
      Environment::BuildEmbedderGraph, this);
  if (options_->heap_snapshot_near_heap_limit > 0) {
    isolate_->AddNearHeapLimitCallback(Environment::NearHeapLimitCallback,
                                       this);
  }
  if (options_->trace_uncaught)
    isolate_->SetCaptureStackTraceForUncaughtExceptions(true);
  if (options_->trace_atomics_wait) {
