bootstrap: implement run-time user-land snapshots via --build-snapshot and --snapshot-blob #38905

Closed
wants to merge 6 commits into from
76 changes: 76 additions & 0 deletions doc/api/cli.md
@@ -100,6 +100,62 @@ If this flag is passed, the behavior can still be set to not abort through
[`process.setUncaughtExceptionCaptureCallback()`][] (and through usage of the
`node:domain` module that uses it).

### `--build-snapshot`

<!-- YAML
added: REPLACEME
-->

> Stability: 1 - Experimental

Generates a snapshot blob when the process exits and writes it to
disk, which can be loaded later with `--snapshot-blob`.

When building the snapshot, if `--snapshot-blob` is not specified,
the generated blob will be written, by default, to `snapshot.blob`
in the current working directory. Otherwise it will be written to
the path specified by `--snapshot-blob`.

```console
$ echo "globalThis.foo = 'I am from the snapshot'" > snapshot.js

# Run snapshot.js to initialize the application and snapshot its
# state into snapshot.blob.
$ node --snapshot-blob snapshot.blob --build-snapshot snapshot.js

$ echo "console.log(globalThis.foo)" > index.js

# Load the generated snapshot and start the application from index.js.
$ node --snapshot-blob snapshot.blob index.js
I am from the snapshot
```

The [`v8.startupSnapshot` API][] can be used to specify an entry point at
snapshot building time, which avoids the need for an additional entry
script at deserialization time:

```console
$ echo "require('v8').startupSnapshot.setDeserializeMainFunction(() => console.log('I am from the snapshot'))" > snapshot.js
$ node --snapshot-blob snapshot.blob --build-snapshot snapshot.js
$ node --snapshot-blob snapshot.blob
I am from the snapshot
```

For more information, check out the [`v8.startupSnapshot` API][] documentation.

Support for run-time snapshots is currently experimental in that:

1. User-land modules are not yet supported in the snapshot, so only
   a single file can be snapshotted. Users can, however, bundle their
   applications into a single script with their bundler of choice before
   building a snapshot.
2. Only a subset of the built-in modules work in the snapshot, though the
   Node.js core test suite checks that a few fairly complex applications
   can be snapshotted. Support for more modules is being added. If any
   crashes or buggy behaviors occur when building a snapshot, please file
   a report in the [Node.js issue tracker][] and link to it in the
   [tracking issue for user-land snapshots][].
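
Because only a single script can be snapshotted for now, it may help to keep
the snapshot hooks and the regular entry point in the same file. The following
is a minimal sketch of that idea, not a recommended layout: `state` and `main`
are hypothetical names, and the script assumes `v8.startupSnapshot` may be
missing on older releases, in which case it simply runs `main()` directly.

```js
'use strict';
const v8 = require('v8');

// `v8.startupSnapshot` is absent on Node.js releases that predate this
// feature, so treat that case as "not building a snapshot".
const snapshot = v8.startupSnapshot;
const building = snapshot !== undefined && snapshot.isBuildingSnapshot();

// Hypothetical application state that ends up in the snapshot.
const state = { greeting: 'I am from the snapshot', resources: [] };

// Hypothetical entry point, shared by both execution modes.
function main() {
  console.log(state.greeting);
  return state.greeting;
}

if (building) {
  // Called before the heap is serialized: drop anything that should
  // not be captured in the blob.
  snapshot.addSerializeCallback(() => {
    state.resources.length = 0;
  });
  // Called when the application is restored from the blob.
  snapshot.addDeserializeCallback(() => {
    state.resources.push('reacquired at startup');
  });
  // Registers the function to run when the blob is started without
  // an additional entry script.
  snapshot.setDeserializeMainFunction(main);
} else {
  // Plain `node script.js` run: no snapshot involved.
  main();
}
```

Run with `--build-snapshot`, a script like this would register the hooks and
exit; run without it, the script behaves like an ordinary program.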

### `--completion-bash`

<!-- YAML
@@ -1121,6 +1177,22 @@ minimum allocation from the secure heap. The minimum value is `2`.
The maximum value is the lesser of `--secure-heap` or `2147483647`.
The value given must be a power of two.

### `--snapshot-blob=path`

<!-- YAML
added: REPLACEME
-->

> Stability: 1 - Experimental

When used with `--build-snapshot`, `--snapshot-blob` specifies the path
where the generated snapshot blob will be written to. If not specified,
the generated blob will be written, by default, to `snapshot.blob`
in the current working directory.

When used without `--build-snapshot`, `--snapshot-blob` specifies the
path to the blob that will be used to restore the application state.
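
As a rough illustration of the two modes described above, the following sketch
builds the argument list for each one. `snapshotArgs` and its option names are
hypothetical and not part of Node.js. The sketch assumes, as in the examples
for `--build-snapshot`, that building a snapshot takes an entry script.

```js
'use strict';

// Hypothetical helper: returns the arguments to pass to the `node` binary.
// With `build: true` the blob path is where the snapshot is written;
// otherwise it is the blob the application state is restored from.
function snapshotArgs({ blob = 'snapshot.blob', build = false, entry } = {}) {
  const args = ['--snapshot-blob', blob];
  if (build) {
    if (entry === undefined) {
      // Assumption of this sketch: a build always has a script to run.
      throw new TypeError('building a snapshot requires an entry script');
    }
    args.push('--build-snapshot', entry);
  } else if (entry !== undefined) {
    // Optional script to run on top of the restored state.
    args.push(entry);
  }
  return args;
}
```

The resulting array could be passed to, for example,
`child_process.execFileSync(process.execPath, snapshotArgs({ build: true, entry: 'snapshot.js' }))`.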

### `--test`

<!-- YAML
@@ -1735,6 +1807,7 @@ Node.js options that are allowed are:
* `--require`, `-r`
* `--secure-heap-min`
* `--secure-heap`
* `--snapshot-blob`
* `--test-only`
* `--throw-deprecation`
* `--title`
@@ -2109,6 +2182,7 @@ done
[ECMAScript module loader]: esm.md#loaders
[Fetch API]: https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API
[Modules loaders]: packages.md#modules-loaders
[Node.js issue tracker]: https://github.com/nodejs/node/issues
[OSSL_PROVIDER-legacy]: https://www.openssl.org/docs/man3.0/man7/OSSL_PROVIDER-legacy.html
[REPL]: repl.md
[ScriptCoverage]: https://chromedevtools.github.io/devtools-protocol/tot/Profiler#type-ScriptCoverage
@@ -2141,6 +2215,7 @@ done
[`tls.DEFAULT_MAX_VERSION`]: tls.md#tlsdefault_max_version
[`tls.DEFAULT_MIN_VERSION`]: tls.md#tlsdefault_min_version
[`unhandledRejection`]: process.md#event-unhandledrejection
[`v8.startupSnapshot` API]: v8.md#startup-snapshot-api
[`worker_threads.threadId`]: worker_threads.md#workerthreadid
[conditional exports]: packages.md#conditional-exports
[context-aware]: addons.md#context-aware-addons
@@ -2156,4 +2231,5 @@ done
[security warning]: #warning-binding-inspector-to-a-public-ipport-combination-is-insecure
[semi-space]: https://www.memorymanagement.org/glossary/s.html#semi.space
[timezone IDs]: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
[tracking issue for user-land snapshots]: https://github.com/nodejs/node/issues/44014
[ways that `TZ` is handled in other environments]: https://www.gnu.org/software/libc/manual/html_node/TZ-Variable.html
30 changes: 27 additions & 3 deletions lib/internal/bootstrap/pre_execution.js
@@ -28,6 +28,11 @@ const {
} = require('internal/errors').codes;
const assert = require('internal/assert');

const {
addSerializeCallback,
isBuildingSnapshot,
} = require('v8').startupSnapshot;

function prepareMainThreadExecution(expandArgv1 = false,
initialzeModules = true) {
refreshRuntimeOptions();
@@ -169,11 +174,21 @@ function addReadOnlyProcessAlias(name, option, enumerable = true) {

function setupWarningHandler() {
const {
- onWarning
onWarning,
resetForSerialization
} = require('internal/process/warning');
if (getOptionValue('--warnings') &&
process.env.NODE_NO_WARNINGS !== '1') {
process.on('warning', onWarning);

// The code above would add the listener back during deserialization,
// if applicable.
if (isBuildingSnapshot()) {
addSerializeCallback(() => {
process.removeListener('warning', onWarning);
resetForSerialization();
});
}
}
}

@@ -327,9 +342,18 @@ function initializeHeapSnapshotSignalHandlers() {
require('internal/validators').validateSignalName(signal);
const { writeHeapSnapshot } = require('v8');

- process.on(signal, () => {
function doWriteHeapSnapshot() {
writeHeapSnapshot();
- });
}
process.on(signal, doWriteHeapSnapshot);

// The code above would add the listener back during deserialization,
// if applicable.
if (isBuildingSnapshot()) {
addSerializeCallback(() => {
process.removeListener(signal, doWriteHeapSnapshot);
});
}
}

function setupTraceCategoryState() {
28 changes: 20 additions & 8 deletions lib/internal/process/warning.js
@@ -22,6 +22,16 @@ let fs;
let fd;
let warningFile;
let options;
let traceWarningHelperShown = false;

function resetForSerialization() {
if (fd !== undefined) {
process.removeListener('exit', closeFdOnExit);
}
fd = undefined;
warningFile = undefined;
traceWarningHelperShown = false;
}

function lazyOption() {
// This will load `warningFile` only once. If the flag is not set,
@@ -50,6 +60,14 @@ function writeOut(message) {
error(message);
}

function closeFdOnExit() {
try {
fs.closeSync(fd);
} catch {
// Continue regardless of error.
}
}

function writeToFile(message) {
if (fd === undefined) {
fs = require('fs');
@@ -58,13 +76,7 @@ function writeToFile(message) {
} catch {
return writeOut(message);
}
- process.on('exit', () => {
- try {
- fs.closeSync(fd);
- } catch {
- // Continue regardless of error.
- }
- });
process.on('exit', closeFdOnExit);
}
fs.appendFile(fd, `${message}\n`, (err) => {
if (err) {
@@ -77,7 +89,6 @@ function doEmitWarning(warning) {
process.emit('warning', warning);
}

- let traceWarningHelperShown = false;
function onWarning(warning) {
if (!(warning instanceof Error)) return;
const isDeprecation = warning.name === 'DeprecationWarning';
@@ -179,4 +190,5 @@ module.exports = {
emitWarning,
emitWarningSync,
onWarning,
resetForSerialization,
};
19 changes: 16 additions & 3 deletions lib/net.js
@@ -131,9 +131,20 @@ const noop = () => {};

const kPerfHooksNetConnectContext = Symbol('kPerfHooksNetConnectContext');

- const dc = require('diagnostics_channel');
- const netClientSocketChannel = dc.channel('net.client.socket');
- const netServerSocketChannel = dc.channel('net.server.socket');
let netClientSocketChannel;
let netServerSocketChannel;
function lazyChannels() {
// TODO(joyeecheung): support diagnostics channels in the snapshot.
// For now it is fine to create them lazily when there isn't a snapshot to
// build. If users need the channels they would have to create them first
// before invoking any built-ins that would publish to these channels
// anyway.
if (netClientSocketChannel === undefined) {
const dc = require('diagnostics_channel');
netClientSocketChannel = dc.channel('net.client.socket');
netServerSocketChannel = dc.channel('net.server.socket');
}
}

const {
hasObserver,
@@ -206,6 +217,7 @@ function connect(...args) {
const options = normalized[0];
debug('createConnection', normalized);
const socket = new Socket(options);
lazyChannels();
if (netClientSocketChannel.hasSubscribers) {
netClientSocketChannel.publish({
socket,
@@ -1739,6 +1751,7 @@ function onconnection(err, clientHandle) {
socket.server = self;
socket._server = self;
self.emit('connection', socket);
lazyChannels();
if (netServerSocketChannel.hasSubscribers) {
netServerSocketChannel.publish({
socket,
36 changes: 21 additions & 15 deletions src/env.cc
@@ -248,17 +248,6 @@ std::ostream& operator<<(std::ostream& output,
return output;
}

- std::ostream& operator<<(std::ostream& output,
- const std::vector<PropInfo>& vec) {
- output << "{\n";
- for (const auto& info : vec) {
- output << " { \"" << info.name << "\", " << std::to_string(info.id) << ", "
- << std::to_string(info.index) << " },\n";
- }
- output << "}";
- return output;
- }

std::ostream& operator<<(std::ostream& output,
const IsolateDataSerializeInfo& i) {
output << "{\n"
@@ -298,7 +287,7 @@ IsolateDataSerializeInfo IsolateData::Serialize(SnapshotCreator* creator) {
for (size_t i = 0; i < AsyncWrap::PROVIDERS_LENGTH; i++)
info.primitive_values.push_back(creator->AddData(async_wrap_provider(i)));

- size_t id = 0;
uint32_t id = 0;
#define V(PropertyName, TypeName) \
do { \
Local<TypeName> field = PropertyName(); \
@@ -352,7 +341,7 @@ void IsolateData::DeserializeProperties(const IsolateDataSerializeInfo* info) {

const std::vector<PropInfo>& values = info->template_values;
i = 0; // index to the array
- size_t id = 0;
uint32_t id = 0;
#define V(PropertyName, TypeName) \
do { \
if (values.size() > i && id == values[i].id) { \
@@ -1485,6 +1474,7 @@ std::ostream& operator<<(std::ostream& output,
AsyncHooks::SerializeInfo AsyncHooks::Serialize(Local<Context> context,
SnapshotCreator* creator) {
SerializeInfo info;
// TODO(joyeecheung): some of these probably don't need to be serialized.
info.async_ids_stack = async_ids_stack_.Serialize(context, creator);
info.fields = fields_.Serialize(context, creator);
info.async_id_fields = async_id_fields_.Serialize(context, creator);
@@ -1679,7 +1669,7 @@ EnvSerializeInfo Environment::Serialize(SnapshotCreator* creator) {
info.should_abort_on_uncaught_toggle =
should_abort_on_uncaught_toggle_.Serialize(ctx, creator);

- size_t id = 0;
uint32_t id = 0;
#define V(PropertyName, TypeName) \
do { \
Local<TypeName> field = PropertyName(); \
@@ -1696,6 +1686,22 @@ EnvSerializeInfo Environment::Serialize(SnapshotCreator* creator) {
return info;
}

std::ostream& operator<<(std::ostream& output,
const std::vector<PropInfo>& vec) {
output << "{\n";
for (const auto& info : vec) {
output << " " << info << ",\n";
}
output << "}";
return output;
}

std::ostream& operator<<(std::ostream& output, const PropInfo& info) {
output << "{ \"" << info.name << "\", " << std::to_string(info.id) << ", "
<< std::to_string(info.index) << " }";
return output;
}

std::ostream& operator<<(std::ostream& output,
const std::vector<std::string>& vec) {
output << "{\n";
@@ -1777,7 +1783,7 @@ void Environment::DeserializeProperties(const EnvSerializeInfo* info) {

const std::vector<PropInfo>& values = info->persistent_values;
size_t i = 0; // index to the array
- size_t id = 0;
uint32_t id = 0;
#define V(PropertyName, TypeName) \
do { \
if (values.size() > i && id == values[i].id) { \
11 changes: 8 additions & 3 deletions src/env.h
@@ -580,7 +580,7 @@ typedef size_t SnapshotIndex;

struct PropInfo {
std::string name; // name for debugging
- size_t id; // In the list - in case there are any empty entries
uint32_t id; // In the list - in case there are any empty entries
SnapshotIndex index; // In the snapshot
};

@@ -987,8 +987,9 @@ struct EnvSerializeInfo {
struct SnapshotData {
enum class DataOwnership { kOwned, kNotOwned };

- static const size_t kNodeBaseContextIndex = 0;
- static const size_t kNodeMainContextIndex = kNodeBaseContextIndex + 1;
static const uint32_t kMagic = 0x143da19;
static const SnapshotIndex kNodeBaseContextIndex = 0;
static const SnapshotIndex kNodeMainContextIndex = kNodeBaseContextIndex + 1;

DataOwnership data_ownership = DataOwnership::kOwned;

@@ -1000,12 +1001,16 @@ struct SnapshotData {
// TODO(joyeecheung): there should be a vector of env_info once we snapshot
// the worker environments.
EnvSerializeInfo env_info;

// A vector of built-in ids and v8::ScriptCompiler::CachedData, this can be
// shared across Node.js instances because they are supposed to share the
// read only space. We use native_module::CodeCacheInfo because
// v8::ScriptCompiler::CachedData is not copyable.
std::vector<native_module::CodeCacheInfo> code_cache;

void ToBlob(FILE* out) const;
static void FromBlob(SnapshotData* out, FILE* in);

~SnapshotData();

SnapshotData(const SnapshotData&) = delete;